Chemistry Metadata
When you enable the Embed chemistry metadata export option, Molkit stamps a machine-readable description of the molecule into the SVG itself: scalar attributes on the root element, a full set of data-mk-* attributes on every atom group and bond group, and a structured mk:chemistry block inside the file’s metadata element. Your page script can read the formula, walk the bond graph, or hand the embedded MOL block to a cheminformatics library, all without a server round trip. This page is the schema reference.
Root attributes
Scalar attributes on the svg root give you the headline identity of the molecule without touching the DOM tree below it. They are deliberately duplicated outside the metadata block so they survive HTML sanitizers that strip metadata elements.
| Attribute | Type | When present | Description |
|---|---|---|---|
| data-mk-version | string | always | Schema version, currently 1.0 |
| data-mk-formula | string | single molecule on canvas | Molecular formula, e.g. C6H6 |
| data-mk-mw | number string | single molecule | Molecular weight in g/mol, rounded to three decimals |
| data-mk-smiles | string | single molecule, SMILES generation succeeded | SMILES string |
| data-mk-canonical-smiles | string | when RDKit canonicalization ran | Canonical SMILES |
| data-mk-inchi | string | PubChem enrichment succeeded | InChI identifier |
| data-mk-inchi-key | string | PubChem enrichment succeeded | 27-character InChIKey |
| data-mk-cid | number string | PubChem enrichment succeeded | PubChem compound ID |
The last three come from an online lookup at export time (see PubChem integration). If the export happened offline, or PubChem did not recognize the structure, those attributes are simply absent. Guard for null.
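As a minimal sketch of reading these attributes defensively, the helper below collects the root attributes into a plain object, converts the number strings, and leaves absent attributes as null. The name readIdentity and the stub element are illustrative assumptions, not part of Molkit; the function relies only on getAttribute, so a real svg root can be passed in the same way.

```javascript
// Hypothetical helper: gather the root data-mk-* attributes into one object.
// Absent attributes stay null so callers can tell "missing" from "empty".
function readIdentity(svgRoot) {
  const str = (name) => svgRoot.getAttribute(name); // null when absent
  const num = (name) => {
    const raw = svgRoot.getAttribute(name);
    return raw === null ? null : Number(raw);
  };
  return {
    version: str('data-mk-version'),
    formula: str('data-mk-formula'),
    molecularWeight: num('data-mk-mw'),
    smiles: str('data-mk-smiles'),
    canonicalSmiles: str('data-mk-canonical-smiles'),
    inchi: str('data-mk-inchi'),
    inchiKey: str('data-mk-inchi-key'),
    cid: num('data-mk-cid'),
  };
}

// Stand-in for the root of an offline export: the PubChem attributes are absent.
const rootAttrs = {
  'data-mk-version': '1.0',
  'data-mk-formula': 'C6H6',
  'data-mk-mw': '78.114',
  'data-mk-smiles': 'c1ccccc1',
};
const stubRoot = { getAttribute: (name) => rootAttrs[name] ?? null };
const identity = readIdentity(stubRoot);
console.log(identity.formula, identity.molecularWeight, identity.cid);
```

In a page script you would pass the inlined svg element instead of the stub and branch on identity.cid === null before building a PubChem link.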
Per-atom attributes
Every atom group (the same element that carries the atom’s data-id) gets its own attribute set. Query atoms with [data-mk-element].
| Attribute | Type | When present | Description |
|---|---|---|---|
| data-mk-element | string | always | Element symbol, e.g. N |
| data-mk-x | number | always | Canvas X coordinate |
| data-mk-y | number | always | Canvas Y coordinate (SVG space, Y grows downward) |
| data-mk-index | integer | always | Zero-based atom index, matches MOL block order |
| data-mk-formal-charge | integer | always | Formal charge, can be 0 or negative |
| data-mk-lone-pairs | integer | always | Lone pair count |
| data-mk-domains-bonding | integer | always | Bonding electron domains (VSEPR) |
| data-mk-domains-nonbonding | integer | always | Nonbonding electron domains (VSEPR) |
| data-mk-implicit-h | integer | always | Implicit hydrogen count; reads 0 when the omit option was checked at export |
| data-mk-octet | enum | always | satisfied, deficient, or expanded |
| data-mk-valence-electrons | integer | always | Valence electrons contributed by this atom |
| data-mk-warnings | JSON array | always | Validator warning types; [] when clean |
| data-mk-hybridization | string | when determinable | e.g. sp2 |
| data-mk-geometry | string | when geometry is known | VSEPR geometry name, e.g. trigonal planar |
| data-mk-vsepr-angles | JSON array | with data-mk-geometry | Ideal angles in degrees, e.g. [120] |
| data-mk-chiral | "true" | potential stereocenter detected | Four distinct substituents (heuristic, not full CIP) |
| data-mk-isotope | integer | isotope label set | Mass number |
| data-mk-aromatic | "true" | atom in an aromatic ring | Aromaticity flag |
| data-mk-ring-sizes | JSON array | atom in at least one ring | Sizes of rings containing this atom, e.g. [6] |
| data-mk-oxidation-state | integer | when computable | Oxidation state |
| data-mk-abbreviation | string | atom drawn as an abbreviation | Custom label, e.g. OEt |
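To illustrate how the always-present, conditional, flag, and JSON-array attributes combine, here is a rough sketch that builds a one-line tooltip for an atom group. The name atomTooltip, the tooltip wording, and the stub values (chosen to resemble a nitrate nitrogen) are assumptions made for this example only.

```javascript
// Hypothetical helper: build a one-line tooltip from an atom group's attributes.
function atomTooltip(atomEl) {
  const get = (name) => atomEl.getAttribute(name);
  const charge = parseInt(get('data-mk-formal-charge'), 10); // always present
  const parts = [get('data-mk-element')];
  if (charge !== 0) parts.push(`charge ${charge > 0 ? '+' : ''}${charge}`);

  // Conditional attributes: skip them when absent instead of printing "null".
  const hybridization = get('data-mk-hybridization');
  if (hybridization !== null) parts.push(hybridization);
  const geometry = get('data-mk-geometry');
  if (geometry !== null) {
    const angles = JSON.parse(get('data-mk-vsepr-angles') ?? '[]');
    parts.push(angles.length ? `${geometry} (${angles.join('/')}°)` : geometry);
  }

  // Flag attributes are either "true" or absent.
  if (get('data-mk-aromatic') === 'true') parts.push('aromatic');
  if (get('data-mk-chiral') === 'true') parts.push('possible stereocenter');

  const warnings = JSON.parse(get('data-mk-warnings') ?? '[]');
  if (warnings.length) parts.push(`warnings: ${warnings.join(', ')}`);
  return parts.join(', ');
}

// Stand-in for an atom group; the attribute values are illustrative.
const atomAttrs = {
  'data-mk-element': 'N',
  'data-mk-formal-charge': '1',
  'data-mk-hybridization': 'sp2',
  'data-mk-geometry': 'trigonal planar',
  'data-mk-vsepr-angles': '[120]',
  'data-mk-warnings': '[]',
};
const stubAtom = { getAttribute: (name) => atomAttrs[name] ?? null };
const tooltip = atomTooltip(stubAtom);
console.log(tooltip); // N, charge +1, sp2, trigonal planar (120°)
```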
Per-bond attributes
Bond groups carry the graph topology plus rendered geometry. Query bonds with [data-mk-order].
| Attribute | Type | When present | Description |
|---|---|---|---|
| data-mk-order | number | always | Bond order: 1, 2, 3, or 1.5 |
| data-mk-atom1 | string | always | data-id of the first atom |
| data-mk-atom2 | string | always | data-id of the second atom |
| data-mk-x1 / data-mk-y1 | number | when geometry extracted | Rendered start point of the bond line |
| data-mk-x2 / data-mk-y2 | number | when geometry extracted | Rendered end point of the bond line |
| data-mk-stereo | enum | non-plain bonds only | Bond style: wedge, dash, wavy, aromatic, dative, and other drawn styles |
| data-mk-polarity | number | electronegativity difference at or above 0.05 | Pauling EN difference, two decimals |
| data-mk-polar-toward | string | with data-mk-polarity | Element symbol the dipole points toward |
| data-mk-conjugated | "true" | bond in a conjugated system | Conjugation flag |
| data-mk-resonance-order | number | aromatic bonds | Effective order, 1.5 |
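Because most bond attributes are conditional, a reader has to decide what absence means for each one. The sketch below is one possible approach: describeBond (a hypothetical name) measures the rendered length only when all four geometry attributes are present, treats a missing data-mk-stereo as a plain bond, and leaves polarity as null when it was not written. The stub values describe an imaginary C=O bond and are not taken from a real export.

```javascript
// Hypothetical helper: summarize one bond group's attributes.
function describeBond(bondEl) {
  const get = (name) => bondEl.getAttribute(name);
  const coords = ['data-mk-x1', 'data-mk-y1', 'data-mk-x2', 'data-mk-y2'].map(get);

  // Geometry attributes are conditional, so require all four before measuring.
  let length = null;
  if (coords.every((c) => c !== null)) {
    const [x1, y1, x2, y2] = coords.map(Number);
    length = Math.hypot(x2 - x1, y2 - y1);
  }

  const polarity = get('data-mk-polarity');
  return {
    order: Number(get('data-mk-order')),
    style: get('data-mk-stereo') ?? 'plain', // attribute is absent on plain bonds
    length,
    polarity: polarity === null ? null : Number(polarity),
    polarToward: get('data-mk-polar-toward'), // null when polarity is absent
    conjugated: get('data-mk-conjugated') === 'true',
  };
}

// Stand-in for a bond group; the values are illustrative.
const bondAttrs = {
  'data-mk-order': '2',
  'data-mk-atom1': 'a1',
  'data-mk-atom2': 'a2',
  'data-mk-x1': '0', 'data-mk-y1': '0',
  'data-mk-x2': '30', 'data-mk-y2': '40',
  'data-mk-polarity': '0.89',
  'data-mk-polar-toward': 'O',
};
const stubBond = { getAttribute: (name) => bondAttrs[name] ?? null };
const bondInfo = describeBond(stubBond);
console.log(bondInfo.length, bondInfo.style, bondInfo.polarToward); // 50 plain O
```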
The metadata element
Deeper data lives in the mk:chemistry element inside the SVG’s metadata element (namespace https://molkit.app/ns/chemistry/1.0). For a single-molecule canvas it contains the formula (plain and HTML), molecular weight and exact mass, SMILES, atom and bond counts, total valence electrons, degree of unsaturation, an element-composition JSON object, the point group with a confidence rating, a functional-groups JSON array, the full MOL block, and a Schema.org JSON-LD object. PubChem enrichment appends the IUPAC name, compound ID, InChI, InChIKey, and common names when the lookup succeeds.
The JSON-LD payload is a MolecularEntity, taken here from a reference export of the nitrate ion (the SMILES string reflects whichever generator produced the export; RDKit-backed exports carry canonical SMILES):
```json
{
  "@context": "https://schema.org",
  "@type": "MolecularEntity",
  "name": "NO3",
  "molecularFormula": "NO3",
  "molecularWeight": "62.004",
  "smiles": "N=O(O)(O)",
  "hasRepresentation": {
    "@type": "ImageObject",
    "encodingFormat": "image/svg+xml",
    "creator": {
      "@type": "SoftwareApplication",
      "name": "Molkit",
      "url": "https://molkit.com"
    }
  }
}
```
Because the SVG ships its own structured-data block, search engines that index inlined SVG can associate the image with a chemical identity rather than treating it as anonymous vector art.
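If your own script consumes the JSON-LD payload, it is worth validating it before trusting the fields. The following is a minimal sketch under two assumptions: you have already obtained the JSON text (this page does not name the child element that holds it, so that step is left to the caller), and parseMolecularEntity is a hypothetical helper name.

```javascript
// Hypothetical helper: parse JSON-LD text and return the fields a page script
// typically needs, or null when the payload is not a MolecularEntity.
function parseMolecularEntity(jsonText) {
  let data;
  try {
    data = JSON.parse(jsonText);
  } catch {
    return null; // malformed or empty payload
  }
  if (data === null || typeof data !== 'object' || data['@type'] !== 'MolecularEntity') {
    return null;
  }
  return {
    name: data.name ?? null,
    formula: data.molecularFormula ?? null,
    // molecularWeight is serialized as a string in the reference export.
    molecularWeight: data.molecularWeight == null ? null : Number(data.molecularWeight),
    smiles: data.smiles ?? null,
    creator: data.hasRepresentation?.creator?.name ?? null,
  };
}

// Abbreviated copy of the nitrate payload shown above.
const sampleJsonLd = JSON.stringify({
  '@context': 'https://schema.org',
  '@type': 'MolecularEntity',
  name: 'NO3',
  molecularFormula: 'NO3',
  molecularWeight: '62.004',
  smiles: 'N=O(O)(O)',
  hasRepresentation: { '@type': 'ImageObject', creator: { name: 'Molkit' } },
});
const entity = parseMolecularEntity(sampleJsonLd);
console.log(entity.formula, entity.molecularWeight, entity.creator);
```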
Extracting the MOL block
The mk:mol-block element holds a complete V2000 molfile as text content, ready to feed to RDKit.js, OpenBabel, or any MOL-aware tool. One caveat: Y coordinates are flipped relative to the SVG (chemistry convention is Y-up, SVG is Y-down). Match metadata children by localName, which ignores the namespace prefix and works in every browser:
```javascript
function getMolBlock(svg) {
  const meta = svg.querySelector('metadata');
  if (!meta) return null;
  const chem = [...meta.children].find(el => el.localName === 'chemistry');
  const mol = chem && [...chem.children].find(el => el.localName === 'mol-block');
  return mol ? mol.textContent : null;
}
```
Reading the molecule graph
The atom and bond attributes together form a complete graph. Here is a self-contained parser that builds an adjacency structure you can use for highlighting, tooltips, or analysis. It works against an inlined export (see embedding for how to get the SVG into your page):
```javascript
function parseGraph(svg) {
  const atoms = new Map();

  for (const g of svg.querySelectorAll('[data-mk-element]')) {
    const id = g.getAttribute('data-id');
    atoms.set(id, {
      id,
      element: g.getAttribute('data-mk-element'),
      x: parseFloat(g.getAttribute('data-mk-x')),
      y: parseFloat(g.getAttribute('data-mk-y')),
      charge: parseInt(g.getAttribute('data-mk-formal-charge'), 10),
      lonePairs: parseInt(g.getAttribute('data-mk-lone-pairs'), 10),
      implicitH: parseInt(g.getAttribute('data-mk-implicit-h'), 10),
      hybridization: g.getAttribute('data-mk-hybridization'), // null when absent
      ringSizes: JSON.parse(g.getAttribute('data-mk-ring-sizes') ?? '[]'),
      neighbors: [],
      node: g, // keep the live element for styling
    });
  }

  const bonds = [];
  for (const g of svg.querySelectorAll('[data-mk-order]')) {
    const bond = {
      order: parseFloat(g.getAttribute('data-mk-order')),
      atom1: g.getAttribute('data-mk-atom1'),
      atom2: g.getAttribute('data-mk-atom2'),
      stereo: g.getAttribute('data-mk-stereo'), // null for plain bonds
      node: g,
    };
    bonds.push(bond);
    atoms.get(bond.atom1)?.neighbors.push(bond.atom2);
    atoms.get(bond.atom2)?.neighbors.push(bond.atom1);
  }

  return { atoms, bonds };
}
```
For example, parseGraph(svg).atoms.values() over the nitrate export yields one N with charge: 1 and three O atoms, and the three bonds connect them through the central nitrogen.
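One use of the adjacency structure is counting connected structures, which tells you whether an export came from a multi-molecule canvas. The sketch below runs a breadth-first search over an object shaped like the parseGraph result. The name countStructures is hypothetical, and the hand-built graph (a nitrate-like star plus one unbonded atom) stands in for a parsed export.

```javascript
// Hypothetical helper: count connected structures in a parseGraph()-style result.
function countStructures(graph) {
  const seen = new Set();
  let count = 0;
  for (const startId of graph.atoms.keys()) {
    if (seen.has(startId)) continue;
    count += 1;
    // Breadth-first search from an unvisited atom marks its whole structure.
    const queue = [startId];
    seen.add(startId);
    while (queue.length > 0) {
      const id = queue.shift();
      for (const neighborId of graph.atoms.get(id)?.neighbors ?? []) {
        if (!seen.has(neighborId)) {
          seen.add(neighborId);
          queue.push(neighborId);
        }
      }
    }
  }
  return count;
}

// Hand-built stand-in: n1 bonded to o1..o3, plus an unbonded atom x1.
const sampleAtoms = new Map([
  ['n1', { id: 'n1', neighbors: ['o1', 'o2', 'o3'] }],
  ['o1', { id: 'o1', neighbors: ['n1'] }],
  ['o2', { id: 'o2', neighbors: ['n1'] }],
  ['o3', { id: 'o3', neighbors: ['n1'] }],
  ['x1', { id: 'x1', neighbors: [] }],
]);
const structureCount = countStructures({ atoms: sampleAtoms, bonds: [] });
console.log(structureCount); // 2
```

Only the atoms map and each atom's neighbors array are used, so the function works on the parser's output unchanged.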
Gotchas
- Conditional attributes need null-guards. getAttribute returns null for absent attributes like data-mk-stereo, data-mk-hybridization, or data-mk-isotope. Decide on a default before parsing.
- Never use logical-or fallbacks on numeric reads. Zero is a valid value for data-mk-formal-charge, data-mk-lone-pairs, data-mk-implicit-h, and data-mk-oxidation-state, and || replaces it with the fallback. Apply ?? to the raw attribute value before parsing instead: parsing null yields NaN, which ?? does not catch.
- JSON-array attributes need JSON.parse. data-mk-vsepr-angles, data-mk-ring-sizes, and data-mk-warnings are JSON text, not plain strings.
- Molecule-level fields disappear on multi-molecule canvases. data-mk-formula, data-mk-mw, data-mk-smiles, and the single-molecule children of the mk:chemistry element are only written when the canvas held exactly one connected structure. Per-atom and per-bond attributes are still present.
- Atom IDs are opaque strings. Do not parse them. Match bonds to atoms by comparing data-mk-atom1 and data-mk-atom2 against each atom group’s data-id, as the parser above does.
- PubChem fields are best-effort. data-mk-inchi, data-mk-inchi-key, data-mk-cid, and the IUPAC name exist only when the export-time lookup succeeded online.
- The omit-implicit-H export option only blanks one attribute. With it checked, every data-mk-implicit-h reads 0. Nothing else changes: formula, molecular weight, atom count, and total valence electrons always include implicit hydrogens so they agree with the SMILES and PubChem identity fields.
- Re-export files made with older builds. Earlier exporters had implicit-hydrogen accounting bugs: line-structure carbons could carry data-mk-formal-charge="1", data-mk-octet="deficient", a phantom lone pair, and wrong hybridization, and molecule-level numbers could exclude implicit H while the formula included them. Current exports are internally consistent; treat old files as suspect and re-export.
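The null-guard, zero-value, and JSON-array gotchas can be handled in one place. The helpers below are a sketch of one way to do that; readInt and readJsonArray are hypothetical names, and the stub element stands in for an atom group. The last two lines reproduce the pitfalls the helpers are meant to avoid.

```javascript
// Hypothetical helper: read an integer attribute without losing a valid 0.
function readInt(el, name, fallback = null) {
  const raw = el.getAttribute(name);
  if (raw === null) return fallback; // attribute absent
  const value = parseInt(raw, 10);
  return Number.isNaN(value) ? fallback : value;
}

// Hypothetical helper: read a JSON-array attribute, defaulting to [].
function readJsonArray(el, name) {
  const raw = el.getAttribute(name);
  if (raw === null) return [];
  try {
    const value = JSON.parse(raw);
    return Array.isArray(value) ? value : [];
  } catch {
    return []; // not valid JSON
  }
}

// Stand-in for an atom group with no data-mk-oxidation-state attribute.
const gotchaAttrs = { 'data-mk-formal-charge': '0', 'data-mk-ring-sizes': '[5,6]' };
const stubEl = { getAttribute: (name) => gotchaAttrs[name] ?? null };

const safeCharge = readInt(stubEl, 'data-mk-formal-charge', 0); // "0" is kept as 0
const missingOx = readInt(stubEl, 'data-mk-oxidation-state');   // absent, so null
const ringSizes = readJsonArray(stubEl, 'data-mk-ring-sizes');

// The pitfalls the helpers avoid:
const orFallback = parseInt('0', 10) || 1;   // 0 is falsy, so || swaps in 1
const lateNullish = parseInt(null, 10) ?? 0; // NaN is not nullish, so ?? keeps NaN
console.log(safeCharge, missingOx, ringSizes, orFallback, lateNullish);
```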