MathJSON
MathJSON: a lightweight data interchange format for mathematical notation.
Math | MathJSON |
---|---|
\displaystyle\frac{n}{1+n} | ["Divide", "n", ["Add", 1, "n"]] |
\sin^{-1}^\prime(x) | ["Apply", ["Derivative", ["InverseFunction", "Sin"]], "x"] |
MathJSON is built on the JSON format. Its focus is on interoperability between software programs to facilitate the exchange of mathematical data and the building of scientific software through the integration of software components communicating with a common format.
It is human-readable, while being easy for machines to generate and parse. It is simple enough that it can be generated, consumed and manipulated using any programming language.
MathJSON can be transformed from (parsing) and to (serialization) other formats.
The Cortex Compute Engine library provides an implementation in JavaScript/TypeScript of utilities that parse LaTeX to MathJSON, serialize MathJSON to LaTeX, and provide a collection of functions for symbolic manipulation and numeric evaluations of MathJSON expressions.
Mathematical notation is used in a broad array of fields, from elementary school arithmetic to engineering, applied mathematics, physics and more. New notations are invented regularly, and MathJSON endeavors to be flexible and extensible to account for those notations.
The Compute Engine includes a standard library of functions and symbols which can be extended with custom libraries.
MathJSON is not intended to be suitable as a visual representation of arbitrary mathematical notations, and as such is not a replacement for LaTeX or MathML.
Structure of a MathJSON Expression
A MathJSON expression is a combination of numbers, symbols, strings and functions.
Number
3.14
314e-2
{"num": "3.14159265358979323846264338327950288419716939937510"}
{"num": "-Infinity"}
Symbol
"x"
"Pi"
"🍎"
"半径"
{"sym": "Pi", "wikidata": "Q167"}
String
"'Diameter of a circle'"
{"str": "Srinivasa Ramanujan"}
Function
["Add", 1, "x"]
{"fn": [{"sym": "Add"}, {"num": "1"}, {"sym": "x"}]}
Numbers, symbols, strings and functions are expressed either:
- as object literals with a "num", "str", "sym" or "fn" key, respectively, or
- using a shorthand notation, as a JSON number, string or array.
The shorthand notation is more concise and easier to read, but it cannot include metadata properties.
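As a rough illustration, the two notations can be modelled with TypeScript types. The names used here (`Expression`, `Metadata`, `isShorthand` and the individual interfaces) are assumptions made for this sketch; they are not defined by MathJSON or by any particular library, and only a few metadata keys are listed.

```typescript
// Hypothetical types, for illustration only.
interface Metadata {
  wikidata?: string;
  comment?: string;
  latex?: string;
}
interface NumberObject extends Metadata { num: string }
interface SymbolObject extends Metadata { sym: string }
interface StringObject extends Metadata { str: string }
interface FunctionObject extends Metadata { fn: Expression[] }

// An expression is either a shorthand (JSON number, string or array)
// or one of the object literals above.
type Expression =
  | number
  | string
  | Expression[]
  | NumberObject
  | SymbolObject
  | StringObject
  | FunctionObject;

// Shorthands are plain JSON numbers, strings or arrays: they have
// no place to store metadata properties.
function isShorthand(expr: Expression): boolean {
  return typeof expr !== "object" || Array.isArray(expr);
}

const short: Expression = ["Add", 1, "x"];
const full: Expression = {
  fn: [{ sym: "Add" }, { num: "1" }, { sym: "x" }],
  comment: "1 + x",
};
```

Both `short` and `full` describe the same function expression; only `full` can carry the `comment`.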
Numbers
A MathJSON number is either:
- an object literal with a "num" key
- a JSON number
- a JSON string starting with +, - or the digits 0-9. Using a string is useful to represent numbers with a higher precision or greater range than JSON numbers.
Numbers as Object Literals
Numbers may be represented as an object literal with a "num" key. The value of the key is a string representation of the number.
{
"num": <string>
}
The string representing a number follows the JSON syntax for numbers, with the following differences:
- The range or precision of MathJSON numbers may be greater than the range and precision supported by IEEE 754 64-bit float.
{ "num": "1.1238976755823478721365872345683247563245876e-4567" }
- The string values "NaN", "+Infinity" and "-Infinity" are used to represent respectively an undefined result as per IEEE 754, \( +\infty \), and \( -\infty \).
{ "num": "+Infinity" }
- If the string includes the pattern /\([0-9]+\)/, that is a series of one or more digits enclosed in parentheses, that pattern is interpreted as repeating digits.
{ "num": "1.(3)" }
{ "num": "0.(142857)" }
{ "num": "0.(142857)e7" }
- The following characters in a string representing a number are ignored:
U+0009 | TAB |
U+000A | LINE FEED |
U+000B | VERTICAL TAB |
U+000C | FORM FEED |
U+000D | CARRIAGE RETURN |
U+0020 | SPACE |
U+00A0 | NO-BREAK SPACE |
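The rules above can be sketched as a small reader that turns a "num" string into an approximate JavaScript number. The function name `approximateNum` and its `repeat` parameter are hypothetical, and the result is limited to double precision, so this only illustrates the syntax; it does not preserve the extended range or precision that the string form allows.

```typescript
// Code points that are ignored inside a "num" string (see the list above).
const IGNORED_CHARACTERS = /[\u0009\u000A\u000B\u000C\u000D\u0020\u00A0]/g;

// Hypothetical helper: approximate a "num" string as a JavaScript number.
// A repeating-digits group such as "(3)" is expanded `repeat` times.
function approximateNum(num: string, repeat: number = 20): number {
  const compact = num.replace(IGNORED_CHARACTERS, "");
  if (compact === "NaN") return NaN;
  if (compact === "+Infinity") return Infinity;
  if (compact === "-Infinity") return -Infinity;
  // "0.(142857)e7" becomes "0." + "142857142857..." + "e7"
  const expanded = compact.replace(
    /\(([0-9]+)\)/,
    (_match: string, digits: string) => digits.repeat(repeat)
  );
  return Number(expanded);
}

const oneThird = approximateNum("1.(3)"); // close to 4/3
const scaled = approximateNum("0.(142857)e7"); // close to 10000000/7
const spaced = approximateNum(" 1 234.5 "); // the spaces are ignored
```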
Numbers as Number Literals
When a number is compatible with the JSON representation of numbers and has no metadata, a JSON number literal may be used.
Specifically:
- the number fits in a 64-bit binary floating point, as per IEEE 754-2008, with a 53-bit significand (about 15 digits of precision) and 11-bit exponent. If negative, its range is from \( -1.797693134862315 \cdot 10^{+308} \) to \( -2.225073858507201 \cdot 10^{-308} \), and if positive from \( 2.225073858507201 \cdot 10^{-308} \) to \( 1.797693134862315 \cdot 10^{+308} \)
- the number is finite: it is not +Infinity, -Infinity or NaN.
0
-234.534e-46
The numeric values below may not be represented as JSON number literals:
// Exponent out of bounds
{ "num": "5.78e400" }
// Too many digits
{ "num": "3.14159265358979323846264338327950288419716" }
// Non-finite numeric value
{ "num": "-Infinity" }
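The conditions above can be approximated with a conservative check. This is a sketch under stated assumptions: the name `fitsJsonNumberLiteral` is hypothetical, the input is assumed to be a well-formed "num" string without repeating digits, and the 15-digit limit is the "about 15 digits" figure quoted above. Trailing zeros are counted as digits, so some values that would fit are rejected.

```typescript
// Smallest positive normal double, matching the range quoted above.
const MIN_NORMAL = 2.2250738585072014e-308;

// Hypothetical helper: can this "num" string be written as a JSON number
// literal without losing information?
function fitsJsonNumberLiteral(num: string): boolean {
  const value = Number(num);
  // Rejects NaN, +Infinity, -Infinity and overflowing exponents (e.g. 5.78e400)
  if (!Number.isFinite(value)) return false;
  // Significant digits: drop the exponent, sign, decimal point and leading zeros
  const digits = num
    .replace(/[eE].*$/, "")
    .replace(/[^0-9]/g, "")
    .replace(/^0+/, "");
  if (digits.length > 15) return false; // too many digits for a double
  if (digits.length === 0) return true; // the value is zero
  return Math.abs(value) >= MIN_NORMAL; // reject exponents that underflow
}
```

With this sketch, "0" and "-234.534e-46" are accepted, while the three values listed above are rejected.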
Numbers as String Literals
An alternate representation of a number with no extra metadata is as a string following the format described above.
This allows for a shorthand representation of numbers with a higher precision or greater range than JSON numbers.
"3.14159265358979323846264338327950288419716"
"+Infinity"
Strings
A MathJSON string is either:
- an object literal with a "str" key
- a JSON string that starts and ends with U+0027 ' APOSTROPHE.
Strings may contain any character represented by a Unicode scalar value (a codepoint in the [0...0x10FFFF] range, except for [0xD800...0xDFFF]), but the following characters must be escaped as indicated:
Codepoint | Name | Escape Sequence |
---|---|---|
U+0000 to U+001F | | \u0000 to \u001f |
U+0008 | BACKSPACE | \b or \u0008 |
U+0009 | TAB | \t or \u0009 |
U+000A | LINE FEED | \n or \u000a |
U+000C | FORM FEED | \f or \u000c |
U+000D | CARRIAGE RETURN | \r or \u000d |
U+0027 | APOSTROPHE | \' or \u0027 |
U+005C | REVERSE SOLIDUS (backslash) | \\ or \u005c |
The encoding of the string follows the encoding of the JSON payload: UTF-8, UTF-16LE, UTF-16BE, etc.
"'Alan Turing'"
Functions
A MathJSON function expression is either:
- an object literal with a "fn" key
- a JSON array
Function expressions in the context of MathJSON may be used to represent mathematical functions but are more generally used to represent the application of a function to some arguments.
The function expression ["Add", 2, 3] applies the function named Add to the arguments 2 and 3.
Functions as Object Literal
The default representation of function expressions is an object literal with a "fn" key. The value of the fn key is an array representing the function operator (its name) and its arguments (its operands).
{
"fn": [Operator, ...Operands[]]
}
For example:
- \( 2+x \): { "fn": ["Add", 2, "x"] }
- \( \sin(2x+\pi) \): { "fn": ["Sin", ["Add", ["Multiply", 2, "x"], "Pi"]] }
- \( x^2-3x+5 \): { "fn": ["Add", ["Power", "x", 2], ["Multiply", -3, "x"], 5] }
Functions as JSON Arrays
If a function expression has no extra metadata it may be represented as a JSON array.
For example these two expressions are equivalent:
{ "fn": ["Cos", ["Add", "x", 1]] }
["Cos", ["Add", "x", 1]]
An array representing a function must have at least one element, the operator of the function. Therefore [] is not a valid expression.
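Because a function expression can take either form, code that consumes MathJSON usually needs accessors that work on both. The sketch below shows one way to do this; the `Expr` type and the `functionParts`, `operator` and `operands` helpers are hypothetical names chosen for this example.

```typescript
// Hypothetical, simplified expression type for this sketch.
type Expr =
  | number
  | string
  | Expr[]
  | { fn: Expr[] }
  | { num: string }
  | { sym: string }
  | { str: string };

// Return the [operator, ...operands] array of a function expression,
// or null if `expr` is not a valid function expression.
function functionParts(expr: Expr): Expr[] | null {
  let parts: Expr[] | null = null;
  if (Array.isArray(expr)) parts = expr;
  else if (typeof expr === "object" && "fn" in expr) parts = expr.fn;
  // An empty array has no operator, so it is not a valid function expression
  return parts !== null && parts.length > 0 ? parts : null;
}

function operator(expr: Expr): Expr | null {
  const parts = functionParts(expr);
  return parts === null ? null : parts[0] ?? null;
}

function operands(expr: Expr): Expr[] {
  const parts = functionParts(expr);
  return parts === null ? [] : parts.slice(1);
}

const asArray: Expr = ["Cos", ["Add", "x", 1]];
const asObject: Expr = { fn: ["Cos", ["Add", "x", 1]] };
```

Both `operator(asArray)` and `operator(asObject)` return "Cos", and `operator([])` returns null.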
Function Operator
The operator of the function expression is the first element in the array. Its presence is required. It indicates the name of the function: this is what the function is about.
The operator is an identifier following the conventions for function names (see below).
// Apply the function "Sin" to the argument "x"
["Sin", "x"]
// Apply "Cos" to a function expression
["Cos", ["Divide", "Pi", 2]]
Following the operator are zero or more arguments (or operands), which are expressions. To represent an argument which is a list, use a ["List"] expression; do not use a JSON array.
The expression corresponding to \( \sin^{-1}(x) \) is:
["Apply", ["InverseFunction", "Sin"], "x"]
The operator of this expression is "Apply" and its arguments are the expressions ["InverseFunction", "Sin"] and "x".
Shorthands
The following shorthands are allowed:
- A ["Dictionary"] expression may be represented as a string starting with U+007B { LEFT CURLY BRACKET and ending with U+007D } RIGHT CURLY BRACKET. The string must be a valid JSON object literal.
- A ["List"] expression may be represented as a string starting with U+005B [ LEFT SQUARE BRACKET and ending with U+005D ] RIGHT SQUARE BRACKET. The string must be a valid JSON array.
"{\"x\": 2, \"y\": 3}"
// -> ["Dictionary", ["Tuple", "x", 2], ["Tuple", "y", 3]]
"[1, 2, 3]"
// -> ["List", 1, 2, 3]
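One possible way to expand these two string shorthands is to let `JSON.parse` read the string and then wrap the result. The helper name `expandShorthand` is hypothetical, and this sketch only rewrites the top level: nested arrays or objects inside the string are left untouched, and `JSON.parse` throws if the string is not valid JSON.

```typescript
// Hypothetical helper: expand the "[...]" and "{...}" string shorthands
// into ["List", ...] and ["Dictionary", ...] expressions.
function expandShorthand(source: string): unknown {
  if (source.startsWith("[") && source.endsWith("]")) {
    const items: unknown = JSON.parse(source);
    if (Array.isArray(items)) return ["List", ...items];
  }
  if (source.startsWith("{") && source.endsWith("}")) {
    const entries: unknown = JSON.parse(source);
    if (typeof entries === "object" && entries !== null) {
      const record = entries as Record<string, unknown>;
      // Each key/value pair becomes a ["Tuple", key, value] expression
      return [
        "Dictionary",
        ...Object.keys(record).map((key) => ["Tuple", key, record[key]]),
      ];
    }
  }
  // Any other string is returned unchanged
  return source;
}

const list = expandShorthand("[1, 2, 3]");
const dictionary = expandShorthand('{"x": 2, "y": 3}');
```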
Symbols
A MathJSON symbol is either:
- an object literal with a "sym" key
- a JSON string
Symbols are identifiers that represent the name of variables, constants and wildcards.
Identifiers
Identifiers are JSON strings that represent the names of symbols, variables, constants, wildcards and functions.
Before they are used, JSON escape sequences (such as \u sequences, \\, etc.) are decoded.
The identifiers are then normalized to the Unicode Normalization Form C (NFC). They are stored internally and compared using the Unicode NFC.
For example, these four JSON strings represent the same identifier:
- "Å"
- "A\u030a" (U+0041 A LATIN CAPITAL LETTER A + U+030A COMBINING RING ABOVE)
- "\u00c5" (U+00C5 Å LATIN CAPITAL LETTER A WITH RING ABOVE)
- "\u0041\u030a" (U+0041 A LATIN CAPITAL LETTER A + U+030A COMBINING RING ABOVE)
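The two steps described above, decoding the JSON escape sequences and then normalizing to NFC, can be sketched with the standard `JSON.parse` and `String.prototype.normalize` functions. The helper name `readIdentifier` is an assumption made for this example.

```typescript
// Hypothetical helper: decode a JSON string, then normalize it to NFC.
function readIdentifier(jsonString: string): string {
  const decoded: unknown = JSON.parse(jsonString);
  if (typeof decoded !== "string") {
    throw new TypeError("expected a JSON string");
  }
  return decoded.normalize("NFC");
}

// Three JSON source texts that use escape sequences
// (the backslashes are doubled because these are TypeScript string literals)
const sources = ['"\\u00c5"', '"A\\u030a"', '"\\u0041\\u030a"'];
const identifiers = sources.map(readIdentifier);

// After normalization, each identifier is the single code point U+00C5
const allEqual = identifiers.every((id) => id === "\u00c5");
```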
Identifiers conform to a profile of UAX31-R1-1 with the following modifications:
- The character U+005F _ LOW LINE is added to the Start character set
- The characters should belong to a recommended script
- An identifier can be a sequence of one or more emojis. Characters that have both the Emoji and XIDC property are only considered emojis when they are followed by emoji modifiers. The definition below is based on Unicode TR51 but modified to exclude invalid identifiers.
Identifiers match either the NON_EMOJI_IDENTIFIER or the EMOJI_IDENTIFIER patterns below:
const NON_EMOJI_IDENTIFIER = /^[\p{XIDS}_]\p{XIDC}*$/u;
(from Unicode TR51)
or
const VS16 = "\\u{FE0F}"; // Variation Selector-16, forces emoji presentation
const KEYCAP = "\\u{20E3}"; // Combining Enclosing Keycap
const ZWJ = "\\u{200D}"; // Zero Width Joiner
const FLAG_SEQUENCE = "\\p{RI}\\p{RI}";
const TAG_MOD = `(?:[\\u{E0020}-\\u{E007E}]+\\u{E007F})`;
const EMOJI_MOD = `(?:\\p{EMod}|${VS16}${KEYCAP}?|${TAG_MOD})`;
const EMOJI_NOT_IDENTIFIER = `(?:(?=\\P{XIDC})\\p{Emoji})`;
const ZWJ_ELEMENT = `(?:${EMOJI_NOT_IDENTIFIER}${EMOJI_MOD}*|\\p{Emoji}${EMOJI_MOD}+|${FLAG_SEQUENCE})`;
const POSSIBLE_EMOJI = `(?:${ZWJ_ELEMENT})(${ZWJ}${ZWJ_ELEMENT})*`;
const EMOJI_IDENTIFIER = new RegExp(`^(?:${POSSIBLE_EMOJI})+$`, "u");
In summary, when using Latin characters, identifiers can start with a letter or an underscore, followed by zero or more letters, digits and underscores.
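As a usage sketch, the non-emoji pattern can be wrapped in a small validation function. The function name `isNonEmojiIdentifier` is hypothetical, emoji identifiers are deliberately left out, and the pattern is built with the `RegExp` constructor only so that the example does not depend on how a given toolchain handles Unicode property escapes in regular expression literals.

```typescript
// Same pattern as NON_EMOJI_IDENTIFIER above, built from a string.
const NON_EMOJI_PATTERN = new RegExp("^[\\p{XIDS}_]\\p{XIDC}*$", "u");

// Hypothetical helper: normalize to NFC, then test against the pattern.
// Emoji identifiers are not handled by this sketch.
function isNonEmojiIdentifier(name: string): boolean {
  return NON_EMOJI_PATTERN.test(name.normalize("NFC"));
}

const accepted = ["x", "_1", "newDeterminant", "Pi"].every(isNonEmojiIdentifier);
const rejected = ["2x", "a b", "x+y", ""].some(isNonEmojiIdentifier);
```

Here `accepted` is true and `rejected` is false: an identifier cannot start with a digit, and it cannot contain spaces or operators.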
Carefully consider when to use non-Latin characters. Use non-Latin characters for whole words, for example: "半径" (radius), "מְהִירוּת" (speed), "直径" (diameter) or "सतह" (surface).
Avoid mixing Unicode characters from different scripts in the same identifier.
Do not include bidi markers such as U+200E LTR or U+200F RTL in identifiers. LTR and RTL marks should be added as needed by the client displaying the identifier. They should be ignored when parsing identifiers.
Avoid visual ambiguity issues that might arise with some Unicode characters. For example:
- prefer using "gamma" rather than U+0263 ɣ LATIN SMALL LETTER GAMMA or U+03B3 γ GREEK SMALL LETTER GAMMA
- prefer using "Sum" rather than U+2211 ∑ N-ARY SUMMATION, which can be visually confused with U+03A3 Σ GREEK CAPITAL LETTER SIGMA.
The following naming conventions for wildcards, variables, constants and function names are recommendations.
Wildcards Naming Convention
Symbols that begin with U+005F _ LOW LINE (underscore) should be used to denote wildcards and other placeholders.
For example, they may be used to denote the positional parameter in a function expression. They may also denote placeholders and captured expressions in patterns.
Wildcard | Description |
---|---|
"_" | Wildcard for a single expression or for the first positional argument |
"_1" | Wildcard for a positional argument |
"__" | Wildcard for a sequence of 1 or more expressions |
"___" | Wildcard for a sequence of 0 or more expressions |
"_a" | Capturing an expression as a wildcard named a |
Variables Naming Convention
- If a variable is made of several words, use camelCase. For example "newDeterminant"
- Prefer clarity over brevity and avoid obscure abbreviations. Use "newDeterminant" rather than "newDet" or "nDet"
Constants Naming Convention
- If using Latin characters, the first character of a constant should be an uppercase letter A-Z
- If a constant name is made up of several words, use camelCase. For example "SpeedOfLight"
Function Names Naming Convention
- The names of the functions in the MathJSON Standard Library start with an uppercase letter A-Z. For example "Sin", "Fold".
- The name of your own functions can start with a lowercase or uppercase letter.
- If a function name is made up of several words, use camelCase. For example "InverseFunction"
LaTeX Rendering Conventions
The following recommendations may be followed by clients displaying MathJSON identifiers with LaTeX, or parsing LaTeX to MathJSON identifiers.
These recommendations do not affect computation or manipulation of expressions following these conventions.
- An identifier may be composed of a main body, some modifiers, some style variants, some subscripts and superscripts. For example:
  - "alpha_0__prime" -> \( \alpha_0^\prime \)
  - "x_vec" -> \( \vec{x} \)
  - "Re_fraktur" -> \( \mathfrak{Re} \)
- Subscripts are indicated by an underscore _ and superscripts by a double-underscore __. There may be more than one superscript or subscript, but they get concatenated. For example "a_b__c_q__p" -> \( a_{b, q}^{c, p} \).
- Modifiers after a superscript or subscript apply to the closest preceding superscript or subscript. For example "a_b_prime" -> \( a_{b^{\prime}} \).
Modifiers include:
Modifier | LaTeX | |
---|---|---|
_deg | \degree | \( x\degree \) |
_prime | {}^\prime | \( x^{\prime} \) |
_dprime | {}^\doubleprime | \( x^{\doubleprime} \) |
_ring | \mathring{} | \( \mathring{x} \) |
_hat | \hat{} | \( \hat{x} \) |
_tilde | \tilde{} | \( \tilde{x} \) |
_vec | \vec{} | \( \vec{x} \) |
_bar | \overline{} | \( \overline{x} \) |
_underbar | \underline{} | \( \underline{x} \) |
_dot | \dot{} | \( \dot{x} \) |
_ddot | \ddot{} | \( \ddot{x} \) |
_tdot | \dddot{} | \( \dddot{x} \) |
_qdot | \ddddot{} | \( \ddddot{x} \) |
_operator | \operatorname{} | \( \operatorname{x} \) |
_upright | \mathrm{} | \( \mathrm{x} \) |
_italic | \mathit{} | \( \mathit{x} \) |
_bold | \mathbf{} | \( \mathbf{x} \) |
_doublestruck | \mathbb{} | \( \mathbb{x} \) |
_fraktur | \mathfrak{} | \( \mathfrak{x} \) |
_script | \mathscr{} | \( \mathscr{x} \) |
- The following common names, when they appear as the body or in a subscript/superscript of an identifier, may be replaced with a corresponding LaTeX command:
Common Names | LaTeX | |
---|---|---|
alpha | \alpha | \( \alpha \) |
beta | \beta | \( \beta \) |
gamma | \gamma | \( \gamma \) |
delta | \delta | \( \delta \) |
epsilon | \epsilon | \( \epsilon \) |
epsilonSymbol | \varepsilon | \( \varepsilon \) |
zeta | \zeta | \( \zeta \) |
eta | \eta | \( \eta \) |
theta | \theta | \( \theta \) |
thetaSymbol | \vartheta | \( \vartheta \) |
iota | \iota | \( \iota \) |
kappa | \kappa | \( \kappa \) |
kappaSymbol | \varkappa | \( \varkappa \) |
mu | \mu | \( \mu \) |
nu | \nu | \( \nu \) |
xi | \xi | \( \xi \) |
omicron | \omicron | \( \omicron \) |
piSymbol | \varpi | \( \varpi \) |
rho | \rho | \( \rho \) |
rhoSymbol | \varrho | \( \varrho \) |
sigma | \sigma | \( \sigma \) |
finalSigma | \varsigma | \( \varsigma \) |
tau | \tau | \( \tau \) |
phi | \phi | \( \phi \) |
phiLetter | \varphi | \( \varphi \) |
upsilon | \upsilon | \( \upsilon \) |
chi | \chi | \( \chi \) |
psi | \psi | \( \psi \) |
omega | \omega | \( \omega \) |
Alpha | \Alpha | \( \Alpha \) |
Beta | \Beta | \( \Beta \) |
Gamma | \Gamma | \( \Gamma \) |
Delta | \Delta | \( \Delta \) |
Epsilon | \Epsilon | \( \Epsilon \) |
Zeta | \Zeta | \( \Zeta \) |
Eta | \Eta | \( \Eta \) |
Theta | \Theta | \( \Theta \) |
Iota | \Iota | \( \Iota \) |
Kappa | \Kappa | \( \Kappa \) |
Lambda | \Lambda | \( \Lambda \) |
Mu | \Mu | \( \Mu \) |
Nu | \Nu | \( \Nu \) |
Xi | \Xi | \( \Xi \) |
Omicron | \Omicron | \( \Omicron \) |
Pi | \Pi | \( \Pi \) |
Rho | \Rho | \( \Rho \) |
Sigma | \Sigma | \( \Sigma \) |
Tau | \Tau | \( \Tau \) |
Phi | \Phi | \( \Phi \) |
Upsilon | \Upsilon | \( \Upsilon \) |
Chi | \Chi | \( \Chi \) |
Psi | \Psi | \( \Psi \) |
Omega | \Omega | \( \Omega \) |
digamma | \digamma | \( \digamma \) |
aleph | \aleph | \( \aleph \) |
lambda | \lambda | \( \lambda \) |
bet | \beth | \( \beth \) |
gimel | \gimel | \( \gimel \) |
dalet | \daleth | \( \daleth \) |
ell | \ell | \( \ell \) |
turnedCapitalF | \Finv | \( \Finv \) |
turnedCapitalG | \Game | \( \Game \) |
weierstrass | \wp | \( \wp \) |
eth | \eth | \( \eth \) |
invertedOhm | \mho | \( \mho \) |
hBar | \hbar | \( \hbar \) |
hSlash | \hslash | \( \hslash \) |
blacksquare | \blacksquare | \( \blacksquare \) |
bottom | \bot | \( \bot \) |
bullet | \bullet | \( \bullet \) |
circle | \circ | \( \circ \) |
diamond | \diamond | \( \diamond \) |
times | \times | \( \times \) |
top | \top | \( \top \) |
square | \square | \( \square \) |
star | \star | \( \star \) |
- The following names, when used as a subscript or superscript, may be replaced with a corresponding LaTeX command:
Subscript/Superscript | LaTeX | |
---|---|---|
plus | {}_{+} / {}^{+} | \( x_{+} x^+\) |
minus | {}_{-} / {}^{-} | \( x_{-} x^-\) |
pm | {}_\pm / {}^\pm | \( x_{\pm} x^\pm \) |
ast | {}_\ast / {}^\ast | \( {x}_\ast x^\ast \) |
dag | {}_\dag / {}^\dag | \( {x}_\dag x^\dag \) |
ddag | {}_\ddag / {}^\ddag | \( {x}_\ddag x^\ddag \) |
hash | {}_\# / {}^\# | \( {x}_\# x^\# \) |
- Multi-letter identifiers may be rendered with a \mathit{}, \mathrm{} or \operatorname{} command.
- Identifier fragments ending in digits may be rendered with a corresponding subscript.
Identifier | LaTeX | |
---|---|---|
time | \mathrm{time} | \( \mathrm{time} \) |
speed_italic | \mathit{speed} | \( \mathit{speed} \) |
P_blackboard__plus | \mathbb{P}^{+} | \( \mathbb{P}^+ \) |
alpha | \alpha | \( \alpha \) |
mu0 | \mu_{0} | \( \mu_0 \) |
m56 | m_{56} | \( m_{56} \) |
c_max | \mathrm{c_{max}} | \( \mathrm{c_{max}} \) |
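The subscript, superscript and modifier conventions above can be sketched as a small renderer. This is an illustration under stated assumptions, not the Compute Engine's implementation: the names `identifierToLatex`, `MODIFIERS` and `COMMON_NAMES` are hypothetical, only a handful of modifiers and common names are included, and the rules for multi-letter identifiers and for fragments ending in digits are left out.

```typescript
// A few modifiers from the table above; each wraps an already rendered fragment.
const MODIFIERS = new Map<string, (s: string) => string>([
  ["prime", (s) => `${s}^{\\prime}`],
  ["hat", (s) => `\\hat{${s}}`],
  ["vec", (s) => `\\vec{${s}}`],
  ["bar", (s) => `\\overline{${s}}`],
  ["fraktur", (s) => `\\mathfrak{${s}}`],
]);

// A few common names that map to a LaTeX command.
const COMMON_NAMES = new Map<string, string>([
  ["alpha", "\\alpha"],
  ["beta", "\\beta"],
  ["gamma", "\\gamma"],
  ["mu", "\\mu"],
]);

function renderName(name: string): string {
  return COMMON_NAMES.get(name) ?? name;
}

// Apply a modifier to the most recently added subscript or superscript.
function applyToLast(list: string[], modifier: (s: string) => string): void {
  const last = list.pop();
  if (last !== undefined) list.push(modifier(last));
}

function identifierToLatex(id: string): string {
  // Each fragment is a name preceded by "", "_" (subscript) or "__" (superscript)
  const fragment = /(_{0,2})([^_]+)/g;
  let body = "";
  const subscripts: string[] = [];
  const superscripts: string[] = [];
  // Where the previous fragment went, so a modifier knows what to wrap
  let target: "body" | "sub" | "sup" = "body";
  let match: RegExpExecArray | null;
  while ((match = fragment.exec(id)) !== null) {
    const prefix = match[1] ?? "";
    const name = match[2] ?? "";
    const modifier = prefix === "_" ? MODIFIERS.get(name) : undefined;
    if (modifier !== undefined) {
      if (target === "body") body = modifier(body);
      else applyToLast(target === "sub" ? subscripts : superscripts, modifier);
    } else if (prefix === "") {
      body = renderName(name);
      target = "body";
    } else if (prefix === "_") {
      subscripts.push(renderName(name));
      target = "sub";
    } else {
      superscripts.push(renderName(name));
      target = "sup";
    }
  }
  let latex = body;
  if (subscripts.length > 0) latex += `_{${subscripts.join(", ")}}`;
  if (superscripts.length > 0) latex += `^{${superscripts.join(", ")}}`;
  return latex;
}
```

With these definitions, `identifierToLatex("a_b__c_q__p")` returns `a_{b, q}^{c, p}` and `identifierToLatex("a_b_prime")` returns `a_{b^{\prime}}`, matching the examples given above.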
Metadata
MathJSON object literals may be annotated with supplemental information.
A number represented as a JSON number literal, a symbol or string represented as a JSON string literal, or a function represented as a JSON array must be transformed into the equivalent object literal to be annotated.
The following metadata keys are recommended:
Key | Note |
---|---|
wikidata | A short string indicating an entry in a wikibase. This information can be used to disambiguate the meaning of an identifier. Unless otherwise specified, the entry in this key refers to an entry in the wikidata.org wikibase. |
comment | A human readable plain string to annotate an expression, since JSON does not allow comments in its encoding |
documentation | A Markdown-encoded string providing documentation about this expression. |
latex | A visual representation in LaTeX of the expression. This can be useful to preserve non-semantic details, for example parentheses in an expression or styling attributes |
sourceUrl | A URL to the source of this expression |
sourceContent | The source from which this expression was generated. It could be a LaTeX expression, or some other source language. |
sourceOffsets | A pair of character offsets in sourceContent or sourceUrl from which this expression was produced |
hash | A string representing a digest of this expression. |
{
"sym": "Pi",
"comment": "The ratio of the circumference of a circle to its diameter",
"wikidata": "Q167",
"latex": "\\pi"
}
{
"sym": "Pi",
"comment": "The Greek letter π",
"wikidata": "Q168"
}
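Since a shorthand must be converted to its object literal form before it can be annotated, a helper along the following lines may be useful. The names `toObjectLiteral` and `annotate` are hypothetical. The sketch classifies string shorthands using the rules given earlier in this document (apostrophe-delimited strings, strings starting with +, - or a digit, and symbols otherwise); it does not unescape string contents and does not handle the Dictionary and List string shorthands.

```typescript
type Shorthand = number | string | unknown[];
type ObjectLiteral = Record<string, unknown>;

// Hypothetical helper: convert a shorthand to the equivalent object literal.
function toObjectLiteral(expr: Shorthand | ObjectLiteral): ObjectLiteral {
  if (typeof expr === "number") return { num: String(expr) };
  if (typeof expr === "string") {
    // A string delimited by apostrophes is a MathJSON string
    if (expr.length >= 2 && expr.startsWith("'") && expr.endsWith("'")) {
      return { str: expr.slice(1, -1) };
    }
    // A string starting with +, - or a digit is a number
    if (/^[+\-0-9]/.test(expr)) return { num: expr };
    // Any other string is a symbol
    return { sym: expr };
  }
  if (Array.isArray(expr)) return { fn: expr };
  return expr; // already an object literal
}

// Hypothetical helper: attach metadata keys to an expression.
function annotate(
  expr: Shorthand | ObjectLiteral,
  metadata: Record<string, string>
): ObjectLiteral {
  return { ...toObjectLiteral(expr), ...metadata };
}

const annotatedPi = annotate("Pi", { wikidata: "Q167", latex: "\\pi" });
const annotatedSum = annotate(["Add", 1, "x"], { comment: "1 + x" });
```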
MathJSON Standard Library
This document defines the structure of MathJSON expressions. The MathJSON Standard Library defines a recommended vocabulary to use in MathJSON expressions.
Before considering inventing your own vocabulary, check if the MathJSON Standard Library already provides relevant definitions.
The MathJSON Standard Library includes definitions for:
Topic | Symbols and Functions |
---|---|
Arithmetic | Add Multiply Power Exp Log ExponentialE ImaginaryUnit ... |
Calculus | D Derivative Integrate ... |
Collections | List Reverse Filter ... |
Complex | Real Conjugate ComplexRoots ... |
Control Structures | If Block Loop ... |
Core | Declare Assign Error LatexString ... |
Functions | Function Apply Return ... |
Logic | And Or Not True False ForAll ... |
Sets | Union Intersection EmptySet RealNumbers Integers ... |
Special Functions | Gamma Factorial ... |
Statistics | StandardDeviation Mean Erf ... |
Styling | Delimiter Style ... |
Trigonometry | Pi Cos Sin Tan ... |
When defining a new function, avoid using a name already defined in the Standard Library.