[Om] Mathematical Vernacular in formulae
Michael Kohlhase
m.kohlhase at jacobs-university.de
Tue Jan 25 08:46:02 CET 2011
Dear all,
Here is an issue that has been bothering me for a while. I am asking you
for your advice.
As you know, we use OpenMath for content markup of formulae in natural
language in OMDoc. For instance, we write something like
<p>This is the way we write a sum:
<OMOBJ>
<OMA>
<OMS cd="arith1" name="plus"/>
<OMV name="x"/>
<OMI>1</OMI>
</OMA>
</OMOBJ>,
simple, isn't it?
</p>
This is then transformed to parallel MathML markup using a user-adaptable
presentation process. We can even generate the markup above from a
special version of LaTeX... But you probably know all this already.
NOW comes the problem. In mathematical texts we often find constructions
that have natural language _inside_ mathematical formulae. For instance,
I found (using TeX for simplicity; what a pity we cannot use MathML in
e-mails yet)
$\{\langle a,b\rangle\bigl|\text{$a\in T$ and $P$ terminates for $a$ with $b$}\}$
There are many other examples, and you have probably seen plenty of
them. But how do we mark this up in OpenMath (and should we at all)?
For the moment, we came up with the following markup:
<om:OMOBJ>
<om:OMA>
<om:OMS cd="sets-introduction" name="setst"/>
<om:OMA>
<om:OMS cd="sets-operations" name="tup"/>
<om:OMV name="a"/>
<om:OMV name="b"/>
</om:OMA>
<om:OMSTR>
<om:OMOBJ>
<om:OMA>
<om:OMS cd="sets-introduction" name="inset"/>
<om:OMV name="a"/>
<om:OMS cd="terms" name="terms"/>
</om:OMA>
</om:OMOBJ>
and
<om:OMOBJ><om:OMV name="P"/></om:OMOBJ>
terminates for
<om:OMOBJ><om:OMV name="a"/></om:OMOBJ>
with
<om:OMOBJ><om:OMV name="b"/></om:OMOBJ>
</om:OMSTR>
</om:OMA>
</om:OMOBJ>
It (ab-)uses an OMSTR element for the text element and embeds OpenMath
Objects into it. This is clearly not right, since OMSTR was not meant
for such usages. In MathML we would be slightly better off: we could
just escape to presentation MathML (and you may want to say that this is
the correct thing to do). This would be
<math>
<apply>
<csymbol cd="sets-introduction">setst</csymbol>
<apply>
<csymbol cd="sets-operations">tup</csymbol>
<ci>a</ci>
<ci>b</ci>
</apply>
<mtext>
<math>
<apply>
<csymbol cd="sets-introduction">inset</csymbol>
<ci>a</ci>
<csymbol cd="terms">terms</csymbol>
</apply>
</math>
and
<math><ci>P</ci></math>
terminates for
<math><ci>a</ci></math>
with
<math><ci>b</ci></math>
</mtext>
</apply>
</math>
which is slightly more palatable semantically, since it does not treat
the natural language as a string.
The correct (if very tedious) thing to do is probably something like
<math>
<apply>
<csymbol cd="sets-introduction">setst</csymbol>
<apply>
<csymbol cd="sets-operations">tup</csymbol>
<ci>a</ci>
<ci>b</ci>
</apply>
<semantics>
<mtext>
<math><share src="f1"/></math>
and
<math><share src="f2"/></math>
terminates for
<math><share src="f3"/></math>
with
<math><share src="f4"/></math>
</mtext>
<annotation-xml encoding="Content-MathML">
<apply>
<csymbol cd="logic1">and</csymbol>
<apply id="f1">
<csymbol cd="sets-introduction">inset</csymbol>
<ci>a</ci>
<csymbol cd="terms">terms</csymbol>
</apply>
<apply>
<ci>terminates-with-on</ci>
<ci id="f2">P</ci>
<ci id="f3">a</ci>
<ci id="f4">b</ci>
</apply>
</apply>
</annotation-xml>
</semantics>
</apply>
</math>
For this we are using a variant of parallel markup with the new <share>
element in MathML3 (which corresponds to the <OMR> element in OpenMath),
since we want to keep to content markup as far as possible.
Tell me what you think: how should we deal with such situations? Maybe
OpenMath should have a (restricted) way to escape to mathematical
vernacular inside formulae, via an OMNL element that encapsulates
natural language (i.e. one that could be used where OMSTR was abused in
the first example)?
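To make the question concrete, here is a rough sketch of how the first
example might look with such an element. Note that OMNL is purely
hypothetical and not part of any OpenMath standard; the element name, its
mixed content model (text interleaved with OpenMath elements), and the
choice to embed the elements directly rather than as nested OMOBJs are
all assumptions made for illustration.

```xml
<om:OMOBJ xmlns:om="http://www.openmath.org/OpenMath">
  <om:OMA>
    <om:OMS cd="sets-introduction" name="setst"/>
    <om:OMA>
      <om:OMS cd="sets-operations" name="tup"/>
      <om:OMV name="a"/>
      <om:OMV name="b"/>
    </om:OMA>
    <!-- OMNL is a hypothetical element: it takes the place of the
         abused OMSTR and holds text mixed with OpenMath elements -->
    <om:OMNL>
      <om:OMA>
        <om:OMS cd="sets-introduction" name="inset"/>
        <om:OMV name="a"/>
        <om:OMS cd="terms" name="terms"/>
      </om:OMA>
      and <om:OMV name="P"/> terminates for <om:OMV name="a"/>
      with <om:OMV name="b"/>
    </om:OMNL>
  </om:OMA>
</om:OMOBJ>
```

Compared with the OMSTR version, the natural language is no longer typed
as a string, but the formal meaning of the condition is still not
captured; that would still need something like the parallel markup above.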
best,
Michael