[Om3] Language Dictionaries (Was: Re: Initializing OM3 at 2013 Process)
Lars Hellström
Lars.Hellstrom at residenset.net
Wed Apr 30 09:23:37 CEST 2014
Michael Kohlhase skrev 2014-04-23 15.24:
> Dear all,
>
> In thinking about OpenMath3, we have problem that there are some
> language extension proposals, but at the same time, we have the problem
> that we have to keep MathML3 compatibility.
No, we don't (as far as OM3 is concerned); that Unicode 1.1 became ISO/IEC
10646-1:1993 didn't prevent the creation of Unicode 2.0 (which introduced
notable features such as surrogate pairs). It may still be desirable to stay
compatible with MathML3, but it cannot be an a priori limit on the evolution
of OpenMath.
> This has been bothering me,
> but I think I have the beginnings of a solution. I would like to submit
> the blue note attached to the discussion of our committee.
I have to admit that at first sight, this made me suspect ulterior motives.
No activity for five months, and then the president suddenly drops a
polished-looking note "for discussion" -- is the committee being railroaded?
But no, once past the first page it becomes clear that this suggestion is
indeed barely hatched, and definitely at a stage that invites to _genuine_
discussion. I find the polished look a bit curious (why bother so much about
the appearance of something so preliminary?), as I would probably have gone
for a plain (but longish) email instead, but different communities have
different traditions. Maybe it needs to be this polished to get a reaction
within the OM community?
> Please give me feedback,
Right, time for some quoting! (Sorry if the indentation is wrong somewhere;
the newlines didn't make it through when I copied from the PDF.)
> <OMLangExt xmlns="http://www.openmath.org/OpenMathCD">
Surely you don't propose to bundle this with the CD namespace? The
differences are considerable, so it should have its own.
> <OLEName>seq</OLEName> <OLEDate>2014-04-22</OLEDate>
> <OLEStatus>experimental</OLEStatus>
> <OLEVersion>1</OLEVersion>
> <OLERevision>1</OLERevision>
Not even the few elements that are common have the same names...
> <schemaext>
> OMNATS = element OMNATS {omel}
> OMNTH = element OMNTH {omel,omel}
> omel |= OMNTH | OMNATS
> </schemaext>
Mixing XML and non-XML syntax for something that actually has an XML
syntax?? It certainly enhances readability, but no, that's just wrong.
> <equality>
> <OMBIND>
> <OMS cd="quant1" name="forall"/>
> <OMBVAR><OMV name="n"/><OMV name="m"/></OMBVAR>
> <OMA><OMS cd="logic1" name="implies"/>
> <OMA><OMS cd="relation1" name="gt"/>
> <OMV name="m"/>
> <OMV name="n"/>
> </OMA>
> <OMA><OMS cd="relation1" name="eq"/>
> <OMNTH>
> <OMI>n</OMI>
I presume you meant <OMV name="n"/>?
> <OMNATS><OMI>m</OMI></OMNATS>
> </OMNTH>
> <OMI>n</OMI>
> </OMA>
> </OMA>
> </OMBIND>
> </equality>
So the point of this part is effectively to state FMPs (what about CMPs?)
for the new elements, since there is no separate CD in which to state those.
Many FMPs are indeed equalities, but not all, so it feels odd to name that
section <equality>. The one class of properties of these symbols that
definitely are equalities -- that of a new element and its rule based
translation -- isn't stated here, but in the subsequent <translation>
section! (An additional quirk is that this latter type of equality probably
isn't the colloquial relation1#eq, but an absolute "there should be
absolutely no way within the formal system to tell the difference between
these two" invariance under substitution equality.)
I understand EdNote2 as saying that this particular FMP is superfluous, as
it would anyway be implied by a similar FMP in the argseq CD. I tend to
agree, but note that could well be properties of new notation that isn't as
natural to state for ordinary symbols.
> <translation cd="argseq">
Now this is the really heavy part of the proposal, but also that part which
is most vaguely explained. Is this supposed to be XSLT, or something
homegrown inspired by the same?
> <rule>
> <OMNTH>
> <expr name="n"/>
> <exprlist name="seq"> <expr name="elt"/> </exprlist>
> </OMNTH>
> <OMA>
> <OMS cd="seqs" name="nth"/>
Does that cd="argseq" attribute on the translation element even mean
anything? It appears that the symbols that the rules are generating rather
reside in the seqs CD.
> <render name="n"/>
> <iterate name="seq">
> <render name="elt"/>
> </iterate>
> </OMA>
> </rule>
> <rule>
> <OMNATS>
> <expr name="n"/>
> </OMNATS>
> <OMA>
> <OMS cd="seqs" name="nats"/>
> <render name="n"/>
> </OMA>
> </rule>
> </translation>
> </OMLangExt>
A problem with this example is that it seems to be aimed at showing off the
file format (which at this point must be considered very preliminary) rather
than the underlying mechanism; the translations it performs are very trivial
-- changing <OMNATS> to <OMA><OMS cd="seqs" name="nats"/> and so on could
equally well be done by substring replacements! -- and the language
extension achieved is highly uncompelling. Having special elements for
straightforward combinations of OMA and OMS gains practically nothing, but
the cost in increased language complexity is considerable.
What would be a more interesting example is a set of translations that cover
the whole sequences proposal. Inventing translations for OMNTH and OMNATS is
one thing, but OMSEQ and OMSV are much tougher customers (at least if one
wants to achieve the intended semantics; it may be that the above already
gets it wrong with OMNATS). It is not at all obvious that you would succeed.
And is there a proposed standard extension other than the sequences one for
which this OMLangExt mechanism would even apply? I don't recall any example
of any other introducing new XML elements, so how does one deal with other
language extensions?
This last point _could_ be taken as an argument in favour of the above with
respect to another matter, namely the organisation of data within the
OMLangExt. As it is, the schemaext, equality, and translation sections are
fairly close to being three separate files bundled into one; they are
syntactically somewhat disparate. Content dictionaries instead organise data
per symbol, which often has the advantage of keeping related pieces of
information close together. Rewriting the above example in this way (not
fixing any of the details remarked about) would produce something like
<OMLangExt xmlns="http://www.openmath.org/OpenMathCD">
<OLEName>seq</OLEName> <OLEDate>2014?04?22</OLEDate>
<OLEStatus>experimental</OLEStatus>
<OLEVersion>1</OLEVersion>
<OLERevision>1</OLERevision>
<OLEDefinition>
<Name>OMNTH</Name>
<schemaext>
omel |= element OMNTH {omel,omel}
</schemaext>
<rule>
<OMNTH>
<expr name="n"/>
<exprlist name="seq"> <expr name="elt"/> </exprlist>
</OMNTH>
<OMA>
<OMS cd="seqs" name="nth"/>
<render name="n"/>
<iterate name="seq">
<render name="elt"/>
</iterate>
</OMA>
</rule>
</OLEDefinition>
<OLEDefinition>
<Name>OMNATS</Name>
<schemaext>
omel |= element OMNATS {omel}
</schemaext>
<rule>
<OMNATS>
<expr name="n"/>
</OMNATS>
<OMA>
<OMS cd="seqs" name="nats"/>
<render name="n"/>
</OMA>
</rule>
<FMP>
<OMBIND>
<OMS cd="quant1" name="forall"/>
<OMBVAR><OMV name="n"/><OMV name="m"/></OMBVAR>
<OMA><OMS cd="logic1" name="implies"/>
<OMA><OMS cd="relation1" name="gt"/>
<OMV name="m"/>
<OMV name="n"/>
</OMA>
<OMA><OMS cd="relation1" name="eq"/>
<OMNTH>
<OMI>n</OMI>
<OMNATS><OMI>m</OMI></OMNATS>
</OMNTH>
<OMI>n</OMI>
</OMA>
</OMA>
</OMBIND>
</FMP>
</OLEDefinition>
</OMLangExt>
It is not clear that one is better than the other, but both are possible.
That's all I have on this matter right now.
Lars Hellström
More information about the Om3
mailing list