[Om3] Language Dictionaries (Was: Re: Initializing OM3 at 2013 Process)

Michael Kohlhase m.kohlhase at jacobs-university.de
Fri May 2 19:09:03 CEST 2014


Dear Lars,


On 30.4.14 09:23, Lars Hellström wrote:
> Michael Kohlhase skrev 2014-04-23 15.24:
>> Dear all,
>>
>> In thinking about OpenMath3, we have problem that there are some
>> language extension proposals, but at the same time, we have the problem
>> that we have to keep MathML3 compatibility.
>
> No, we don't (as far as OM3 is concerned); that Unicode 1.1 became
> ISO/IEC 10646-1:1993 didn't prevent the creation of Unicode 2.0 (which
> introduced notable features such as surrogate pairs). It may still be
> desirable to stay compatible with MathML3, but it cannot be an a
> priori limit on the evolution of OpenMath.

--> separate thread established by David/you.
>
>> This has been bothering me,
>> but I think I have the beginnings of a solution. I would like to submit
>> the blue note attached to the discussion of our committee.
>
> I have to admit that at first sight, this made me suspect ulterior
> motives. No activity for five months, and then the president suddenly
> drops a polished-looking note "for discussion" -- is the committee
> being railroaded? But no, once past the first page it becomes clear
> that this suggestion is indeed barely hatched, and definitely at a
> stage that invites to _genuine_ discussion. I find the polished look a
> bit curious (why bother so much about the appearance of something so
> preliminary?), as I would probably have gone for a plain (but longish)
> email instead, but different communities have different traditions.
> Maybe it needs to be this polished to get a reaction within the OM
> community?
As I said, I wanted to write this down in sufficient detail so that I
could convince myself that something like this _can_ be done. I should
probably have clarified in my initial e-mail that this does not even
necessarily say _how_ it is best done. My main motivation was to see if
we can get around the conundrum of the linked standards.
>
>> Please give me feedback,
>
> Right, time for some quoting! (Sorry if the indentation is wrong
> somewhere; the newlines didn't make it through when I copied from the
> PDF.)
>
>> <OMLangExt xmlns="http://www.openmath.org/OpenMathCD">
>
> Surely you don't propose to bundle this with the CD namespace? The
> differences are considerable, so it should have its own.
now that you raise the issue, I proabaly agree.
>>  <OLEName>seq</OLEName> <OLEDate>2014-04-22</OLEDate>
>>  <OLEStatus>experimental</OLEStatus>
>>  <OLEVersion>1</OLEVersion>
>>  <OLERevision>1</OLERevision>
>
> Not even the few elements that are common have the same names...
>
>>  <schemaext>
>>    OMNATS = element OMNATS {omel}
>>    OMNTH = element OMNTH {omel,omel}
>>    omel |= OMNTH | OMNATS
>>  </schemaext>
>
> Mixing XML and non-XML syntax for something that actually has an XML
> syntax?? 
technically that is no problem, the content is just a text node
> It certainly enhances readability, but no, that's just wrong.
I am very open here, RNC and RNG are intercompilable.
>
>>  <equality>
>>    <OMBIND>
>>      <OMS cd="quant1" name="forall"/>
>>      <OMBVAR><OMV name="n"/><OMV name="m"/></OMBVAR>
>>      <OMA><OMS cd="logic1" name="implies"/>
>>        <OMA><OMS cd="relation1" name="gt"/>
>>          <OMV name="m"/>
>>          <OMV name="n"/>
>>        </OMA>
>>        <OMA><OMS cd="relation1" name="eq"/>
>>          <OMNTH>
>>            <OMI>n</OMI>
>
> I presume you meant <OMV name="n"/>?
yes, of course thanks.
>
>>            <OMNATS><OMI>m</OMI></OMNATS>
>>          </OMNTH>
>>          <OMI>n</OMI>
>>        </OMA>
>>      </OMA>
>>    </OMBIND>
>>  </equality>
>
> So the point of this part is effectively to state FMPs (what about
> CMPs?) for the new elements, since there is no separate CD in which to
> state those.
that is indeed the intention. Actually, there are similar FMPs and CMPs
in the accompanying CD that justifies the language extension. I am just
not sure whether that is enough.
> Many FMPs are indeed equalities, but not all, so it feels odd to name
> that section <equality>. The one class of properties of these symbols
> that definitely are equalities -- that of a new element and its rule
> based translation -- isn't stated here, but in the subsequent
> <translation> section! 
HMMM, that is correct. I had viewed the OM2 and OM+Seq as two different
languages that I would not mix, but maybe your view is better. I have to
think about that.
> (An additional quirk is that this latter type of equality probably
> isn't the colloquial relation1#eq, but an absolute "there should be
> absolutely no way within the formal system to tell the difference
> between these two" invariance under substitution equality.)
Indeed, nice idea. So you are proposing a meta-level equality symbol for
these (but not for the inter-language equalities); I have to think about
this some more.
>
> I understand EdNote2 as saying that this particular FMP is
> superfluous, as it would anyway be implied by a similar FMP in the
> argseq CD. I tend to agree, but note that could well be properties of
> new notation that isn't as natural to state for ordinary symbols.
yes, that was what I was wondering as well. I guess we need more examples.
>
>>  <translation cd="argseq">
>
> Now this is the really heavy part of the proposal, but also that part
> which is most vaguely explained. Is this supposed to be XSLT, or
> something homegrown inspired by the same?
this is the (homegrown) XML rewriting system we use to represent simple
tree transformations declaratively. You are right, it should have been
explained in the note. Let me try now.

The idea is that the <rule> elements contain pairs of expression
schemata (expressions with meta-variables for expressions and expression
lists). Meta-variables in the head (the first child of the rule) are
represented by <expr> and <exprlist> elements (in the latter, <expr>
ranges over the elements of the list). In the body of the rule (the
second child), meta-variables are represented by <render> and <iterate>
elements. Corresponding meta-variables have the same name. A rule is
applied to an expression, if it matches a subterm, which is replaced by
the body with the meta-variables recursively replaced.

In principle we could also use XSLT here, but that is much more
expressive and would open a great can of worms.
>
>>    <rule>
>>      <OMNTH>
>>        <expr name="n"/>
>>        <exprlist name="seq"> <expr name="elt"/> </exprlist>
>>      </OMNTH>
>>      <OMA>
>>        <OMS cd="seqs" name="nth"/>
>
> Does that cd="argseq" attribute on the translation element even mean
> anything? It appears that the symbols that the rules are generating
> rather reside in the seqs CD.
Right, I wanted to put the information somewhere which CDs are needed to
interpret the rule bodies.
>
>>        <render name="n"/>
>>        <iterate name="seq">
>>          <render name="elt"/>
>>        </iterate>
>>      </OMA>
>>    </rule>
>>    <rule>
>>      <OMNATS>
>>        <expr name="n"/>
>>      </OMNATS>
>>      <OMA>
>>        <OMS cd="seqs" name="nats"/>
>>        <render name="n"/>
>>      </OMA>
>>    </rule>
>>  </translation>
>> </OMLangExt>
>
> A problem with this example is that it seems to be aimed at showing
> off the file format (which at this point must be considered very
> preliminary) rather than the underlying mechanism; the translations it
> performs are very trivial -- changing <OMNATS> to <OMA><OMS cd="seqs"
> name="nats"/> and so on could equally well be done by substring
> replacements! -- and the language extension achieved is highly
> uncompelling. Having special elements for straightforward combinations
> of OMA and OMS gains practically nothing, but the cost in increased
> language complexity is considerable.
I agree with you here. But I wanted to see if the idea of
"language-extensions-justified-by-CDs" could fly at all before investing
more work on these things.
> What would be a more interesting example is a set of translations that
> cover the whole sequences proposal. Inventing translations for OMNTH
> and OMNATS is one thing, but OMSEQ and OMSV are much tougher customers
> (at least if one wants to achieve the intended semantics; it may be
> that the above already gets it wrong with OMNATS). It is not at all
> obvious that you would succeed. And is there a proposed standard
> extension other than the sequences one for which this OMLangExt
> mechanism would even apply? I don't recall any example of any other
> introducing new XML elements, so how does one deal with other language
> extensions?
Another language extension I would really like to see is record
structures. That proposal has been made a long time ago, but never
really pushed. I think we need them for such things as: "Let
$(R,+,*,0,1,-)$ be a ring." I would like to represent this as "Let
[base=R,addop=+,mulop=*,addneut=0,mulneut=1,addinv=-] be a ring". So
that we have accessor symbols base, addop, mulop, addneut, mulneut,
addinv which have natural counterparts in mathematical language. Note
that I am not necessarily advocating this 5-record over the record
[addstruct=[base=R,op=+,neut=0,inv=-],mulstruct=[base=R,op=*,neut=1]], I
think we need records and accessors in both cases. \

I know (in principle) how a OMLangExt for this could be written, but
have not developed one yet.
> This last point _could_ be taken as an argument in favour of the above
> with respect to another matter, namely the organisation of data within
> the OMLangExt. As it is, the schemaext, equality, and translation
> sections are fairly close to being three separate files bundled into
> one; they are syntactically somewhat disparate. Content dictionaries
> instead organise data per symbol, which often has the advantage of
> keeping related pieces of information close together. Rewriting the
> above example in this way (not fixing any of the details remarked
> about) would produce something like
>
> <OMLangExt xmlns="http://www.openmath.org/OpenMathCD">
>  <OLEName>seq</OLEName> <OLEDate>2014?04?22</OLEDate>
>  <OLEStatus>experimental</OLEStatus>
>  <OLEVersion>1</OLEVersion>
>  <OLERevision>1</OLERevision>
>
>  <OLEDefinition>
>    <Name>OMNTH</Name>
>    <schemaext>
>      omel |= element OMNTH {omel,omel}
>    </schemaext>
>    <rule>
>      <OMNTH>
>        <expr name="n"/>
>        <exprlist name="seq"> <expr name="elt"/> </exprlist>
>      </OMNTH>
>      <OMA>
>        <OMS cd="seqs" name="nth"/>
>        <render name="n"/>
>        <iterate name="seq">
>          <render name="elt"/>
>        </iterate>
>      </OMA>
>    </rule>
>  </OLEDefinition>
>
>  <OLEDefinition>
>    <Name>OMNATS</Name>
>    <schemaext>
>      omel |= element OMNATS {omel}
>    </schemaext>
>    <rule>
>      <OMNATS>
>        <expr name="n"/>
>      </OMNATS>
>      <OMA>
>        <OMS cd="seqs" name="nats"/>
>        <render name="n"/>
>      </OMA>
>    </rule>
>    <FMP>
>      <OMBIND>
>        <OMS cd="quant1" name="forall"/>
>        <OMBVAR><OMV name="n"/><OMV name="m"/></OMBVAR>
>        <OMA><OMS cd="logic1" name="implies"/>
>          <OMA><OMS cd="relation1" name="gt"/>
>            <OMV name="m"/>
>            <OMV name="n"/>
>          </OMA>
>          <OMA><OMS cd="relation1" name="eq"/>
>            <OMNTH>
>              <OMI>n</OMI>
>              <OMNATS><OMI>m</OMI></OMNATS>
>            </OMNTH>
>            <OMI>n</OMI>
>          </OMA>
>        </OMA>
>      </OMBIND>
>    </FMP>
>  </OLEDefinition>
> </OMLangExt>
>
> It is not clear that one is better than the other, but both are possible.
I have thought about this as well, and in principle I like the
per-element extension better in spirit. But I had trouble where to put
the equalities.

I am completely open to both ways of organizing things, and did not
really want to make a proposal to which one would be better in the long
run.
>
> That's all I have on this matter right now.

I guess the question I really want feedback on is whether we should go
for a OMLangExt-like scheme of extending OpenMath to OM3, which would
allow us to keep OM2  as a base (and thus be MathML-compatible) while
extending to (a set of) OM3 languages we can experiment with in the next
years. Then we can (in a couple of years) see which language extensions
can be standardized further.

Michael
>
>
> Lars Hellström
>
>
> _______________________________________________
> Om3 mailing list
> Om3 at openmath.org
> http://openmath.org/mailman/listinfo/om3

-- 
----------------------------------------------------------------------
 Prof. Dr. Michael Kohlhase,        Office: Research 1, Room 168
 Professor of Computer Science  Campus Ring 1, 
 Jacobs University Bremen           D-28759 Bremen, Germany
 tel/fax: +49 421 200-3140/-493140  skype: m.kohlhase   
 m.kohlhase at jacobs-university.de http://kwarc.info/kohlhase 
----------------------------------------------------------------------

-------------- next part --------------
A non-text attachment was scrubbed...
Name: m_kohlhase.vcf
Type: text/x-vcard
Size: 332 bytes
Desc: not available
URL: <http://openmath.org/pipermail/om3/attachments/20140502/7ac871e7/attachment.vcf>


More information about the Om3 mailing list