[Om] WebSci paper on (Mathematical) Semantics of Governmental Statistics Data

Christoph LANGE ch.lange at jacobs-university.de
Tue Mar 30 00:50:05 CEST 2010


Dear OpenMath community,

  We have submitted the final version of the WebSci paper on governmental
statistics data that I had mentioned to some of you before:
http://kwarc.info/clange/pubs/websci10-SemGovStatData.pdf.  Let me briefly
introduce the topic.  The paper also implies a bit of an agenda for
integrating OpenMath more closely with the semantic web; see below.

In a nutshell:  More and more governments publish their statistical datasets
online, and they publish them as linked data (a best practice for publishing
RDF).  The advantage over, e.g., certain XML formats (which also exist) is
that RDF linked data are easier to integrate with other, heterogeneous
datasets, which enables queries across all of them.  The semantics of the
data, as currently published, is quite shallow, and we tried to improve on
that.  One aspect is grounding the real-world things covered by the
statistics in vocabularies that explain what they are really about.  The
other aspect (section 4 of the paper) is the mathematical semantics of
derived data points.  We capture it by pointing from derived data to the
OpenMath symbols that represent the functions used for computing the derived
values, e.g. arith1#divide or s_data1#mean, or user-/application-defined
ones.  By translating these "RDF pointers to functions" into real OpenMath
objects, applications dealing with statistics datasets annotated in this way
can then connect to OpenMath-aware systems in order to verify existing
derived values or to compute new ones.
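
To make the last step a bit more concrete, here is a minimal Python sketch
of what such a translation could look like.  It is not the implementation
from the paper.  It assumes that the pointer is a symbol URI of the form
<cdbase>/<cd>#<name>, and it uses a small local lookup table as a stand-in
for an OpenMath-aware system.  The function names, the table, and the
numbers in the usage part are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from statistics import mean
from urllib.parse import urlsplit

OM_NS = "http://www.openmath.org/OpenMath"

# Hypothetical stand-in for an OpenMath-aware system: a local table that
# maps (cd, name) pairs to Python functions.
LOCAL_FUNCTIONS = {
    ("arith1", "divide"): lambda a, b: a / b,
    ("s_data1", "mean"): lambda *xs: mean(xs),
}


def split_symbol_uri(uri):
    """Split a symbol URI '<cdbase>/<cd>#<name>' into (cdbase, cd, name)."""
    parts = urlsplit(uri)
    if not parts.fragment:
        raise ValueError("symbol URI has no '#name' part: " + uri)
    base_path, _, cd = parts.path.rpartition("/")
    cdbase = f"{parts.scheme}://{parts.netloc}{base_path}"
    return cdbase, cd, parts.fragment


def to_openmath(symbol_uri, args):
    """Build an OpenMath application object (XML encoding) as a string."""
    cdbase, cd, name = split_symbol_uri(symbol_uri)
    obj = ET.Element("OMOBJ", {"xmlns": OM_NS})
    oma = ET.SubElement(obj, "OMA")
    ET.SubElement(oma, "OMS", {"cdbase": cdbase, "cd": cd, "name": name})
    for arg in args:
        if isinstance(arg, int):
            ET.SubElement(oma, "OMI").text = str(arg)
        else:
            ET.SubElement(oma, "OMF", {"dec": repr(float(arg))})
    return ET.tostring(obj, encoding="unicode")


def compute(symbol_uri, args):
    """Evaluate the function the symbol URI points to, using the local table."""
    _, cd, name = split_symbol_uri(symbol_uri)
    return LOCAL_FUNCTIONS[(cd, name)](*args)


def verify(symbol_uri, args, published, tol=1e-9):
    """Check a published derived value against a recomputed one."""
    return abs(compute(symbol_uri, args) - published) <= tol


if __name__ == "__main__":
    # Made-up example: a derived value claimed to be 150 divided by 60.
    uri = "http://www.openmath.org/cd/arith1#divide"
    print(to_openmath(uri, [150, 60]))
    print(verify(uri, [150, 60], 2.5))
```

A real application would send the generated object to an OpenMath-aware
system instead of evaluating it against a local table.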

In a way, such integration of OpenMath with the semantic web is similar to
what has been done with MONET before (though not quite the same), but now
that there are linked data and the SPARQL query language (which didn't exist
back then), integration requires much less infrastructure and has much more
potential w.r.t. heterogeneity.  Just look at http://linkeddata.org/ to get an
impression.  I would like the OpenMath CDs to become one node in that graph,
hopefully soon being linked from many other datasets.  The conference in one
month will show how many people are interested in the connection of RDF
datasets to OpenMath, but in any case these government data are a big and
realistic use case, and linked data has been one of the hottest topics in the
semantic web community for more than two years now.

Publishing the OpenMath CDs in a linked data compliant way is fortunately
quite easy.  I have started to write down the required steps in
https://trac.mathweb.org/OM3/ticket/116 and will soon bug Paul and others with
some requests.

Please let me know what you think.

Cheers,

Christoph

-- 
Christoph Lange, Jacobs Univ. Bremen, http://kwarc.info/clange, Skype duke4701