Hermes - a semantic XML+MathML+Unicode e-publishing/self-archiving tool for LaTeX authored scientific articles

Download latest version: 0.9.12, released on 28 Nov. 2006
Last updated by Romeo Anghelache
Users, developers, and philosophers are invited to post questions and comments on the Hermes blog or to send them directly to the author.

Examples

Some results of Hermes-assisted conversions are hosted here; the source distribution also contains an article in LaTeX source form, as well as a content-oriented source sample.

What is Hermes?

Hermes is a grammar-based translator from (AMS)LaTeX to Unicode (UTF-8) encoded XML+MathML+metadata. It is free software (software libre).
Hermes does not yet support translating pure (AMS)TeX documents; this facility will be added depending on user interest.

What for?

Hermes helps individuals with self-archiving, libraries with long-term archiving, and publishers that need a reference document for their various specific services.

How does it work?

Hermes follows the steps below, in the specified order:
  1. semantically seeds a copy of your TeX source
  2. lets the TeX program do its job (texing) on this semantically enriched source
  3. parses the resulting semantic dvi
  4. generates the XML reference document, a semantic XML reflection of your TeX source.
It works on Linux, Windows and OS X.

What is the Hermes reference document?

It is a Unicode XML document with a generic structure, containing free text and various XML vocabularies.
It contains the semantics Hermes managed to recover from the LaTeX source.
Its validating XML Schema will be published once this generic structure becomes less fluid.
Currently, the generic structure consists of:
  1. sections,
  2. presentation hints (currently font names and sizes),
  3. free text (TeX glyphs, including accented ones, mapped to their Unicode equivalents),
  4. metadata (title, author, date etc.),
  5. bibliography,
  6. internal and external references (no special LaTeX packages are needed to get these activated in the XML),
  7. tables and images.
These items are in a one-to-one relationship with the corresponding structures in the source/semantic dvi. This list is extensible: LaTeX environments automatically produce an XML structure.
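To make the free-text item above more concrete, here is a minimal Python sketch of how an accented TeX glyph can be mapped to a single Unicode character. It is only an illustration: the `TEX_ACCENTS` table and the `tex_accent_to_unicode` function are hypothetical names, and this is not the mapping code Hermes itself uses.

```python
import unicodedata

# Hypothetical table: TeX accent command -> Unicode combining mark.
# Illustration only; this is not the table shipped with Hermes.
TEX_ACCENTS = {
    "'": "\u0301",  # \'  acute
    "`": "\u0300",  # \`  grave
    "^": "\u0302",  # \^  circumflex
    '"': "\u0308",  # \"  diaeresis
    "~": "\u0303",  # \~  tilde
}


def tex_accent_to_unicode(accent: str, base: str) -> str:
    """Return the Unicode text for a TeX accent applied to a base letter."""
    combining = TEX_ACCENTS[accent]
    # NFC normalization merges the base letter and the combining mark into
    # one precomposed code point whenever Unicode defines one.
    return unicodedata.normalize("NFC", base + combining)


print(tex_accent_to_unicode("'", "e"))  # TeX input \'e
print(tex_accent_to_unicode('"', "o"))  # TeX input \"o
```

Encoding the result as UTF-8 then gives the bytes that would appear in a UTF-8 encoded XML file.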
The XML vocabularies reflect the vocabularies used in the LaTeX source, e.g. mathematical regions in the LaTeX source correspond to MathML regions in the reference document.
MathML is currently the only validatable XML vocabulary that Hermes implements and supports (SVG and other vocabularies or open standards, such as MARC, may follow if users are interested).
When Hermes translates legacy LaTeX files without manual intervention on the source, it generates only MathML-presentation (legacy LaTeX files here means sources that were not edited with semantic vocabularies in mind).
MathML-content can only be generated if a newly authored LaTeX source uses the semantic LaTeX macros available in the Hermes distribution.
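The difference between the two MathML flavours can be illustrated with a small Python sketch that builds both encodings of x squared using standard MathML elements. The function names are hypothetical, and the markup is generic MathML rather than captured Hermes output, which may be structured differently.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def _math_root() -> ET.Element:
    # Setting xmlns as a plain attribute keeps the serialized output prefix-free.
    return ET.Element("math", {"xmlns": MATHML_NS})


def presentation_power(base: str, exponent: int) -> str:
    """Presentation MathML: describes layout (a base with a superscript)."""
    root = _math_root()
    msup = ET.SubElement(root, "msup")
    ET.SubElement(msup, "mi").text = base
    ET.SubElement(msup, "mn").text = str(exponent)
    return ET.tostring(root, encoding="unicode")


def content_power(base: str, exponent: int) -> str:
    """Content MathML: describes meaning (the power operator and its arguments)."""
    root = _math_root()
    apply_ = ET.SubElement(root, "apply")
    ET.SubElement(apply_, "power")
    ET.SubElement(apply_, "ci").text = base
    ET.SubElement(apply_, "cn").text = str(exponent)
    return ET.tostring(root, encoding="unicode")


print(presentation_power("x", 2))
print(content_power("x", 2))
```

The presentation form only records that a 2 is typeset as a superscript of x, while the content form states that x is raised to the power 2; this is why the latter needs semantic information that a legacy source does not carry.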

Installation requirements

A standard LaTeX system, gcc, bison, flex, make and libxml/xslt should be on your system in order to compile the program and produce the proper example output. Windows developers can check out the Cygwin distribution; Windows users will have a binary distribution (hermes.exe and seed.exe) issued (almost) synchronously with the source distribution.
Developers and Unix users can unpack the source distro and run make.
After a successful 'make' you get:

General use

Follow the steps below:
'Validate' your source:
  1. write an (AMS)LaTeX text containing mathematical expressions; LaTeX it and fix all your editing errors ;).
  2. run latex document.tex; if you didn't get a dvi, return to step 1.
Use Hermes to get the reference document (library) and renderable (publish) XML files:
  1. run ./seed document.tex; if you didn't get document.s.tex, go to found-a-bug.
  2. run latex document.s.tex; if you didn't get document.s.dvi, go to found-a-bug.
  3. run ./hermes document.s.dvi > document.lib.xml; if you didn't get document.lib.xml, go to found-a-bug.
  4. run xsltproc pub.xslt document.lib.xml > document.pub.xml; if you didn't get document.pub.xml, go to found-a-bug.
  5. now you can archive document.lib.xml or send it to your library, and post document.pub.xml on your website, along with the MathML stylesheets, for others to read and reuse.
found-a-bug:
either let the author know, fix it or ask around.
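
The Hermes steps above can be chained in a small driver script. The following Python sketch is not part of the Hermes distribution; `hermes_pipeline` and `run_pipeline` are hypothetical helper names. It assumes the seed and hermes executables, pub.xslt and the source file all sit in the current directory, as in the steps above.

```python
import subprocess
from pathlib import Path


def hermes_pipeline(source: str, stylesheet: str = "pub.xslt"):
    """Return the four conversion steps as (argv, stdout_file, product) tuples."""
    stem = Path(source).stem  # "document" for document.tex
    seeded = f"{stem}.s.tex"
    dvi = f"{stem}.s.dvi"
    lib = f"{stem}.lib.xml"
    pub = f"{stem}.pub.xml"
    return [
        (["./seed", source], None, seeded),      # step 1 writes the seeded source
        (["latex", seeded], None, dvi),          # step 2 writes the semantic dvi
        (["./hermes", dvi], lib, lib),           # step 3 prints XML on stdout
        (["xsltproc", stylesheet, lib], pub, pub),  # step 4 prints XML on stdout
    ]


def run_pipeline(source: str) -> None:
    """Run each step in order and stop at the first one whose product is missing."""
    for argv, stdout_file, product in hermes_pipeline(source):
        if stdout_file is None:
            subprocess.run(argv, check=True)
        else:
            # Equivalent to the shell redirection "> file" used in the steps.
            with open(stdout_file, "wb") as out:
                subprocess.run(argv, check=True, stdout=out)
        produced = Path(product)
        if not produced.is_file() or produced.stat().st_size == 0:
            raise RuntimeError(f"found-a-bug: {' '.join(argv)} did not produce {product}")
```

Calling `run_pipeline("document.tex")` would then leave document.lib.xml and document.pub.xml next to the source, or raise an error naming the step that failed.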

Architecture of Hermes

Developer's tips

To do

Credits

Hermes is covered by the GNU GPL and developed by Romeo Anghelache. It was created in the EU-funded MoWGLI research project (ended in Feb. 2005), as a task for LivingReviews, at the Max Planck Institute for Gravitational Physics, Golm, Germany.
Its further development was partially supported by :

Alternative tools, developed by other fellows
