From XML to PDF via LaTeX

LaTeX

LaTeX version of eLML website (click for download!)

Before reading this chapter, have a look at the LaTeX (PDF) version of the eLML website. For more information about LaTeX, see the official LaTeX website.

LaTeX is a document markup language for typesetting. It provides a set of macros built on TeX, which Donald Knuth began developing in 1977. The idea of TeX (and LaTeX) is to let authors focus on the content without having to deal with the visual presentation of their writing. Originally conceived as an environment in which scientists could typeset complicated formulas on systems without layout capabilities, it grew into a platform-independent, well-scaling tool for publishing journals and books.

LaTeX handles lists of figures and tables dynamically, supports footnotes and bibliographic citations, and generates tables of contents and indexes automatically. Many packages (macro collections) are available to extend the basic functionality of LaTeX, all of them searchable in an internet archive called CTAN.

LaTeX documents can be written in any plain text editor. The .tex document then gets processed by the LaTeX/TeX system into an intermediate output file format called DVI ("DeVice Independent" file format). From there the author can produce files in PostScript or PDF format. TeX/LaTeX systems exist for many platforms, including MS Windows, Linux and Mac OS.

BibTeX

BibTeX is used to organize the bibliography of the lesson in LaTeX and to produce the correct links between each citation and its bibliography entry. This is achieved by extracting every citation in the document and associating it with the corresponding entry in the bibliography database (i.e. the .bib file). The .bib file is generated automatically during the XML->LaTeX transformation, based on the entries under the bibliography element.
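The matching of citations to bibliography entries can be illustrated with a short Python sketch. This is only a conceptual illustration, not how BibTeX itself is implemented (BibTeX reads the citations from the .aux file rather than from the .tex source). The function names, the sample keys and the sample text are assumptions made for this example, and only the plain \cite command is recognised.

```python
import re

# Matches \cite{key} and \cite[note]{key1,key2}; other citation commands
# (e.g. natbib's \citep) are deliberately ignored in this sketch.
CITE_RE = re.compile(r"\\cite(?:\[[^\]]*\])?\{([^}]*)\}")
# Matches the start of a .bib entry such as "@book{knuth1984,".
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")


def cited_keys(tex_source):
    """Return the citation keys in order of first appearance."""
    keys = []
    for group in CITE_RE.findall(tex_source):
        for key in group.split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


def bib_entries(bib_source):
    """Map each entry key in the .bib text to its entry type."""
    return {key: kind.lower() for kind, key in ENTRY_RE.findall(bib_source)}


def unresolved(tex_source, bib_source):
    """Return cited keys that have no matching .bib entry."""
    entries = bib_entries(bib_source)
    return [key for key in cited_keys(tex_source) if key not in entries]


# Sample data for illustration only (keys are made up for this sketch).
TEX = r"TeX is described in \cite{knuth1984}; see also \cite[p. 3]{lamport1994, missing2000}."
BIB = """
@book{knuth1984,
  author = {Donald E. Knuth},
  title = {The TeXbook},
  year = {1984}
}
@book{lamport1994,
  author = {Leslie Lamport},
  title = {LaTeX: A Document Preparation System},
  year = {1994}
}
"""

print(cited_keys(TEX))
print(unresolved(TEX, BIB))
```

With the sample data, the first call lists the three cited keys and the second reports missing2000 as the only citation without a bibliography entry.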

Installation of the TeX System

Windows

  • Download proTeXt from here and follow the installation wizard's instructions.

Mac OS X

  • Download MacTeX from here as an installation image.
  • The package manager will install the TeX utility programs under /Applications/TeX and the Unix binaries under /usr/texbin (which will be added to the PATH environment variable).

If you are working with Eclipse we recommend using a plugin like Texlipse.
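To verify that the installation put the TeX binaries on the PATH, a small check like the following Python sketch may help. The function name and the list of tool names are assumptions for this example; adjust them to the programs you actually need.

```python
import shutil


def find_tex_tools(tools=("latex", "pdflatex", "bibtex")):
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, path in find_tex_tools().items():
    print(f"{tool}: {path or 'not found on PATH'}")
```

If a tool is reported as not found, check that the directory containing the TeX binaries (e.g. /usr/texbin on Mac OS X) is part of the PATH environment variable.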

Transformation from XML to LaTeX

  1. Open the "Introduction to Database Systems" lesson XML file in oXygen or XMLSpy.
  2. Have a look at the "latex" section of your project's configuration file. At the moment you can only define the $documentclass parameter (choose 'article' or 'book'), but more parameters might follow.
  3. Choose ../../../../core/presentation/latex/elml.xsl as the input XSLT file (in oXygen you also have to define an output file: enter e.g. output.txt, but it doesn't really matter, since the exact paths for storing files are defined in the XSLT 2.0 files anyway).

Two output files will be created: ./latex/<lesson label>.tex, the LaTeX file and ./latex/<lesson label>.bib, the bibliography file.
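A quick way to confirm that the transformation produced both files is sketched below in Python. The function names and the lesson_dir parameter are assumptions for this example; the paths follow the ./latex/<lesson label> pattern described above.

```python
from pathlib import Path


def expected_outputs(lesson_label, lesson_dir="."):
    """Return the .tex and .bib paths the transformation should create."""
    latex_dir = Path(lesson_dir) / "latex"
    return [latex_dir / f"{lesson_label}.tex", latex_dir / f"{lesson_label}.bib"]


def missing_outputs(lesson_label, lesson_dir="."):
    """Return the expected output files that do not exist (yet)."""
    return [p for p in expected_outputs(lesson_label, lesson_dir) if not p.is_file()]


# "mylesson" is a placeholder label; replace it with your lesson's label.
for path in missing_outputs("mylesson"):
    print(f"missing: {path}")
```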

Typesetting the tex file

  • Open the generated .tex document in proTeXt/MacTeX, select the correct input format (e.g. LaTeX) and typeset it with the processor for your preferred output format (e.g. pdflatex for PDF). To obtain a correct table of contents and correct lists of figures and tables, it is recommended to typeset the same document several (e.g. 2 to 3) times.

Building the bibliography index

  • After the first LaTeX run, an auxiliary file <lesson label>.aux is generated. Processing the document with BibTeX afterwards builds a bibliography index based on the citation marks found in the .aux file and the bibliography entries found in the .bib file. The index will appear at the end of the typeset document.
  • After running BibTeX, it is necessary to typeset the .tex file once again in LaTeX to generate a PDF with a correct bibliography index.
The workflow in eLML from XML via XSLT to LaTeX
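The typesetting and BibTeX steps above can also be run from the command line. The following Python sketch assembles the usual sequence (LaTeX, BibTeX, then LaTeX twice more) under the assumption that pdflatex and bibtex are on the PATH and that the generated files live in a latex folder. The function names and the "mylesson" label are hypothetical.

```python
import subprocess


def build_commands(lesson_label, latex_cmd="pdflatex"):
    """Return the command sequence: LaTeX, BibTeX, then LaTeX twice more."""
    latex = [latex_cmd, "-interaction=nonstopmode", f"{lesson_label}.tex"]
    bibtex = ["bibtex", lesson_label]  # BibTeX reads <lesson label>.aux
    return [list(latex), bibtex, list(latex), list(latex)]


def build_pdf(lesson_label, workdir="latex"):
    """Run each step in workdir; raises CalledProcessError on failure."""
    for command in build_commands(lesson_label):
        subprocess.run(command, cwd=workdir, check=True)


# Show the commands without running them; call build_pdf("mylesson")
# to actually typeset the lesson.
for command in build_commands("mylesson"):
    print(" ".join(command))
```

The two LaTeX runs after BibTeX are there so that the bibliography, the table of contents and the cross-references are all resolved in the final PDF.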

Known limitations

  • Images: LaTeX supports PDF, JPG and PNG as inline image formats. GIF isn't supported; these images must first be converted to a supported file format, e.g. by bulk converting them using GraphicConverter.
  • Tables: In contrast to tables in HTML output, tables in LaTeX are much more difficult to transform, due to the fundamental differences in how the two technologies handle tables. It is often impossible to guess the right cell widths for LaTeX tables in the actual context. Furthermore, tables with an empty first row or vertical colspans won't transform correctly.
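To find out which images still need converting before typesetting, a sketch like the following can list all GIF files below a folder. It only locates the files and does not convert them; the function name and the sample file names are assumptions for this example.

```python
import tempfile
from pathlib import Path


def images_to_convert(image_dir):
    """Return sorted paths of all GIF files below image_dir."""
    return sorted(
        p for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() == ".gif"
    )


# Demonstration on a temporary folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("diagram.png", "logo.GIF", "photo.jpg", "chart.gif"):
        (Path(tmp) / name).touch()
    print([p.name for p in images_to_convert(tmp)])
```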

