plasTeX 3.0 — A Python Framework for Processing LaTeX Documents

5 Renderers

Renderers allow you to convert a plasTeX document object into viewable output such as HTML, RTF, or PDF, or simply a data structure format such as DocBook or tBook. Since the plasTeX document object gives you everything that you could possibly want to know about the LaTeX document, it should, in theory, be possible to generate any type of output from the plasTeX document object while preserving as much information as the output format is capable of. In addition, since the document object is not affected by the rendering process, you can apply multiple renderers in sequence so that the LaTeX document only needs to be parsed one time for all output types.

Although it is possible to write a completely custom renderer, a couple of renderer implementations are included with the plasTeX framework. The rendering process in these implementations is fairly simple, but it is also very powerful. Some of the main features are listed below.

  • ability to generate multiple output files

  • automatic splitting of files, configurable by section level or invoked ad hoc through the filenameoverride property

  • powerful output filename generation utility

  • image generation for portions of the document that cannot be easily rendered in a particular output format (e.g. TikZ pictures in HTML)

  • theming support

  • hooks for post-processing of output files

  • configurable output encodings

The API of the renderer itself is very small. In fact, there are only a couple of methods that are of real interest to an end user: render and cleanup. The render method is the method that starts the rendering process. Its only argument is a plasTeX document object. The cleanup method is called at the end of the rendering process. It is passed the document object and a list of all of the files that were generated. This method allows you to do post-processing on the output files. In general, this method will probably only be of interest to someone writing a subclass of the Renderer class, so most users of plasTeX will only use the render method. The real work of the rendering process is handled in the Renderable class which is discussed later in this chapter.
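The division of labor between render and cleanup can be sketched with a toy stand-in. This is not the plasTeX Renderer class: ToyRenderer, StampingRenderer, and the dictionary used as a "document" are hypothetical, and the real cleanup method may accept additional parameters. The sketch only shows the call order described above, with cleanup receiving the document and the list of generated files.

```python
import os
import tempfile


class ToyRenderer:
    """Hypothetical stand-in for a renderer; not the plasTeX class."""

    def __init__(self, outdir):
        self.outdir = outdir

    def render(self, document):
        # Assumption: 'document' is a dict mapping filename -> rendered text.
        files = []
        for name, text in document.items():
            path = os.path.join(self.outdir, name)
            with open(path, "w", encoding="utf-8") as fh:
                fh.write(text)
            files.append(path)
        # cleanup runs last and sees every file that was written.
        self.cleanup(document, files)
        return files

    def cleanup(self, document, files):
        # The base class does no post-processing.
        pass


class StampingRenderer(ToyRenderer):
    """Subclass that post-processes the output files in cleanup."""

    def cleanup(self, document, files):
        for path in files:
            with open(path, "a", encoding="utf-8") as fh:
                fh.write("\n<!-- generated -->\n")


def demo():
    with tempfile.TemporaryDirectory() as outdir:
        files = StampingRenderer(outdir).render({"index.html": "<p>Hi</p>"})
        with open(files[0], encoding="utf-8") as fh:
            return fh.read()
```

A caller only ever invokes render; the subclass hook in cleanup is where post-processing such as stamping or link fixing would go.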

The Renderer class is a subclass of the Python dictionary. Each key in the renderer corresponds to the name of a node in the document object. The value stored under each key is a function. As each node in the document object is traversed, the renderer is queried to see if there is a key that matches the name of the node. If a key is found, the value at that key (which must be a function) is called with the node as its only argument. The return value from this call must be a string object that contains the rendered output. Based on the configuration, the renderer will handle all of the file generation and encoding issues.
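The lookup described above can be illustrated with a small self-contained model. The Node and ToyRenderer classes below are hypothetical and far simpler than the real plasTeX classes; in particular, plain Python strings stand in for text, and file generation and encoding are left out. What the sketch does show is a dict subclass whose keys are node names and whose values are functions that take a node and return a string.

```python
class Node:
    """Hypothetical document node with a name and child nodes."""

    def __init__(self, name, *children):
        self.nodeName = name
        self.children = children


class ToyRenderer(dict):
    """Dictionary mapping node names to rendering functions."""

    def render(self, node):
        if isinstance(node, str):
            # Assumption: bare strings represent text and pass through.
            return node
        # Look up the function stored under the node's name and call it
        # with the node as its only argument.
        return self[node.nodeName](node)


renderer = ToyRenderer()


def render_children(node):
    # Helper for this sketch: render and concatenate all children.
    return "".join(renderer.render(child) for child in node.children)


renderer["document"] = lambda node: "<body>" + render_children(node) + "</body>"
renderer["emph"] = lambda node: "<em>" + render_children(node) + "</em>"

doc = Node("document", "Hello ", Node("emph", "world"))
output = renderer.render(doc)
```

Because the renderer is an ordinary dictionary, adding support for a new node type in this model is a single key assignment.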

If a node is traversed that doesn’t correspond to a key in the renderer dictionary, the default rendering method is called. The default rendering method is stored in the default attribute. One exception to this rule is for text nodes. The default rendering method for text nodes is actually stored in textDefault. Again, these attributes simply need to reference any Python function that returns a string object of the rendered output. The default method in both of these attributes is the str built-in function.
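The fallback rule can be sketched in the same toy style. The classes below are hypothetical, and the use of "#text" as the name of a text node is an assumption made for this example. Both fallbacks start out as the str built-in, as described above, and either one can be replaced with any function that returns a string.

```python
import html


class Node:
    """Hypothetical node whose string form is its raw content."""

    def __init__(self, name, content=""):
        self.nodeName = name
        self.content = content

    def __str__(self):
        return self.content


class ToyRenderer(dict):
    def __init__(self):
        super().__init__()
        # Both fallbacks are the str built-in unless replaced.
        self.default = str
        self.textDefault = str

    def render(self, node):
        func = self.get(node.nodeName)
        if func is None:
            # No key for this node: text nodes use textDefault,
            # every other node uses default.
            if node.nodeName == "#text":
                func = self.textDefault
            else:
                func = self.default
        return func(node)


renderer = ToyRenderer()
renderer["emph"] = lambda node: "<em>" + node.content + "</em>"
nodes = [Node("emph", "a"), Node("#text", " < "), Node("unknown", "b")]

# With the str fallbacks, unknown nodes and text render as their raw content.
plain = "".join(renderer.render(n) for n in nodes)

# Replace the fallbacks: escape text, wrap unknown nodes in a span.
renderer.textDefault = lambda node: html.escape(str(node))
renderer.default = lambda node: "<span>" + str(node) + "</span>"
custom = "".join(renderer.render(n) for n in nodes)
```

Keeping text in its own fallback lets an output format apply escaping to character data without touching how unknown elements are handled.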

As mentioned previously, most of the work of the renderer is actually done by the Renderable class. This is a mixin class [1] that is mixed into the Node class in the render method. It is unmixed at the end of the render method. The details of the Renderable class are discussed in section 5.2.

  1. A mixin class is simply a collection of methods that are intended to be included in the namespace of another class.
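The idea of mixing methods into a class for the duration of a call, and removing them afterwards, can be sketched as follows. This is a minimal illustration and not the plasTeX implementation: the mix and unmix helpers, the describe method, and the stand-in Node and Renderable classes are all hypothetical names chosen for the example.

```python
class Node:
    """Hypothetical node class with no rendering methods of its own."""

    def __init__(self, name):
        self.nodeName = name


class Renderable:
    """Mixin: a collection of methods meant to be added to another class."""

    def describe(self):
        return "<" + self.nodeName + ">"


def mix(target, mixin):
    # Copy the mixin's public methods into the target class namespace
    # and remember which names were added.
    added = []
    for name, value in vars(mixin).items():
        if callable(value) and not name.startswith("__"):
            setattr(target, name, value)
            added.append(name)
    return added


def unmix(target, names):
    # Remove exactly the names that mix() added.
    for name in names:
        delattr(target, name)


def render(node):
    added = mix(Node, Renderable)
    try:
        # While mixed in, every Node instance has the mixin's methods.
        return node.describe()
    finally:
        # Restore the Node class even if rendering fails.
        unmix(Node, added)


result = render(Node("section"))
```

Once render returns, Node no longer has a describe method, so the document object is left as it was before rendering and can be handed to another renderer.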