Moving to a new blog

It has been almost three years since I last posted here. I got busy, both in my personal and professional life. I am now attempting to restart the blog, but not on WordPress. I am moving to a static website generator and hosting my blog on GitHub. I even wrote a new post on correcting math escape in t-vim. Go check it out and update your bookmarks!

Metapost and TeX labels

Default Metapost has the concept of two types of labels, postscript labels and TeX labels. Postscript labels are created using

label("text", location);

while TeX labels are created using

label(btex text etex, location);

In the latter case, Metapost collects everything between btex and etex in a separate file, processes that file through TeX, and includes the resulting postscript code at an appropriate location. Such a Goldberg-esque mechanism is needed to properly typeset mathematics, get proper kerning, etc.; tasks that TeX can do but Postscript cannot.

ConTeXt has always been tightly integrated with Metapost, but in the pdftex days typesetting labels was slow. ConTeXt (i.e. pdftex) calls Metapost (to draw a figure, say), and then Metapost calls pdftex (to typeset a label) and imports the result into the postscript; this postscript file is passed to ConTeXt and translated into PDF code using TeX macros, and the result is inserted in the PDF file that pdftex is generating. Phew!

Six or seven years ago, Hans Hagen and Mojca Miklavec had an idea to speed up this process by collecting all the labels at the ConTeXt end, typesetting them in boxes, and passing on the dimensions of the boxes to Metapost. See Mojca’s My Way describing this mechanism. To use this feature, one had to type:

label(\sometxt{text}, location);

With luatex, Metapost can be called as a library, and the basic idea of preprocessing the labels at the TeX end and passing the resulting dimensions to Metapost has been implemented more robustly in Lua. (The conversion of the PS generated by Metapost to PDF is also done in Lua). So one could just type

label(btex text etex, location);

and ConTeXt would parse the Metapost environment and do all the book-keeping at the back.

However, I had always been dissatisfied with the user interface. There are very few situations where I as a user want postscript labels. So, why not redefine the label macro so that

label("text", location);

is equivalent to the btex ... etex version? I had made such a suggestion back in 2007, and for some years had been using a private macro for such purposes.

Today, while answering a question at TeX.SX, I noticed that there is now no difference between the postscript and the TeX labels! For example,

\setupbodyfont[times]
\starttext

\startMPpage[offset=2mm]
  draw "$w$" infont defaultfont scaled defaultscale;
  draw btex $p$ etex shifted (1cm,0);
\stopMPpage
\stoptext

gives

result

Notice that both w and p are in the Times Math font. So, there is no need for those pesky btex ... etex tags anymore.

I really don’t know when this change was implemented. As far as I can tell, nothing has changed at the ConTeXt end, so it appears that MetaPost is now directly parsing the postscript labels using TeX. Nonetheless, this means that there is one less thing to worry about while learning and using Metapost. Yay!

Announcing the overview module

A few years ago, I came across impressive, which is a Python script that adds extra oomph to presentations. It uses OpenGL for effects like highlight boxes, spotlight effect, and overview pages. If you haven’t used it, I’d definitely recommend giving it a try. I don’t use it personally as I find it to be a bit slow and unreliable, but I like the options that the script provides. So, I thought that it would be useful to implement these effects in TeX.

Ever since I first came across this script, I have thought that it would be useful to add some of these presentation features to my workflow. It has taken a while to get around to actually doing that.

The feature that I liked the most is the overview page: at the end of the presentation, show the thumbnails of the slides, and if someone has any question, you can click on the thumbnail and go to any slide in the presentation. In fact, I’d prefer not to show all the slides, but only the important ones (like the start of a new section).

Thanks to Wolfgang Schuster and Hans Hagen, I am happy to announce the overview module that provides this feature. I have yet to write the complete documentation, but since it is the new year, I thought that I’d at least announce the module. The basic usage is as follows:

\usemodule[overviewpage][level=section]

\setupinteraction[state=start]
\setuppapersize[S4]
\setuphead[section][page=yes]

\starttext

\startsection[title={....}]
  ...
\stopsection

\startsection[title={....}]
  ...
\stopsection

\startsection[title={....}]
  ...
\stopsection

\placeoverviewpage
\stoptext

The \placeoverviewpage macro creates the overview slide. The easiest way to explain the output is through an example.


\setupinteraction[state=start]
% Minimalistic subsection layout
\setuppapersize[S4]
\setupwhitespace
\setuphead[section, subsection][color=blue, page=yes]
\setupbodyfont[dejavu, sans, 12pt]
\usemodule[overviewpage][level=section]
% Filler text
\startbuffer[subsection]
\startsubsection[title={Some random bullet points}]
\startitemize
\item First point
\item Second point
\item Third point
\item And so on \unknown
\stopitemize
\stopsubsection
\stopbuffer
\starttext
\startsection[title={First topic in the presentation}]
\startplacefigure[location={here,nonumber}, title={A cute cat}]
\externalfigure[http://placekitten.com/g/800/300][method=jpg, width=0.7\textwidth]
\stopplacefigure
According to the above layout, a topic consists of multiple subsections. Think of
it as a \type{section} in \type{beamer}.
\dorecurse{4}{\getbuffer[subsection]}
\stopsection
\startsection[title={Second topic in the presentation}]
\startplacefigure[location={here,nonumber}, title={A cute cat}]
\externalfigure[http://placekitten.com/g/800/350][method=jpg, width=0.7\textwidth]
\stopplacefigure
Another topic and some random stuff to explain about this topic.
\dorecurse{4}{\getbuffer[subsection]}
\stopsection
\startsection[title={Third topic in the presentation}]
\startplacefigure[location={here,nonumber}, title={A cute cat}]
\externalfigure[http://placekitten.com/g/800/380][method=jpg, width=0.7\textwidth]
\stopplacefigure
Another topic and some random stuff to explain about this topic.
\dorecurse{4}{\getbuffer[subsection]}
\stopsection
\startsection[title={Fourth topic in the presentation}]
\startplacefigure[location={here,nonumber}, title={A cute cat}]
\externalfigure[http://placekitten.com/g/800/325][method=jpg, width=0.7\textwidth]
\stopplacefigure
Another topic and some random stuff to explain about this topic.
\dorecurse{4}{\getbuffer[subsection]}
\stopsection
\placeoverviewpage
\stoptext

The final page of this file is:

Overview page created by the above code

Clicking on one of the rectangles takes you to the corresponding page. Check out the complete PDF file.

The only caveat in using the module is that you can only create a thumbnail of a numbered head. Thus, unnumbered heads like subject, subsubject, etc. do not work. If you want to create a thumbnail of an unnumbered head, the best way is to declare it as a numbered head but not display the number. For example:

\setuphead[section][numbercommand=\gobbleoneargument]

Please go and give the module a spin and let me know what you think.

If you don’t want to send plain text email, don’t include a text/plain part

I received an email from American Express, which read:

INTUNIRCUPD0007

Yep, that was it. Upon further inspection, the email turned out to be a multi-part message that started as follows:


This is a multi-part MIME message.

--CQVGJcnHtdDpdAC6J2fqf8Hxwe10nXgbL49ikw==
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

INTUNIRCUPD0007
--CQVGJcnHtdDpdAC6J2fqf8Hxwe10nXgbL49ikw==
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

<?xml version="1.0" encoding="ISO-8859-1"?> 
... [100 odd lines omitted] ....

So, American Express wants to send me an HTML formatted email. Fine. But why include the one-line plain text blurb? Just omit it and let me, and others like me who prefer plain text email, deal with it. Had I not been expecting this email, I might have just deleted it without looking at the HTML version of it. Sigh!
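For senders who do want to include a plain text part, here is a minimal sketch using Python's standard email library. The addresses, subject, and body text are made up for illustration; the only point is that the text/plain part carries the same information as the HTML part rather than an internal code.

```python
from email.message import EmailMessage


def build_notice(subject: str, plain_body: str, html_body: str) -> EmailMessage:
    """Build a multipart/alternative message with matching plain and HTML parts."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = "notices@example.com"   # hypothetical sender
    msg["To"] = "customer@example.com"    # hypothetical recipient
    # The first part is text/plain: this is what plain text mail readers show.
    msg.set_content(plain_body)
    # Adding an alternative turns the message into multipart/alternative.
    msg.add_alternative(html_body, subtype="html")
    return msg


notice = build_notice(
    "Your statement is ready",
    "Your statement is ready. Log in to your account to view it.",
    "<html><body><p>Your statement is ready. "
    "Log in to your account to view it.</p></body></html>",
)
part_types = [part.get_content_type() for part in notice.walk()]
print(part_types)
```

A reader who prefers plain text then sees a complete sentence, and a reader who prefers HTML sees the formatted version of the same message.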

Those who know history find new ways to err

It is part of typesetting folklore that Donald Knuth was so upset on seeing the galley proofs of The Art of Computer Programming that he decided to write a computer program that followed the traditions of classical typesetting. And, hence, TeX was born.

For anyone who has to typeset math, TeX is the next best thing to sliced bread. I have been using TeX (in fact, LaTeX exclusively) to submit journal articles to IEEE journals. But it appears that in the race to become more adapted to online publishing, many journals are changing their publishing tools. And, in the case of IEEE, it appears that TeX is no longer in their tool chain.

I first noticed that something was amiss when I was proofreading the galley proofs of a paper that was to appear in IEEE Transactions on Information Theory. I used \mathscr from the rsfs package to denote sets (as it is more ornate than \mathcal). The paper had an expression

\sum_{q \in \mathscr Q} ...

In the galley proof, the script Q in the subscript was of normalsize, rather than scriptsize. So, I pointed that out in my corrections. In the corrected version, the script Q was of a size between normalsize and scriptsize, with a comment from the typesetter that “we cannot make it smaller”. It was at that time that I realized that they were not using TeX. In fact, all mathematical symbols were inserted as images, which has two consequences:

  1. It throws accessibility out of the window. These days I read most papers on my iPad. It is a pain to annotate images (normal tools like highlight and underline do not work with images).
  2. It increases the file size by a factor of 10. Just take a look at the average filesize for papers in the current issue (where the filesize is 2MB to 10MB) and early access papers (which are author preprints before they have been processed by the journal; for these the average filesize is 500-600kB).

Last month I went through another round of painful proofreading of galley proofs for a paper that is to appear in IEEE Transactions on Automatic Control. Again, all the math was converted to images, and the niceties of TeX, like the correct vertical location of : in \coloneqq, were lost.

As an author, publishing in IEEE has become more painful for me (due to the typographic errors introduced by whatever software IEEE is using). As a reader, reading IEEE papers has become more painful for me (as I have to download papers that are 5 times bigger in size, and have math typeset as embedded images). Oh well, the irony of IEEE not accepting documents with Type 3 (bitmap) fonts is not lost on me.

This whole process makes me wonder: do we still need big publishers in this day and age of online publishing? Why should we (the authors who generate the content, the reviewers who verify it, the editors who carefully curate it) offer our voluntary services to a big publisher like IEEE when we get nothing in return (no copy editing, no free open access)? I wish I were in a field where I could publish exclusively with good publishers like SIAM and ACM (I have published once with SIAM; they actually copy edited the content and gave me the final TeX files when they were done), or in online journals like JMLR.

Creating a clean presentation style in 40 commits

Did you always want to learn ConTeXt, but did not know where to start? I have written a git-based tutorial that should help you get started.

The idea of the tutorial is to start with an empty document, and add features one-by-one. Each git commit corresponds to one small change in the document, and includes pointers to the documentation corresponding to that change.

Separation of content and presentation for tables (part 1)

Separation of content and presentation is one of the selling points of TeX over word-processors. Strictly speaking, TeX is not superior to word-processors in this regard. It is possible to obtain a clean separation between content and presentation in word-processors (using styles) and it is possible to mix content and presentation in TeX code, as is illustrated by the following example from the sample TeX file for the IEEE Conference on Decision and Control:

\title{\LARGE \bf
Preparation of Papers for IEEE CSS Sponsored Conferences \& Symposia
}

(Seriously, how can anyone recommend writing TeX code like that!) In spite of the falseness of the argument, the general sentiment is true. It is much easier to write structured code (that separates content and presentation) in TeX than in word-processors. A testament to this is the ease with which one can convert a LaTeX document written in the style of one publishing house to that of another by simply changing the class file.

However, when it comes to tabular data, TeX, or rather LaTeX, is a mess. Simply browse through the questions tagged tables on TeX.SE if you don’t believe me. In this blog post, I want to argue that a clean separation between content and presentation is possible in TeX. The mess that is LaTeX tables is a limitation of LaTeX, and not of TeX. To illustrate this point, I’ll use ConTeXt and LuaTeX.

Let’s start with a simple example, a table which was typeset using the following code:

\bTABLE
    \bTR
        \bTD Course      \eTD
        \bTD Description \eTD
        \bTD Term Taught \eTD
        \bTD Enrollment  \eTD
    \eTR

    \bTR
        \bTD NAME 101 \eTD
        \bTD A description of the course that is typically one paragraph long.
             A description of the course that is typically one paragraph long.
             A description of the course that is typically one paragraph long. \eTD
        \bTD Fall 2010 \eTD
        \bTD 45 \eTD
    \eTR
    \bTR
        \bTD NAME 215 \eTD
        \bTD A description of the course that is typically one paragraph long.
             A description of the course that is typically one paragraph long.
             A description of the course that is typically one paragraph long. \eTD
        \bTD Winter 2011 \eTD
        \bTD 120 \eTD
    \eTR
    \bTR
        \bTD NAME 555 \eTD
        \bTD A description of the course that is typically one paragraph long.
             A description of the course that is typically one paragraph long.
             A description of the course that is typically one paragraph long. \eTD
        \bTD Fall 2012 \eTD
        \bTD 15 \eTD
    \eTR
\eTABLE

The ConTeXt interface is relatively clean. Rows are indicated by \bTR...\eTR and cells by \bTD...\eTD. The names of the commands and the user interface are inspired by HTML tables.

So far, there is a clear separation between content and presentation, simply because we haven’t tweaked the presentation at all. Now suppose I want to typeset the header as white on blue.

The clean way to achieve this is to define a new setup

 \startsetups table:header
 \setupTABLE[row][first][background=color, backgroundcolor=darkblue, color=white, style=bold]
 \stopsetups

and simply change the first line of the table to

\bTABLE[setups={table:header}]

Note that the presentation element (how to style the first row) is defined in the document preamble, and the setup can be shared in all the tables that need that particular style. Now, suppose that in addition to the header, we want to remove the vertical lines in the middle of the table.

Again, to achieve this, define a new setups as follows:

\startsetups table:frame
  \setupTABLE[frame=off]
  \setupTABLE[topframe=on,bottomframe=on]
  \setupTABLE[column][first][leftframe=on]
  \setupTABLE[column][last][rightframe=on]
\stopsetups

and add the setups table:frame to the first line of the table

\bTABLE[setups={table:header, table:frame}]

Continuing this way, suppose we want to change the alignment of cells, say vertically middle align the first column, horizontally middle align the third column, add hyphenation to the second column; and add some offset between the cells.

(I am not arguing that this is a good visual style; just using this as an example without making the use case too complicated). As before, we define a new setups

\startsetups table:style
  \setupTABLE[column][1][align={middle,lohi}]
  \setupTABLE[column][2][align={normal,hyphenated,verytolerant}]
  \setupTABLE[column][3][align=middle]
  \setupTABLE[loffset=1mm,roffset=1mm]
\stopsetups

and add the setups table:style to the first line of the table.

\bTABLE[setups={table:header, table:frame, table:style}]

See, separation of presentation and content need not be difficult in TeX. Let’s see if this approach is flexible to change. Suppose I don’t like the vertical middle alignment of the first column. I can simply change the \setupTABLE[column][1][align=...] to my liking, and the change will be applied to all tables using the table:style setups. (Contrast this with what you need to do in LaTeX to achieve the same, and you’ll understand why LaTeX tables are considered hard.)

The table above is a simple case. In a future blog post, I’ll show how one can use Lua to simplify typesetting of complicated tables, while still maintaining a separation of content and presentation.

Announcing the visualcounter module

It has been almost two years since I posted about the main idea of the visualcounter module. I am happy to announce the official release of the module. I have been using this module in my presentations for almost two years without any problems, so I believe that it is stable enough to be released.

At present, the module is available on github and it should be available through ConTeXt garden soon. Look at the documentation to see some of the features of the module (in particular, the "star rating" example based on Jim Hefferon’s article in the Practex Journal).

The module provides six counters. Two of these were created for proof of concept and are not well tested; the remaining four—scratchcounter, mayanumbers, markers, and countdown—are well tested and, hopefully, their interface will not change.

This was my first module that uses the ConTeXt namespace macros. If you peek into the module, you’ll notice that I only define one macro; everything else is handled by the ConTeXt namespace macro \definenamespace.

The other interesting feature of this module is that I use a separate metapost instance for displaying the counters. This avoids conflict with user definitions. For example, if a user decides to change the metapost definition of fill for whatever reason,

\startMPdefinitions
let fill = draw;
\stopMPdefinitions

such a change will not affect the visual counter module!

Any feedback is appreciated.

Removing multiple blank lines when typesetting code listings

The listings package in LaTeX has an option to collapse multiple empty lines into a single empty line when typesetting code listings. Today, there was a question on TeX.SE on how to do something similar when using the minted package. Since the vim module uses the same principle as the minted package, I wondered how one could collapse multiple empty lines into a single line there.

One of the features of the vim module is that you can source an arbitrary vimrc file before processing the code through the vim editor to generate syntax highlighted code. This feature makes it possible to delegate the task of collapsing multiple blank lines into a single blank line to vim, the editor. Since the vim module first writes the source code in a file with extension .tmp, the following vimrc snippet will collapse all multiple blank lines into a single blank line whenever a .tmp file is loaded:

au BufEnter *.tmp %s/\(^\s*\n\)\{2,\}/\r/ge | w

Use this inside the vim module as follows (example also available on github):

\usemodule[vim]

\startvimrc[name=collapse]
au BufEnter *.tmp %s/\(^\s*\n\)\{2,\}/\r/ge | w
\stopvimrc


\definevimtyping[CPPtyping][syntax=cpp, vimrc=collapse]

\starttext
\startCPPtyping
  i++;


  i++;






  i--;
\stopCPPtyping
\stoptext

Agreed, this is not as simple as the emptylines=1 option in the listings package. But it is not too complicated when you consider the fact that I had not thought about this feature at all when I wrote the vim module.
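To see what the substitution pattern does without firing up vim, here is a rough Python equivalent. It is only a sketch for illustration and is not part of the vim module; the function name is made up. One difference worth noting: in vim, \s matches only spaces and tabs, so the Python version spells that out as [ \t].

```python
import re

# Two or more consecutive blank (or whitespace-only) lines.
# MULTILINE lets ^ match at the start of every line.
BLANK_RUN = re.compile(r"(?:^[ \t]*\n){2,}", re.MULTILINE)


def collapse_blank_lines(source: str) -> str:
    """Replace every run of two or more blank lines with a single blank line."""
    return BLANK_RUN.sub("\n", source)


code = "  i++;\n\n\n  i++;\n\n\n\n\n\n\n  i--;\n"
print(collapse_blank_lines(code))
```

A single blank line is left alone, because the pattern requires at least two blank lines in a row before it matches.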

How I stopped worrying and started using Markdown like TeX

These days I type most of my simple documents (short articles, blog entries, course notes) in Markdown. Markdown provides only the basic structured elements (sections, emphasis, urls, lists, footnotes, syntax highlighting, simple tables and figures) which makes it easy to transform the input into multiple output formats. Most of the time, I still want PDF output and for that, I use pandoc to convert Markdown to ConTeXt. At the same time, I have the peace of mind that if I need HTML or DOC output, I’ll be able to get that easily.

For most of the last decade, I have almost exclusively used LaTeX/ConTeXt for writing all my documents. After moving to Markdown, I miss three features of TeX: separation of content and presentation; conditional inclusion of content; and including external documents. In this post, I’ll explain how to get these with Markdown.

Separation of content and presentation

TeX gives you a lot of control for creating new structural elements. Let’s take a simple example. Suppose I want to write a file name in a document. Normally, I want the filename to appear in typewriter font. In LaTeX, I could type it as

\texttt{src/hello.c}

but it is better to define a custom macro \filename and use

\filename{src/hello.c}

The advantage is two-fold. Firstly, while writing the file, I am thinking in terms of content (filename) rather than presentation (typewriter font). Secondly, in the future, if I want to change how a filename is displayed (perhaps as a hyper-link to the file), all I need to do is change the definition of the macro. Markdown, with its simplistic structure, lacks the ability to define custom macros.

Conditional compilation

TeX also makes it trivial to generate multiple versions of the document from the same source. Again, let’s take an example. Suppose I am writing notes for a class. Normally, I like to include a short bullet list on my lecture slides, but include a detailed description in the lecture handout. In ConTeXt I can use modes as follows (LaTeX has a similar feature using the comment package):

Feature of the solution
\startitemize[n] 
   \item Feature 1 

     \startmode[handout] 
       Explanation of the feature ... 
     \stopmode 

   \item Feature 2 

     \startmode[handout]
       Explanation of the feature ... 
     \stopmode
\stopitemize

To generate the slides version of my lecture notes, I compile them using

context --mode=slides --result=slides <filename>

This version just contains the bullet list. Since the handout mode is not set, the content between \startmode[handout] ... \stopmode is omitted.

To generate the handout version of my lecture notes, I compile them using

context --mode=handout --result=handout <filename>

Since the handout mode is set, the content between \startmode[handout] ... \stopmode is included.

Such conditional compilation is extremely useful to keep the slides and handouts in sync. Again, Markdown, with its simplistic feature set, lacks the ability to do conditional compilation. Nor does Pandoc add this feature.

Including external documents

TeX makes it easy to include external documents. This is really important when you want to include source code in your documents. I teach an introductory programming class, and want to make sure that the example code included in my notes is correct. I write the code in a separate file, write the corresponding test files to ensure that the code works correctly, and then include it in my notes using

\typeJAVAfile[src/FactoryExample.java]

which gives me syntax highlighted source code. Pandoc does generate syntax highlighted source code, but does not provide any means to include external source code. So, I have to copy paste the code from the actual source file to the markdown document, but that is an error-prone process.


If I only cared about PDF output (via LaTeX/ConTeXt backend), I could simply use the same TeX macros in the markdown document. Pandoc passes the TeX macros unchanged to the LaTeX/ConTeXt backend, so I would get a TeX document with all the bells and whistles. But, if I tried to generate HTML or DOC output, these TeX macros would be omitted, and I’d get a broken document. One of my reasons for switching to Markdown was the peace of mind that I can generate HTML or DOC output if needed. Using TeX macros in the source takes away that advantage.

So, I started looking for possible solutions and found gpp—the generic pre-processor. It is similar to the C-preprocessor (that handles the #include, #define, stuff in C/C++) but provides many configuration options. I use it with the -H option, which requires macros to be specified in an HTML-like mode:

<#include "file">
<#define MACRO|value>
Use <#MACRO>

Normally the <#...> does not appear in a document, so using gpp is safe.
See the gpp documentation for complete details. I’ll show how to get the three features that I miss from TeX:

  1. Separation of content and presentation

    With gpp I can define new macros that denote new structural elements, e.g.,

    <#define filename|`#1`>
    The source is included in <#filename src/hello.c>

    When I compile the document using gpp -H, I get

    The source is included in `src/hello.c`

    Sure, this requires more typing than simply using `...`, but that is the price that one has to pay for getting more structure. More importantly, I can define the #filename macro based on the output format:

    <#define filename|`#1`>
    <#ifdef HTML>
         <#define filename|<code class="filename">#1</code>> 
    <#endif>
    <#ifdef TEX> 
         <#define filename|\\filename{#1}> 
    <#endif> 
    The source is included in <#filename src/hello.c>

    Now, if I compile the document using gpp -H -DHTML=1, I get

    The source is included in <code class="filename">src/hello.c</code>

    and if I compile using gpp -H -DTEX=1, I get

    The source is included in \filename{src/hello.c}

    This ensures that the document structure is passed to the output as well.

    To make it easy to manage macros, create three files: macros.gpp containing all macros, html.gpp overwriting some of the macros with HTML equivalents, and tex.gpp overwriting some of the macros with TeX equivalents. End the macros.gpp file with

    ....
    <#ifdef HTML> 
        <#include "html.gpp"> 
    <#endif> 
    <#ifdef TEX> 
         <#include "tex.gpp"> 
    <#endif>

    and then preprocess the document using gpp -H -DTEX=1 --include macros.gpp <filename> (or -DHTML=1 for HTML output).

  2. Conditional compilation

    Actually, the previous example already shows how to get conditional compilation: use the -D command line switch and check the variable definition using #ifdef. Thus, the ConTeXt modes example from earlier translates to:

    Feature of the solution
    
    1. Feature 1 
    
    <#ifdef HANDOUT> 
    Explanation of the feature ... 
    <#endif> 
    
    2. Feature 2 
    
    <#ifdef HANDOUT> 
    Explanation of the feature ... 
    <#endif>

    When I compile without -DHANDOUT=1, I get the slides version; when I compile with -DHANDOUT=1, I get the handout version.

  3. Including external documents

    External documents can be included with the #include directive. So, I can include an external file using

    ~~~ {.java} 
    <#include "src/Factory.java">
    ~~~
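
The conditional-compilation idea in item 2 can be illustrated with a toy filter in Python. To be clear, this is not how gpp is implemented, and it handles only flat (non-nested) <#ifdef NAME> ... <#endif> blocks whose markers sit on their own lines; the function name is made up. It is only meant to show what keeping or dropping a block does to the text.

```python
import re

# A flat <#ifdef NAME> ... <#endif> block, with both markers on their own lines.
IFDEF_BLOCK = re.compile(
    r"^<#ifdef (\w+)>[ \t]*\n(.*?)^<#endif>[ \t]*\n?",
    re.DOTALL | re.MULTILINE,
)


def keep_defined_blocks(text: str, defined: set) -> str:
    """Keep the body of each block whose name is defined; drop the others."""
    def choose(match):
        name, body = match.group(1), match.group(2)
        return body if name in defined else ""
    return IFDEF_BLOCK.sub(choose, text)


notes = (
    "1. Feature 1\n"
    "\n"
    "<#ifdef HANDOUT>\n"
    "Explanation of the feature ...\n"
    "<#endif>\n"
    "\n"
    "2. Feature 2\n"
)
slides = keep_defined_blocks(notes, set())          # like compiling without -DHANDOUT=1
handout = keep_defined_blocks(notes, {"HANDOUT"})   # like compiling with -DHANDOUT=1
print(handout)
```

For real documents, use gpp itself; it also handles nesting, #else, and macros with arguments, none of which this toy attempts.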

Putting it all together

All that is needed is to run the gpp preprocessor and then pass the output to pandoc.

gpp -H <options> <filename> | pandoc -f markdown -t <format> -o <outfile>

Hide this in a wrapper script or a shell function or a Makefile, and you have a markdown processor with the important features of TeX!
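
As one possible wrapper, here is a minimal Python sketch of the pipeline above. The function names and the file names in the example are made up for illustration, and it assumes that gpp and pandoc are on the PATH; it uses only the -H and -D switches of gpp and the -f, -t and -o switches of pandoc shown earlier.

```python
import subprocess


def build_commands(source, out_format, outfile, defines=()):
    """Return the gpp and pandoc argument lists for one conversion."""
    gpp_cmd = ["gpp", "-H"]
    gpp_cmd += [f"-D{name}=1" for name in defines]   # e.g. -DTEX=1, -DHANDOUT=1
    gpp_cmd.append(source)
    pandoc_cmd = ["pandoc", "-f", "markdown", "-t", out_format, "-o", outfile]
    return gpp_cmd, pandoc_cmd


def convert(source, out_format, outfile, defines=()):
    """Run gpp on the source and feed its output to pandoc on stdin."""
    gpp_cmd, pandoc_cmd = build_commands(source, out_format, outfile, defines)
    preprocessed = subprocess.run(gpp_cmd, check=True, capture_output=True).stdout
    subprocess.run(pandoc_cmd, input=preprocessed, check=True)


# Show the command line that would be run, without running it.
gpp_cmd, pandoc_cmd = build_commands(
    "notes.md", "context", "notes.tex", ["TEX", "HANDOUT"]
)
print(" ".join(gpp_cmd), "|", " ".join(pandoc_cmd))
```

Calling convert("notes.md", "context", "notes.tex", ["TEX"]) would then run the actual conversion; building the argument lists separately makes it easy to check what will be executed first.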