Update for the filter module: faster caching

Over the last year, the code base of the filter module has matured considerably. The module now has all the features that I wanted when I started working on it about a year and a half ago. The last remaining limitation (in my eyes, at least) was that caching of results required a call to an external program (mtxrun) to calculate MD5 hashes; as a result, caching was slow. That is no longer the case. Since early December, MD5 sums are calculated at the Lua end, so there is no time penalty for caching. As a result, in MkIV, recompiling is much faster for documents that have lots of external filter environments with caching enabled (i.e., environments defined with the continue=yes option).
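
As a rough illustration of what such an environment looks like, here is a minimal sketch. The filter name markdown and the pandoc command line are assumptions chosen only for this example (pandoc must be installed for it to run); continue=yes is the option mentioned above.

```tex
\usemodule[filter]

% Hypothetical filter: convert Markdown to ConTeXt with pandoc.
% continue=yes turns on caching, so the external command should be
% rerun only when the content of the environment changes.
\defineexternalfilter
  [markdown]
  [filtercommand={pandoc -t context -o \externalfilteroutputfile\space \externalfilterinputfile},
   continue=yes]

\starttext
\startmarkdown
Some *Markdown* text that is passed through pandoc.
\stopmarkdown
\stoptext
```

With many such environments in a document, the saving comes from skipping the external call for every environment whose content is unchanged.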

Some thoughts on lowering the learning curve for using TeX (part I)

TeX has a steep learning curve, often steeper than it needs to be. Take, for example, the special characters in TeX. Almost every introduction to plain TeX, eplain, LaTeX, or ConTeXt has a section on these special characters:

\ { } $ & # ^ _ % ~

A good introduction then goes on to explain why these special characters are important; sometimes dropping a hint about category codes. I feel that these details are useless and, at the user level, we should get rid of them.

HTML export

The question of translating TeX to (X)HTML arises frequently. Almost everyone wants it. After all, on the web, (X)HTML is the de facto standard markup; PDF, with all its hyperlink abilities, is clumsier to use. On the other hand, for print, PDF (especially TeX-generated PDF) is the de facto standard, at least in math-heavy fields; (X)HTML, with all its print CSS abilities, is clumsier to use. Often, you want both an (X)HTML version and a PDF version of a document. With the popularity of e-ink devices, EPUB (which is essentially zipped (X)HTML) is also becoming popular. Generating these multiple output formats from the same source is tricky.

Reading files off the web

For some time now, ConTeXt has been able to read files directly off the web. So, you can write


\starttext
\component {https://docs.google.com/MiscCommands?command=saveasdoc&docID=0AbXLuM4SVI8PZGc2Y2QzamhfOWNzM3Z3aGM5&exportFormat=txt}
\stoptext

and process it with context. ConTeXt will figure out that the file name is a URL, use curl to download it, and read the downloaded file. On subsequent runs, ConTeXt just checks whether a file with the same name exists and, if so, uses it.

I am thinking of using this for collaborative editing. I still need to figure out how to force ConTeXt to download the file again.

Atypical text and math fonts for presentations

Nowadays, almost everyone uses beamer for presentations (at least, the math-heavy presentations that I tend to see at conferences). Beamer is great; it provides a variety of styles, and all of them look professional. However, all of them use the same boring font: Computer Modern Sans Serif.

I want my presentations to look different. So, I have been looking at math fonts that look good for presentations. Stephen Hartke has an excellent survey of free math fonts for TeX. Over time, I have cycled through most of them for my presentations, and they were getting boring. So, I started looking for non-standard combinations.

The new TeX engines, XeTeX and LuaTeX, make it easy to use OpenType fonts, giving rise to new options to try. Here are two combinations that I like.

GFS Artemisia with Euler

Euler by Hermann Zapf is one of the most elegant math fonts. It was designed to resemble the handwriting of a mathematician. Zapf created Euler to match the Concrete font, which Knuth used in his book Concrete Mathematics. Although Concrete is a pleasant-looking font, I find it too typewriterish for presentations. Euler also pairs well with two other excellent fonts, Palatino and Charter, but for presentations I find both of them a bit too formal. A font that matches Euler in character without looking too formal or too arcane is GFS Artemisia. Here is how they look together.

Artemisia with Euler

In ConTeXt, the typescripts module by Wolfgang Schuster provides support for Artemisia (among many other fonts). With it, the typescript for using Artemisia with Euler is really simple:


\usetypescriptfile[type-gfs]
\usetypescript[artemisia]

\starttypescript[mainface]
\definetypeface   [mainface][rm][serif][artemisia]    [default]
\definetypeface   [mainface][ss][sans] [modern]       [default]
\definetypeface   [mainface][mm][math] [euler]        [default]   [text=rm]
\definetypeface   [mainface][tt][mono] [modern-vari]  [default]
\stoptypescript

\usetypescript[mainface]
\setupbodyfont[mainface]

The above image was generated using ConTeXt MkII with the XeTeX engine. Euler does not work with the current version of ConTeXt MkIV because MkIV does not contain a virtual font for Euler. I plan to get a working solution for Euler in MkIV soon.

Delicious with Cambria Math

Delicious by Jos Buivenga is a nice-looking sans serif font. The slight zig-zag shape makes it look a bit informal, but not amateurish. And it works surprisingly well with Cambria. Cambria is not free, but comes with Windows Vista and Microsoft Office. Here is how they look together.

Delicious with Cambria

Cambria is an OpenType math font and can only be used with the LuaTeX engine. ConTeXt comes with a typescript for Cambria, and Wolfgang’s typescript module provides support for Delicious. Here is the typescript for using Delicious with Cambria.


\usetypescriptfile[type-exljbris]
\usetypescript[delicious]

\starttypescript[mainface]
\definetypeface   [mainface][rm][serif][modern]       [default]
\definetypeface   [mainface][ss][sans] [delicious]    [default]
\definetypeface   [mainface][mm][math] [cambria]      [default, text=ss]
\definetypeface   [mainface][tt][mono] [modern-vari]  [default]
\stoptypescript

\usetypescript[mainface]
\setupbodyfont[mainface,ss]

(Footnote: The integral sign in Cambria looks funny. This is because the OpenType math specification does not fit the TeX model completely. See this thread for details.)

quote, backtick and TeX engines that handle unicode fonts

I really dislike TeX’s way of adding quotes: `quote' becomes ‘quote’ while ``double quote'' becomes “double quote”. These quotes look funny in a text editor. Entering proper Unicode quotes is easy (in vim, use digraphs '6 and '9 for single quotes and "6 and "9 for double quotes). Even if you do not want to use Unicode files, these quotes can, at the very least, be hidden behind macros. ConTeXt uses \quote and \quotation (which provide language-dependent quotes and adapt to nesting). I am sure that there are LaTeX packages that provide the same functionality. It may take a few more keystrokes to enter, but I think that Unicode input or macros are the correct way to deal with special characters, rather than ad-hoc abbreviations. (Yes, I do not like TeX’s way of doing accented letters either.)

A bigger problem with these intelligent quotes is that they are also applied where they are not needed, namely in source code listings. I have seen this in many books and tutorials. Nothing shouts TeX louder than a wrong quote in source listings. Consider, for example, a simple Ruby program that uses

str = 'ab\tc'

TeX (plain TeX, LaTeX, and ConTeXt MkII) will typeset this as

Quotes in MkII

TeX is being too smart. From what I have been told, the problem is with the font files. As a user, I expect the macro package to take care of such things. But, by default, both LaTeX and ConTeXt MkII show the wrong quotes, as in the above image. ConTeXt MkIV, however, does the right thing, giving

Quotes in MkIV

I am guessing that this is because ConTeXt MkIV assumes everything is Unicode and does not enable the texquotes feature for the font used for code listings. XeLaTeX also does the right thing: if you use a Unicode font, it does not enable texquotes. (I do not really know how to enable texquotes in XeLaTeX. I guess it should be possible to enable these features for serif and sans fonts, but not for the mono font.) So, even for users like me who primarily use TeX for English documents, an engine that works with Unicode fonts is useful.

TeX Programming: The past, the present, and the future

There was an interesting thread on the ConTeXt mailing list, which I am summarizing in this post. To make the post interesting, I changed the problem slightly. So, the solutions posted here were not part of the thread, but are in the same spirit.

Suppose you want to typeset (in ConTeXt) all possible sums of a roll of two dice. Something like this:

table

One way to do this would be to type the whole thing by hand:

\bTABLE
    \bTR \bTD $(+)$ \eTD \bTD 1 \eTD \bTD 2 \eTD \bTD 3 \eTD \bTD 4  \eTD \bTD 5  \eTD \bTD 6  \eTD \eTR
    \bTR \bTD 1     \eTD \bTD 2 \eTD \bTD 3 \eTD \bTD 4 \eTD \bTD 5  \eTD \bTD 6  \eTD \bTD 7  \eTD \eTR
    \bTR \bTD 2     \eTD \bTD 3 \eTD \bTD 4 \eTD \bTD 5 \eTD \bTD 6  \eTD \bTD 7  \eTD \bTD 8  \eTD \eTR
    \bTR \bTD 3     \eTD \bTD 4 \eTD \bTD 5 \eTD \bTD 6 \eTD \bTD 7  \eTD \bTD 8  \eTD \bTD 9  \eTD \eTR
    \bTR \bTD 4     \eTD \bTD 5 \eTD \bTD 6 \eTD \bTD 7 \eTD \bTD 8  \eTD \bTD 9  \eTD \bTD 10 \eTD \eTR
    \bTR \bTD 5     \eTD \bTD 6 \eTD \bTD 7 \eTD \bTD 8 \eTD \bTD 9  \eTD \bTD 10 \eTD \bTD 11 \eTD \eTR
    \bTR \bTD 6     \eTD \bTD 7 \eTD \bTD 8 \eTD \bTD 9 \eTD \bTD 10 \eTD \bTD 11 \eTD \bTD 12 \eTD \eTR
\eTABLE

I am using Natural Tables since it is very easy to change their output. For example, to get the effect shown in the above figure, I can just add

\setupTABLE[each][each][width=2em,height=2em,align={middle,middle}]
\setupTABLE[r][1][background=color,backgroundcolor=gray]
\setupTABLE[c][1][background=color,backgroundcolor=gray]

But that is not the point of this post. Typing everything by hand is error-prone and not reusable. I want to show how to automate the above task. In any ordinary programming language we would do this as follows (pseudocode):

"start table"
  "start table row"
      "table element: (+)"
      for y in [1..6] do
          "table element: #{y}"
      end
  "stop table row"
  for x in [1..6] do
      "start table row"
          "table element: #{x}"
          for y in [1..6] do
              "table element: #{x+y}"
          end
      "stop table row"
  end
"stop table"

Unfortunately, TeX is no ordinary programming language. The first thing that comes to mind is to use ConTeXt’s equivalent of a for loop, \dorecurse:

\bTABLE
    \bTR
    \bTD $(+)$ \eTD
    \dorecurse{6}
        {\bTD \recurselevel \eTD}
    \eTR
    \dorecurse{6}
      {\bTR
          \bTD \recurselevel \eTD
          \edef\firstrecurselevel{\recurselevel}
          \dorecurse{6}
            {\bTD \the\numexpr\firstrecurselevel+\recurselevel \eTD}
      \eTR}
\eTABLE

This, however, does not work as expected because \dorecurse is not fully expandable. One way to get around this problem is to expand the appropriate parts of the body of \dorecurse:

\bTABLE
  \bTR
  \bTD $(+)$ \eTD
  \dorecurse{6}
    {\expandafter \bTD \recurselevel \eTD}
  \eTR
  \dorecurse{6}
    {\bTR
        \edef\firstrecurselevel{\recurselevel}
        \expandafter\bTD \recurselevel \eTD
        \dorecurse{6}
        {\expandafter\expandafter\expandafter
         \bTD
            \expandafter\expandafter\expandafter
            \the\expandafter\expandafter\expandafter
            \numexpr\expandafter\firstrecurselevel\expandafter
            +%
            \recurselevel
        \eTD}
      \eTR}
\eTABLE

Behold! All those \expandafters. The reason they are needed was succinctly explained by David Kastrup in his TeX interview:

Instead, macros are used as a substitute for programming. TeX’s macro expansion language is the only way to implement conditionals and loops, but the corresponding control variables can’t be influenced by macro expansion (TeX’s “mouth” in Knuth’s terminology). Instead assignments must be executed by the back end (TeX’s “stomach”). Stomach and mouth execute at different times and independently from one another. But it is not possible to solve nontrivial programming tasks with either: only the unholy chimera made from both can solve serious problems. eTeX gives the mouth a few more teeth and changes some of that, but the changes are not really fundamental: expansion still makes no assignments.
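
A small sketch may make the mouth and stomach distinction concrete. The names \democount and \demosnapshot are made up for this illustration. Inside \edef only expansion happens, so \the\democount is replaced by the current value immediately, while \advance, being an assignment, is copied as is and only runs when \demosnapshot is used later.

```tex
\starttext
\newcount\democount
\democount=1
% Expansion only: \the\democount becomes 1 right now, but the assignment
% \advance\democount by 1 is stored unexecuted inside \demosnapshot.
\edef\demosnapshot{\advance\democount by 1 [\the\democount]}
\democount=10
% Using \demosnapshot finally executes the stored assignment (10 becomes 11)
% and typesets [1], the value that was captured when \edef ran.
\demosnapshot
\the\democount
\stoptext
```

The same timing issue is at play in the table: a value that is only correct while the loop is running has to be expanded at that moment, otherwise it is looked up later when it has already changed.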

Once you get the hang of it, adding all those \expandafters is “simple” (Taco Hoekwater in a post)

The trick to \expandafter is that you (normally) write it backwards until reaching a moment in time where TeX is not scanning an argument.

Say you have a macro that contains some stuff in it to be typeset by \type:

  \def\mystuff{Some literal stuff}

Then you begin with

  \type{\mystuff}

but that obviously doesn’t work, you want the final input to look like

  \type{Some literal stuff}

Since \expandafter expands the token that follows after next token — whatever the next token is — you have to insert it backwards across the opening brace of the argument, like so:

  \type\expandafter{\mystuff}

But this wouldn’t work, yet: you are still in the middle of an expression (the \type expects an argument, and it gets \expandafter as it stands).

Luckily, \expandafter itself is an expandable command, so you jump back once more and insert another one:

  \expandafter\type\expandafter{\mystuff}

Now you are on ‘neutral ground’, and can stop backtracking. Easy, once you get the hang of it.

Fortunately, in ConTeXt you do not need to do all this mental arithmetic. ConTeXt provides a command \expanded which expands its argument. It only works if the expanded code does not try to scan the next character. In this case, \bTD\eTD can be included in \expanded, while \bTR\eTR cannot. So we end up with:

\bTABLE
  \bTR
    \bTD $(+)$ \eTD
    \dorecurse{6}
        {\expanded{\bTD \recurselevel \eTD}}
  \eTR
  \dorecurse{6}
      {\bTR
          \expanded{\bTD \recurselevel \eTD}
          \edef\firstrecurselevel{\recurselevel}
          \dorecurse{6}
          {\expanded{\bTD \the\numexpr\firstrecurselevel+\recurselevel\relax \eTD}}
      \eTR}
\eTABLE

Wolfgang Schuster posted a much neater solution.

\bTABLE
  \bTR
  \bTD $(+)$ \eTD
  \dorecurse{6}
      {\bTD #1 \eTD}
  \eTR
  \dorecurse{6}
      {\bTR
      \bTD #1 \eTD
      \dorecurse{6}
          {\bTD \the\numexpr#1+##1 \eTD}
      \eTR}
\eTABLE

This makes TeX appear like a normal programming language. But only TeX wizards like Wolfgang can discover such solutions. You need to know the TeX digestive system inside out to even attempt something like this. Inspired by Wolfgang’s solution, I tried the same thing with ConTeXt’s lesser-known for loops:

\bTABLE
  \bTR
      \bTD $(+)$ \eTD
      \for \y=1 \to 6 \step 1 \do
          {\bTD #1 \eTD}
  \eTR
  \for \x=1 \to 6 \step 1 \do
    {\bTR
        \bTD #1 \eTD
        \for \y=1 \to 6 \step 1 \do
            {\bTD \the\numexpr#1+##1 \eTD}
     \eTR}
\eTABLE

Don’t worry if you don’t understand how the above works. With LuaTeX, even normal users have hope. Luigi Scarso posted the following code:

\startluacode
tprint = function(s) tex.sprint(tex.ctxcatcodes,s) end
tprint('\\bTABLE')
  tprint('\\bTR')
    tprint('\\bTD $(+)$ \\eTD')
    for y = 1,6 do
        tprint('\\bTD ' .. y .. '\\eTD')
    end
  tprint('\\eTR')
  for x = 1,6 do
      tprint('\\bTR')
      tprint('\\bTD ' .. x .. '\\eTD')
      for y = 1,6 do
          tprint('\\bTD' .. x+y .. '\\eTD')
      end
      tprint('\\eTR')
  end
tprint('\\eTABLE')
\stopluacode

Finally, with LuaTeX, we can implement simple algorithms in a simple way inside TeX. In this case, the pure TeX solution using \dorecurse wasn’t too difficult. But try to come up with a pure TeX solution that prints the average of the numbers.

Here is a hint. Convert the numbers to dimensions by multiplying by 1pt, do the averaging using \dimexpr, then get rid of the unit using \withoutpt, and hope that the fixed-precision arithmetic in TeX did not mess things up.
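
For what it is worth, here is a minimal sketch of that hint, assuming that “the numbers” means the 36 entries of the table. The register \rollsum is a made-up scratch name; \firstrecurselevel is used in the same way as in the \dorecurse examples above.

```tex
% \rollsum is a hypothetical scratch register used only in this sketch.
\newdimen\rollsum
\starttext
\rollsum=0pt
\dorecurse{6}
  {\edef\firstrecurselevel{\recurselevel}% remember the outer loop value
   \dorecurse{6}
     {% add the sum of the two faces, expressed in points
      \advance\rollsum by \dimexpr\firstrecurselevel pt+\recurselevel pt\relax}}
% 36 outcomes in total: \dimexpr does the division, \withoutpt drops the unit
Average roll: \withoutpt{\the\dimexpr\rollsum/36\relax}
\stoptext
```

The total here is 252, so the division happens to be exact. With other inputs the result is rounded to TeX’s fixed precision, and the sum itself must stay below TeX’s maximum dimension, which is exactly the kind of thing the hint warns about.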