Syntax highlighting engines: clean tex output

The vim module uses the vim editor to syntax highlight code snippets in ConTeXt. I thought that it would be straightforward to support other syntax highlighting engines: source-highlight, pygments, HsColor, etc. Unfortunately, that is not the case. None of these syntax highlighting engines were written with reuse in mind.

For example, consider a simple tex file:

This is an \important{important} text

Let's compare the tex files generated by various syntax highlighters:

source-highlight -f latex gives

% Generator: GNU source-highlight, by Lorenzo Bettini,
\mbox{}\textbf{\textcolor{Blue}{\textbackslash{}definestartstop}}\textcolor{Purple}{[important]} \\

\mbox{}\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [color=red, \\

\mbox{}\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ style=\textbf{\textcolor{Blue}{\textbackslash{}italic}}] \\

\mbox{}\textbf{\textcolor{Blue}{\textbackslash{}starttext}} \\

\mbox{}This\ is\ an\ \textbf{\textcolor{Blue}{\textbackslash{}important}}\textcolor{ForestGreen}{\{important\}}\ text \\

\mbox{}\textbf{\textcolor{Blue}{\textbackslash{}stoptext}} \\


pygmentize -f latex gives

This is an \PY{k}{\PYZbs{}important}\PY{n+nb}{\PYZob{}}important\PY{n+nb}{\PYZcb{}} text

HsColor-latex -partial gives

\hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \textcolor{red}{[}{\rm{}color}\textcolor{red}{=}{\rm{}red}\textcolor{cyan}{,}\\
\hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace \hsspace {\rm{}style}\textcolor{cyan}{=$\backslash$}{\rm{}italic}\textcolor{red}{]}\\
{\rm{}This}\hsspace {\rm{}is}\hsspace {\rm{}an}\hsspace \textcolor{red}{$\backslash$}{\rm{}important}\textcolor{cyan}{\{}{\rm{}important}\textcolor{cyan}{\}}\hsspace {\rm{}text}\\

HsColor and source-highlight use explicit LaTeX commands for spacing and formatting. Ouch! Pygments uses logical markup, but with cryptic command names. From the point of view of using pygments output in ConTeXt, though, the \begin{Verbatim} and \end{Verbatim} that wrap its output are a show stopper. (OK, not really. They can be bypassed with some effort.)

Based on my experience, I decided to clean up the output generated by 2context.vim:

This is an \SYN[Identifier]{\\important}\{important\} text

I assume only four TeX commands to be defined: \\, \{, and \} for backslash, open brace, and close brace; and \SYN[...]{...} for syntax highlighting. Thus, if anyone wants to reuse 2context in plain TeX or LaTeX, or in a yet-to-be-written macro package, they will not need to modify the output at all. I wish the other syntax highlighting programs did the same.
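To make the reuse claim concrete, here is a rough sketch of how a LaTeX user might supply those four commands. None of this comes from the vim module itself. The \SYN<Group> naming scheme, the blue colour, and the use of xcolor are my own assumptions for illustration, and the three escape commands are redefined only inside a group so that LaTeX's usual \\ is untouched elsewhere.

```latex
\documentclass{article}
\usepackage{xcolor}

% Hypothetical dispatcher: \SYN[Group]{text} hands the text to \SYNGroup.
% If \SYNGroup is undefined, \csname yields \relax and the text is typeset
% unchanged inside a group.
\def\SYN[#1]#2{\csname SYN#1\endcsname{#2}}

% Assumed style for one highlight group; the colour is an arbitrary choice.
\expandafter\def\csname SYNIdentifier\endcsname#1{\textcolor{blue}{#1}}

\begin{document}
\begingroup
  \ttfamily
  % Local definitions of the three escape commands. \chardef makes each one
  % typeset the character itself in the current (typewriter) font, and it
  % does not swallow a space that follows in the highlighted output.
  \chardef\\=`\\
  \chardef\{=`\{
  \chardef\}=`\}
  % Output of 2context.vim, pasted without modification:
  This is an \SYN[Identifier]{\\important}\{important\} text
  \par
\endgroup
\end{document}
```

The point of the sketch is that the generated line is used verbatim, and only the surrounding definitions change from one macro package to another.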

