Using external filters

Markdown is a lightweight markup language inspired by the formatting conventions of plain-text email, so writing text in Markdown feels very natural. For some time I wondered: wouldn’t it be great if I could write Markdown directly in ConTeXt? Markdown support can be added in two ways:

  • write a markdown parser in TeX (no thanks) or in Lua (or use one that exists)
  • use an external program (like pandoc) to parse markdown and generate ConTeXt output

I am no fan of reinventing the wheel, so I decided to take the second route but make it more general than markdown. This is an announcement of a module that can run an arbitrary external program over a bunch of text. The module is at a preliminary stage, and I do plan to add more features in the future. But, I think that even in its current form, it will be useful to others. Download the module and place it in your current directory. To use it, simply say

\usemodule[filter]

An external filter for markdown is defined as follows:

\defineexternalfilter
  [markdown]
  [filter={pandoc -w context -o \externalfilteroutputfile}]

The filter should be a program that writes its output to the file \externalfilteroutputfile. The name of this file can be changed using

\setupexternalfilter
  [markdown]
  [output={...}]

It defaults to a sane value, so I suggest that you do not change it. The \defineexternalfilter command creates a \startmarkdown ... \stopmarkdown environment that can be used as such:

\startmarkdown
## Lists ##

Unordered (bulleted) lists use asterisks, pluses, and hyphens (`*`,
`+`, and `-`) as list markers. These three markers are
interchangeable. As an example, this:

    *   Candy.
    *   Gum.
    *   Booze.

gives:

*   Candy.
*   Gum.
*   Booze.
\stopmarkdown

which gives the following:

[Figure: typeset output of the Markdown example above]

Not bad for 50 lines of code! The module has some other bells and whistles, but I have not finalized the user interface yet. More on this module later.
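To see what such a filter does, it can help to run the same command by hand outside ConTeXt. The sketch below assumes that pandoc is on the PATH and that the input file is passed as the last argument; the file names input.md and output.tex are hypothetical stand-ins for the temporary files that the module manages itself.

```shell
#!/bin/sh
# Sketch: run the markdown filter by hand, outside ConTeXt.
# input.md and output.tex are hypothetical names, not the module's own files.
set -u

workdir=$(mktemp -d)
input="$workdir/input.md"
output="$workdir/output.tex"

# Write a small markdown snippet, as the \startmarkdown environment would.
cat > "$input" <<'EOF'
## Lists ##

*   Candy.
*   Gum.
*   Booze.
EOF

# Convert it to ConTeXt and show the result, if pandoc is available.
if command -v pandoc >/dev/null 2>&1; then
    pandoc -w context -o "$output" "$input" && cat "$output"
else
    echo "pandoc not found on PATH" >&2
fi
```

The generated file is what ConTeXt then reads back in place of the environment.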

Edit: Modified code so that it works with the latest version of the module.

29 thoughts on “Using external filters”

  1. An alternative approach is to use org-mode in Emacs which can generate LaTeX. I use org-mode almost exclusively for writing articles etc. these days. Of course, if you don’t use Emacs…

    • I don’t use Emacs… vim keystrokes are too strongly ingrained in my finger memory. I have tried markdown, RST, asciidoc, and a few other formats, and found that I need more features than they provide: automatic numbering of theorems, cross-referencing, etc. I will check out org-mode and see how it compares.

      • It’s always the tradeoff with these things, isn’t it? We want simple, elegant markup, but of course we need our favorite features present as well, and for some of us, those can’t-miss features are orthogonal to the simplicity that we want.

        Org-mode doesn’t have a way of hopscotching over that trouble entirely, but it does have a way to sneak in certain kinds of complexity without too much ugliness. Like pandoc, it passes through arbitrary LaTeX commands. This is unlike Markdown and some of the other minimal markup languages that are friendly to arbitrary HTML, but don’t grok LaTeX.

        Arbitrary LaTeX environments are also possible using the user-contributed package org-special-blocks, but I haven’t tried it.

        As a markup language org-mode attempts to enable some complexity that others don’t (support for tables, for example), and has some weaknesses (adding multiple character-level markups to the same span of text).

        What org-mode does well is the integration of an outlining/structure-editing environment with a minimal markup language, with the addition of several mechanisms that are useful for personal process management (tags, process-oriented keywords, etc.) And this stuff is pretty magic. As an outliner, it’s great. I left a 10 year relationship with vim for org + Emacs, and am writing my thesis in it.

        • I looked at Org-mode and it is very interesting. I will give it a shot…not for authoring but first as a replacement for zim (notebook) + remind (calendar).

  2. This is great! Glad you used a common markup language (markdown) and did not invent your own. That way I can reuse existing texts and (more importantly) don’t have to learn yet another markup language. Thanks!

    • Pandoc does not support textile. Are there any textile-to-ConTeXt converters? If not, even a textile-to-HTML converter could be made to work.

        • If you have an XML setup for parsing HTML (as described in Thomas’s My Way), you could use

          readcommand={\groupedcommand{\xmlprocessfile{xhtml}{#1}{}}{}}

          If you are lazy like me, you can use pandoc to convert the html to context for you, like

          
          \defineexternalfilter
            [textile]
            [filtercommand={redcloth \externalfilterinputfile | pandoc -r html -w context -o \externalfilteroutputfile}]
          

          (Not sure how wordpress will handle the formatting in comments. Here is a complete test file).

          • Works great. Now, if only I could write a shell script that takes a marked-up file as an argument, instead of having to paste it manually into a ConTeXt file each time. Somehow I couldn’t figure out how to include an external file between \starttextile and \stoptextile.

                • Sorry. A recent refactoring introduced a bug in \processcommand. It should now work correctly. See the test file tests/textile-external.tex on the github page.

              • Here’s my quick and dirty solution in one shell script file:

                #!/bin/bash
                echo "Typesetting: $1"
                Suffix=`echo $1|sed 's/^.*\.//'`
                FileName=`basename $1 .$Suffix`
                (
                cat < tmp.tex
                cat $1 >> tmp.tex
                (
                cat <> tmp.tex
                context  --batchmode tmp.tex
                mv tmp.pdf $FileName.pdf
                echo "Result: $FileName.pdf"
                
                • Well, well. WordPress doesn’t like here-documents. Too bad.

                  #!/bin/bash
                  echo "Typesetting: $1"
                  Suffix=`echo $1|sed 's/^.*\.//'`
                  FileName=`basename $1 .$Suffix`
                  (
                  cat < tmp.tex
                  cat $1 >> tmp.tex
                  (
                  cat <> tmp.tex
                  context --batchmode tmp.tex
                  mv tmp.pdf $FileName.pdf
                  echo "Result: $FileName.pdf"

                • Of course redirections got totally messed up. First here-document is just the first part of your example ConTeXt file up to \starttextile and the second is everything else from \stoptextile. And all this is written to tmp.tex.
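Going by that description, the script was presumably meant to look something like the sketch below. The contents of the two here-documents are an assumption, pieced together from the textile filter definition given earlier in this thread; they are not the commenter's original text. The helper name make_wrapper is hypothetical as well.

```shell
#!/bin/sh
# Sketch of the wrapper described above: sandwich a textile file between a
# ConTeXt preamble and postamble, then typeset the result.
# The here-document contents are assumed, based on the filter definition
# given earlier in the thread.
set -eu

make_wrapper() {
    # $1: textile source file; writes the wrapped document to tmp.tex
    cat > tmp.tex <<'EOF'
\usemodule[filter]
\defineexternalfilter
  [textile]
  [filtercommand={redcloth \externalfilterinputfile | pandoc -r html -w context -o \externalfilteroutputfile}]
\starttext
\starttextile
EOF
    cat "$1" >> tmp.tex
    # Guard against a source file that lacks a trailing newline.
    printf '\n' >> tmp.tex
    cat >> tmp.tex <<'EOF'
\stoptextile
\stoptext
EOF
}

# Only typeset when a file name was given on the command line.
if [ "$#" -ge 1 ]; then
    echo "Typesetting: $1"
    FileName=$(basename "${1%.*}")
    make_wrapper "$1"
    context --batchmode tmp.tex
    mv tmp.pdf "$FileName.pdf"
    echo "Result: $FileName.pdf"
fi
```

Quoting the here-document delimiter ('EOF') keeps the shell from touching the backslashes in the ConTeXt code.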

                  • Ah, that makes sense. If \processtextilefile is working for you, then it should be possible to format external textile documents using a custom module so that you can just say

                    context --usemodule=textile

                    • This should probably work

                      
                      \startmodule textile
                      
                      \usemodule[filter]
                      \defineexternalfilter[textile][...]
                      
                      \starttext 
                      \processtextilefile[\inputfilename]
                      \stoptext
                      \stopmodule
                      
                    • Hi Piotr,

                      Since you are the only one using this feature right now, I want to warn you in advance that I am changing the syntax for processing external files. In the next version, it will be \process<filter>file{...}, with the file name in curly brackets rather than square brackets. This is implemented in the dev branch on github. I will release the new version once I have documented some new features.

  3. I was testing your module. No luck. The following was produced in the PDF:

    “TODO: File test-externalfilter-markdown.tex not found! Check your definition”

    In the terminal it works fine, but in “smultron” I have the following settings in my “typeset” script:


    #!/bin/sh
    cd /Applications/context/
    . /Applications/context/tex/setuptex /Applications/context/tex
    cd %%d
    context %%p

    where %%d is the current working directory and %%p is the current document (full path).

    So maybe it’s a path issue that keeps it from working with smultron on OS X.

    Regards

    /Janneman

    On a DELL inspiron-2200 OSX86 leopard

  4. I just added “/usr/local/bin”, where pandoc lives, to the path in the “setuptex” script. Now it works fine.

    Maybe it is a “dirty” solution if I intend to upgrade the minimals one day. 😉

    /Janneman

    • If /usr/local/bin is in PATH, then setuptex should not remove it from the path. I do not know why that happens in your case.

      Another option is to use /usr/local/bin/pandoc in \defineexternalfilter, but that makes your source less portable.
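To find out what the editor's environment actually provides, a small check like the one below can be added to the typeset script right after setuptex has been sourced. This is only a diagnostic sketch; the function name check_on_path is hypothetical.

```shell
#!/bin/sh
# Sketch: report whether a program is reachable through the current PATH.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        # Print the PATH so a missing directory is easy to spot.
        echo "$1 not found; PATH is: $PATH"
        return 1
    fi
}

# Do not abort the surrounding script if pandoc is missing.
check_on_path pandoc || true
```

If pandoc is reported as missing, the printed PATH shows whether /usr/local/bin was dropped.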

      • You could try

        source /Applications/context/tex/setuptex
        

        instead of

        . /Applications/context/tex/setuptex
        

        I do not understand the difference between the two, but sometimes one works better than the other.
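A note on the two forms: in bash, source is a synonym for the dot command, while POSIX sh only guarantees the dot form, so the dot form is the portable choice in a #!/bin/sh script. Both run the named file in the current shell, which is why the variables it sets remain visible afterwards. The sketch below illustrates this with a hypothetical file demo-setup.sh and a hypothetical variable DEMO_TEXROOT; it does not touch the real setuptex.

```shell
#!/bin/sh
# Sketch: "." runs a file in the current shell, so variables it sets stay
# visible afterwards. demo-setup.sh and DEMO_TEXROOT are hypothetical.
demo_dir=$(mktemp -d)
cat > "$demo_dir/demo-setup.sh" <<'EOF'
DEMO_TEXROOT=/hypothetical/tex/root
export DEMO_TEXROOT
EOF

# Source the file into the current shell.
. "$demo_dir/demo-setup.sh"
echo "DEMO_TEXROOT=$DEMO_TEXROOT"
```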

  5. Pingback: Word clouds in ConTeXt « Random Determinism

  6. Very good website you have here but I was wanting to know if you knew of any message boards that cover the same topics talked about in this article? I’d really love to be a part of an online community where I can get feedback from other experienced individuals that share the same interest. If you have any suggestions, please let me know. Bless you!
