Word clouds in ConTeXt

A picture is worth a thousand words, so goes the cliché. The popularity of word clouds is a testament to this “Chinese proverb”. One such word cloud is at the bottom left of this blog. Now, what if you want to include a word cloud in a TeX document, as asked on tex.stackexchange? Here is the external filter module to the rescue, again.

First, we need a word cloud layout engine. I chose the IBM Word Cloud, the engine by Jonathan Feinberg that powers Wordle. To use this engine, install Java on your computer using your favorite method. Then, download the engine from IBM’s website (you need to fill out a silly registration form, sigh). Unzip the file in an appropriate directory (for simplicity, I removed the spaces from the directory name and moved everything to IBM-Word-Cloud). Edit the name of the font in the examples/configuration.txt file so that it names a font installed on your system. Run the run-example.sh (or run-example.bat) file to make sure that everything works correctly.

Once this engine is working, getting word clouds in ConTeXt is easy. Download the externalfilter module. Then,

\usemodule[filter]

\defineexternalfilter
  [wordcloud]
  [filtercommand=/opt/java/jre/bin/java -jar $HOME/IBM-Word-Cloud/ibm-word-cloud.jar 
    -c $HOME/IBM-Word-Cloud/examples/configuration.txt 
    -w 800 -h 600
    -o \externalfilteroutputfile\space
    -i \externalfilterinputfile,
  output=\externalfilterbasefile.png,
  readcommand=\ExternalFigure,
  continue=yes,
  ]

\def\ExternalFigure#1{\externalfigure[#1]}

This creates an environment \startwordcloud...\stopwordcloud that stores its contents in an external file, runs that file through ibm-word-cloud.jar and includes the result.
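Because readcommand is called with the name of the generated image, the helper macro is also the place to pass options to \externalfigure. A minimal sketch, assuming you want the 800x600 image scaled to the width of the text block (the width setting is my choice for illustration, not part of the original setup):

```tex
% Assumption: scale the generated PNG to the text width.
% #1 is the output file name that the filter module hands to readcommand.
\def\ExternalFigure#1{\externalfigure[#1][width=\textwidth]}
```

Any other \externalfigure option can be set the same way.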

Using this environment on the famous quote by Knuth gives

\starttext

\startwordcloud
Thus, I came to the conclusion that the designer of a new
system must not only be the implementer and first
large||scale user; the designer should also write the first
user manual.

The separation of any of these four components would have
hurt \TeX\ significantly. If I had not participated fully in
all these activities, literally hundreds of improvements
would never have been made, because I would never have
thought of them or perceived why they were important.

But a system cannot be successful if it is too strongly
influenced by a single person. Once the initial design is
complete and fairly robust, the real test begins as people
with many different viewpoints undertake their own
experiments. 
\stopwordcloud
\stoptext

[Figure: word cloud generated from Knuth's quote]
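If the text already lives in a separate file, the filter module should also let you skip the environment: as far as I know, \defineexternalfilter[wordcloud] defines a \processwordcloudfile macro alongside \startwordcloud. A sketch under that assumption, where knuth-quote.txt is a hypothetical file name chosen for illustration:

```tex
% Assumes the wordcloud filter from the definition above is already set up.
% knuth-quote.txt is a hypothetical plain-text file in the current directory.
\starttext

% Run the file through the word cloud engine and include the resulting image.
\processwordcloudfile{knuth-quote.txt}

\stoptext
```

This keeps long source texts out of the ConTeXt file; the same filtercommand and readcommand apply.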


9 thoughts on “Word clouds in ConTeXt”

    • Small addition. Another thing you may want to comment out (or configure properly) in the configuration.txt is the stopwordfile. Either comment it out or make sure it gets found. Otherwise generation fails without any useful message.

