A picture is worth a thousand words, or so goes the cliché. The popularity of word clouds is a testament to this “Chinese proverb”. One such word cloud is at the bottom left of this blog. Now, what if you want to include a word cloud in a TeX document, as asked on tex.stackexchange? Here is the external filter module to the rescue, again.
First, we need a word cloud layout engine. I chose the IBM Word Cloud, the engine by Jonathan Feinberg that powers Wordle. To use this engine, install Java on your computer using your favorite method. Then, download the engine from IBM’s website (you need to fill out a silly registration form, sigh). Unzip the file into an appropriate directory (for simplicity, I removed the spaces from the directory name and moved everything to IBM-Word-Cloud). Edit the name of the font in the examples/configuration.txt file. Finally, run run-example.sh (or run-example.bat) to make sure that everything works correctly.
Once this engine is working, getting word clouds in ConTeXt is easy. Download the externalfilter module. Then,
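% Load the external filter module.
\usemodule[filter]

% Define a wordcloud environment that passes its contents to the IBM Word Cloud jar.
% Adjust the paths to java and to the IBM-Word-Cloud directory for your system.
\defineexternalfilter
  [wordcloud]
  [filtercommand=/opt/java/jre/bin/java -jar $HOME/IBM-Word-Cloud/ibm-word-cloud.jar
     -c $HOME/IBM-Word-Cloud/examples/configuration.txt
     -w 800 -h 600
     -o \externalfilteroutputfile\space
     -i \externalfilterinputfile,
   output=\externalfilterbasefile.png,
   readcommand=\ExternalFigure,
   continue=yes,
  ]

% Include the generated PNG as an external figure.
\def\ExternalFigure#1{\externalfigure[#1]}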
This creates an environment \startwordcloud...\stopwordcloud that stores its contents in an external file, runs that file through ibm-word-cloud.jar and includes the result.
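If the generated 800×600 image does not fit the page, one option is to scale it when it is read back in. The following is only a sketch: it redefines the \ExternalFigure helper from the definition above and assumes that scaling to the text width is the layout you want. The width key is a standard \externalfigure option.

```tex
% Sketch: include the generated PNG scaled to the width of the text block.
% The choice of \textwidth is an assumption; use any dimension that suits the layout.
\def\ExternalFigure#1{\externalfigure[#1][width=\textwidth]}
```

Because readcommand only stores the macro name, this definition can replace the earlier one without touching the \defineexternalfilter call.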
For example, applying this to a famous quote by Knuth:
\starttext
\startwordcloud
Thus, I came to the conclusion that the designer of a new system must not only be the implementer and first large||scale user; the designer should also write the first user manual. The separation of any of these four components would have hurt \TeX\ significantly. If I had not participated fully in all these activities, literally hundreds of improvements would never have been made, because I would never have thought of them or perceived why they were important. But a system cannot be successful if it is too strongly influenced by a single person. Once the initial design is complete and fairly robust, the real test begins as people with many different viewpoints undertake their own experiments.
\stopwordcloud
\stoptext
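When the text already lives in a separate file, it may be more convenient to feed that file to the filter directly. As far as I know, the filter module also defines a \process<filter>file macro for each filter, so a wordcloud filter would get \processwordcloudfile. Treat this as a sketch that assumes your version of the module provides that macro; the file name knuth-quote.txt is hypothetical.

```tex
\starttext
% knuth-quote.txt is a hypothetical plain-text file in the current directory.
% Assumes the wordcloud filter has been defined with \defineexternalfilter as above.
\processwordcloudfile{knuth-quote.txt}
\stoptext
```

This keeps markup such as \TeX\ and || out of the text that the word cloud engine sees.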

Typo in title?
Thanks for pointing that out. Where is a spell-checker when you need one?
externalfilter module sounds good. We will discuss at Brejlov.
Excellent! Just came across Wordle yesterday and spent the evening and morning playing with it. And here it is for ConTeXt! Many thanks!
Small addition. Another thing you may want to comment out (or configure properly) in the configuration.txt is the stopwordfile. Either comment it out or make sure it gets found. Otherwise generation fails without any useful message.
Great! Thanks a lot for this. I'll make sure to include it on my page in the near future!
Seems like the jar download is gone. What to do with my IBM profile now?
Does anyone know where to get the jar file from?
The link given by the Wayback Machine (http://web.archive.org/web/20110105033434/http://www.alphaworks.ibm.com/tech/wordcloud/download) still works. Grab a copy before it disappears!