A ConTeXt style file for formatting RSS feeds for Kindle

As I said in the last post, I bought an Amazon Kindle Touch sometime back, and I find it very useful for reading on the bus/train while commuting to work.I use it read novels and books, a few newspapers and magazines that I subscribe to, and RSS feeds of different blogs that I follow. Until now, I had been using ifttt to send RSS feeds to Instapaper; Instapaper then emails a daily digest as an ebook to kindle account at midnight; in the morning, I switch on my Kindle for a minute; the Kindle syncs new content over Wifi; and off I go.

However, Kindle typesets ebooks very poorly,  so I decided to write a ConTeXt style file to typeset RSS feed (check it out on github).  To use this style:

\usemodule[rssfeed]

\starttext
\setvariables
    [rssfeed]
    [title={Title of the feed},
     description={Description of the feed},
     link={A link to the feed},
    ]

\starttitle[title={First feed entry}]
....
\stopttitle

\starttitle[title={Second feed entry}]
...
\stoptitle

\stoptext

It uses the eink module to set the page layout and fonts, and use a light and clean style for formatting feed entries. Since the proof is in the pudding, look at following PDFs to see the style for different types of blogs.

I use a simple ruby script to parse RSS feeds and uses Pandoc to convert the contents of each entry to ConTeXt. The script is without bells and whistles, and there is no site specific formatting of feeds. All feeds are handles the same way, and as a result, there are a few glitches: For example, IEEE uses some non-standard tags to denote math) which Pandoc doesn’t handle and the images generated by WordPress blogs that use $latex=...$ to typeset math are not handled correctly by ConTeXt, etc.

The script also uses Mutt to email the generated PDF to my Kindle account. This way, I can simply add a cron job that runs the script at appropriate frequency (daily for usual blogs, weekly for low traffic blogs, and once a month for table of contents of different journals).

Advertisements

Don’t complicate your code

A friend sent me a link to a blog post that discusses solutions to the Josephus problem in Perl, Ruby, and Python. Ouch! Those implementations hurt.

The Josephus problem is a simple discrete math puzzle:

Flavius Josephus was a roman historian of Jewish origin. During the Jewish-Roman wars of the first century AD, he was in a cave with fellow soldiers, 40 men in all, surrounded by enemy Roman troops. They decided to commit suicide by standing in a ring and counting off each third man. Each man so designated was to commit suicide…Josephus, not wanting to die, managed to place himself in the position of the last survivor.

In the general version of the problem, there are n soldiers numbered from 1 to n and each k-th soldier will be eliminated. The count starts from the first soldier. What is the number of the last survivor?

Keep reading

Programming in Maxima

Recently, in one of my papers I needed to do somewhat involved polynomial algebra. Basically, adding and substracting polynomials in p and 1-p with degree n. The calculations were relatively straight forward, but I always make silly mistakes with such calculations. So, I decided to check my analysis using a computer algebra system. Matlab and Mathematica, perhaps the most popular computer algebra softwares (although Matlab is focussed more on numerical calculations), were out of the picture because they are commercial and I did not have site licenses for them. So, I googled a bit and settled on Maxima.

To get started, I used wxMaxima, which provides a nice GUI for Maxima. wxMaxima makes learning maxima easy. It works as I expect an CAS IDE to work. After playing around the GUI for a while, I wanted to write a script to check my calculations. It turns out that writing a script is not that easy.

You can create a wxMaxima document, which is somewhat like a Mathematica notebook. The difference is that wxMaxima document is a text file, in fact a valid maxima file. Therefore, all the formatting information is stored in the comments. Here is an example:

/* [wxMaxima: comment start ]
In Proposition 1, we define a transformation $A_i$. Since we are only interested
in the symmetric arrival case, we only need to work with
   [wxMaxima: comment end   ] */

/* [wxMaxima: input   start ] */
Ap(n) := 1 - (1-p)^(n+1) ;
/* [wxMaxima: input   end   ] */

The display output in wxMaxima looks really nice. But, for some unexplainable reason, I can only think logically when I am editing text using vim. All the moving up and down in the wxMaxima window makes me lose my train of thought. Writing an elaborate markup so that wxMaxima pretty prints the comments was too cumbersome, so I needed to figure out how to write a script in maxima. (On second thoughts, maybe using the wxMaxima syntax was not too bad. But then, all progress is made by the unsatisfied man!)

Now, Maxima is meant to be an interactive program. So, it echos everything that it reads and it is not easy to distinguish the desired output from the rest. After a bit of digging into the manual, I found that ending a line with $ instead of ; does not show the output. However, if you read the file using batch, the input lines are still displayed, which is a bit annoying. After more digging around in the manual, I realized that I should be loading the file using batchload—which does not display input or output lines; only the output of print and display are shown. Unfortunately, maxima does not have a command line switch to batchload a file, but that can be overcome using --batch-string flag.

In short, if you are looking to use maxima as a normal programming language:

  1. End your lines with $ rather than ;
  2. Run the program using maxima --batch-string="batchload(\"filename\")$"

and be aware that maxima is meant for interactive use. For some functions, it uses asksign which requires explicit user intervention. Of course, there are workarounds.

Edit: Roman pointed out another neat trick; run your script as an initialization file using --init-mac. The only trouble is that init-mac is meant for initialization files, so after


maxima -q --init-mac=filename

you end up in interactive mode. To overcome that, either add an explicit quit(); at the end of your script, or add the options --batch-string="quit();".

Imitating Metapost using Haskell

Brent Yorgey has released a Haskell EDSL for creating diagrams. Here is my first impression of the library, comparing it with another DSL for creating diagrams — Metapost.

lineLets start with the simplest example: drawing a straight line between two points. First the metapost code

draw (0,0)--(100,100)
withpen pencircle scaled 4bp withcolor red;

Now the same code in Haskell (this needs to be wrapped around in a function and rendered to a png or pdf, but I’ll leave that part out).

lineColor red . lineWidth 4 .
  straight $ pathFromVectors [(0,0),(100,-100)]

I like the Metapost style better (since I have been using that for ages). One minor annoyance with diagrams is that positive values of y goes downwards rather than upwards. But the main trouble for me is prefix versus infix style. For diagrams, I find the infix style much easier to read. In this post, I will try to persuade Haskell to follow that style. The transforms in the diagram package have the signature Something -> Diagram -> Diagram, while I want Diagram -> Something -> Diagram. That is easy to fix

linecolor  = flip lineColor
linewidth  = flip lineWidth
fillcolor  = flip fillColor

Now we can use

straight (pathFromVectors [(0,0),(100,-100)])
  `linewidth` 4 `linecolor` red

circleNow lets try the next simplest example: drawing a circle. Again, the metapost code first

fill fullcircle scaled 100
withcolor 0.5green ;

draw fullcircle scaled 100
withpen pencircle scaled 3 withcolor 0.5blue ;

Now using the above flipped functions, we can draw this by (there is a minor difference in metapost and diagrams. Metapost draws a circle of  a given diameter while diagrams draws a circle of a given radius).

circle 50
  `linewidth` 3 `linecolor` lightblue `fillcolor` lightgreen

Ah, much better than the metapost version.

I need to figure out two more things to be able to use the diagrams package in my publications.

  • How to define the infix equivalents of metapost’s --, ---, .., and ... in Haskell.
  • Write a Metapost/ConTeXt backend to render text using TeX.

Here is the complete code

import Graphics.Rendering.Diagrams

linecolor  = flip lineColor
linewidth  = flip lineWidth
fillcolor  = flip fillColor

example1 =
straight (pathFromVectors [(0,0),(100,-100)]) `linewidth` 4 `linecolor` red

example2 =
circle 50 `linewidth` 3 `linecolor` lightblue `fillcolor` lightgreen

main = do
renderAs PNG "line.png" (Width 100) example1
renderAs PNG "circle.png" (Width 100) example2

Assertions in Haskell

Sometime back, I posted how to solve the Monty Hall problem with a million doors. Christophe Poucet wondered if I was making a mistake in my solution: maybe I was not ensuring that the host is not opening a door with a car. I did not think so, but then I have been wrong in the past.  So, I wanted to check my solution. First I thought of using QuickCheck to test if the host function was correct, but that would mean that I write a test property and then call it to check that everything was correct. And then make sure that there was no difference between the test and the main problem. But I am lazy, and wanted an easier solution. Enter the assert function from Control.Exception. I just made two changes in my previous program.

  1. I added a new import
    import Control.Exception (assert)
    
  2. I changed the randomVariable function to
    randomVariable :: (Door -> [Door] -> Door) -> MC Double
    randomVariable strategy = do
      u <- nature
      c <- contestant
      h <- host (options u c)
      let r = assert (all (\d -> open u d == Goat) h) $ reward u (strategy c h)
      return r
    

recompiled the code, and voila, no error messages. This means that my reasoning was correct. And now I can assert that my reasoning is correct. See GHC documentation and Haddock for more details on assertions.

Problem of Points using Monte Carlo

A Franciscan monk named Luca Paccioli wrote a book Summa de arithmetic, geometrica et proportionalitià in 1494, where is posed the now famous problem of points.

A and B are playing a fair game of balla. They agree to continue until one has won six rounds. The game actually stops when A has won five and B three. How should the stakes be divided.

This problem gave birth to modern probability theory. Read The Unfinished Game by Keith Devlin for a fascinating account of how Pascal and Fermat struggled to find a solution to this problem. In the book Devlin says

Interestingly, as far as we know, neither Pascal, Fermat, nor anyone else sought to resolve the issue empirically. If you were actually to play out the completion of the game many times— that is, imagine the game had been halted after eight tosses, with A ahead five to three, and then toss actual coins to complete the game—you would find that A wins roughly 7/8 of the time.

(I have reworded the above quote to match with the problem statement.)

Sounds like Devlin is asking us to verify the claim by Monte Carlo simulation. Lets do that.

import Control.Applicative ( (<$>) )
import Control.Monad.MC
import Text.Printf (printf)

Lets start by defining the game. We have two players

data Player = A | B deriving (Eq, Show)
players = [A,B]

Define a record to store the current state of the game.

data Game = Game { pointsA :: Int
                 , pointsB :: Int
                 , pointsTotal :: Int } deriving Show
game :: Game
game = Game 5 3 6

At any round, a player can win with equal probability. So,

play :: MC Player
play = sample 2 players

Lets create a sequence of tosses. Notice that the game must end after (pointsTotal - pointsA) + pointsTotal - pointsB) - 1 rounds.

tosses :: MC [Player]
tosses = sequence . replicate count $ play where
  count = (total - a) + (total - b) - 1
  total = pointsTotal game
  a     = pointsA     game
  b     = pointsB     game

We need to keep track of the current score of each player. I use a tuple to store the score, and use the following function to update the score.

updateScore :: (Int, Int) -> Player -> (Int, Int)
updateScore (a,b) A = (a+1,b)
updateScore (a,b) B = (a, b+1)

Next I define a function that keeps track of the score as the game progresses.

scores :: MC [(Int,Int)]
scores = scanl updateScore (a,b) <$> tosses where
  a = pointsA game
  b = pointsB game

Given a sequence of scores,  I can determine who won the game.

winner :: [(Int, Int)] -> Player
winner []         = error "Calculating winner of an empty sequence"
winner ((a,b) : xs) | a >= total  = A
                    | b >= total  = B
                    | otherwise   = winner xs
  where
      total = pointsTotal game

Now, we need to compute the probability that A wins the game. Since, I am using a Monte Carlo simulation, I need a random variable to keep track of the number of times player A has won. This can be done by simply assigning a reward of 1 when player A wins and a reward of 0 when player 2 wins. That way, the expected value of the reward will be equal to the probability that A wins.

reward :: Player -> Double
reward A = 1
reward B = 0

And finally, the Monte Carlo simulation.

main :: IO ()
main = let
  n     = 1000000
  seed  = 101
  stats = repeatMC n (reward <$> winner <$> scores) `evalMC` mt19937 seed
  in do
      printf "Probability that A wins : %f\n" (sampleMean stats)
      printf "99%% Confidence interval : (%f, %f)\n"
               `uncurry` (sampleCI 0.99 stats)

This gives

Probability that A wins : 0.8749600000000389
99% Confidence interval : (0.8741080072900288, 0.8758119927100491)

The actual solution of the problem is 7/8=0.875. So, the problem would have been resolved much earlier had the Renascence mathematician had access to Monte Carlo simulations 🙂

The tale of two tiling WMs

I love tinkering with my system. About a year ago, I got this crazy idea of trying different window managers (WM). Until then, I had mostly used GNOME and occasionally explored KDE. Before that I was a happy Windows XP users (that means, I did not know what I WM was!). After trying XFCE and a couple of *boxes, I came across tiling window managers. The first one that I tried was wmii (actually wmii-ruby), which I found to be very nifty. At that time, I was gung-ho about ruby, and was elated to use a WM based on ruby. After a while, I found wmii-ruby to be a bit slow. The reason the wmii was slow was ruby. The bare bones wmii was amazingly fast. So, I switched to it. But wmii is that it is very poorly documented and uses an arcane language to configure (plan9!!). It is a struggle to figure out how to change anything. There are ruby, phython, and perl ports, but then you have to deal with the idiosyncrasy of the  port.

These days I am gung-ho about Haskell. What luck then that Haskell has a tiling window manager — XMonad. Whenever I feel frustrated with wmii, I check out XMonad. It is developing at an amazing pace, is well documented (well, if you know how to read Haskell documentation) and comes with an awful lot of addons called xmonad-contrib. However, everytime I try XMonad, after a while I switch back to wmii.

After repeating this process for the n-th time, I decided to write down what is wrong with XMonad.  The major inconvenience is the tags and the key bindings. In wmii, you can have as many alpha-numeric tags as you want. In XMonad, you have nine numeic tags. With alphanumeric tags, I can name the tags based on what project I have on them, and I do not need to remember it. I look at the tag name and I know which project is there. With XMonad, things are a bit different. Tags 1 and 2 are easy (mail and browser), but tags after that are a mess. What happens when I start a project on tag 3, start another on 4, and another on 5. After a week, the first project is over. I remove everything from tag 3, and reuse it. After three weeks, I need to carry the state of all my tags in my head. It is agonizing. And usually that  is the time when I switch back to wmii. I wish XMonad would implement named tags correctly. Which means, I do not have to change my config file and recompile xmoand to change the names of my tags!

My second gripe with XMonad is that its window management model does not fit my work model (rather the work model that I have gotten used to while using wmii). Most of the time, I am either writing a paper in tex, or writing code in haskell. In both cases, I have the screen split vertically. When working on a paper, I have two xterms on the left: one with vim displaying the tex file, and the other for compiling the file. On the right side,  I have three-four pdfs open (for reference), with the column in stack mode, so that I can read the title of the pdf file. The situation is similar while coding in haskell. Vim with Haskell code on one side, ghci on the other. Another xterm for compiling the code. And a couple of browser windows for checking the documentation. So, I work in a two column mode, with two main windows: the editor and the typeset pdf, or the editor and the debugger. XMonad’s philosophy is having one main window.  That is really irritating to me. There are user packages that allows you to split the screen in two columns, where I can achieve something similar to stacked mode in each column, but the key bindings are horrible (I mean, I might as well use Emacs).

I love XMonad, especially the new prompt and integration with xmobar. And I really like hacking around in Haskell to configure my WM. But, XMonad comes in the way. And after fighting with it for a while, I realize that there is nothing really wrong with wmii. And I switch back to wmii. I hope XMonad will implement proper named tags and a true two column layout sometime in the future.