Don’t complicate your code

A friend sent me a link to a blog post that discusses solutions to the Josephus problem in Perl, Ruby, and Python. Ouch! Those implementations hurt.

The Josephus problem is a simple discrete math puzzle:

Flavius Josephus was a roman historian of Jewish origin. During the Jewish-Roman wars of the first century AD, he was in a cave with fellow soldiers, 40 men in all, surrounded by enemy Roman troops. They decided to commit suicide by standing in a ring and counting off each third man. Each man so designated was to commit suicide…Josephus, not wanting to die, managed to place himself in the position of the last survivor.

In the general version of the problem, there are n soldiers numbered from 1 to n and each k-th soldier will be eliminated. The count starts from the first soldier. What is the number of the last survivor?

Keep reading


Imitating Metapost using Haskell

Brent Yorgey has released a Haskell EDSL for creating diagrams. Here is my first impression of the library, comparing it with another DSL for creating diagrams — Metapost.

lineLets start with the simplest example: drawing a straight line between two points. First the metapost code

draw (0,0)--(100,100)
withpen pencircle scaled 4bp withcolor red;

Now the same code in Haskell (this needs to be wrapped around in a function and rendered to a png or pdf, but I’ll leave that part out).

lineColor red . lineWidth 4 .
  straight $ pathFromVectors [(0,0),(100,-100)]

I like the Metapost style better (since I have been using that for ages). One minor annoyance with diagrams is that positive values of y goes downwards rather than upwards. But the main trouble for me is prefix versus infix style. For diagrams, I find the infix style much easier to read. In this post, I will try to persuade Haskell to follow that style. The transforms in the diagram package have the signature Something -> Diagram -> Diagram, while I want Diagram -> Something -> Diagram. That is easy to fix

linecolor  = flip lineColor
linewidth  = flip lineWidth
fillcolor  = flip fillColor

Now we can use

straight (pathFromVectors [(0,0),(100,-100)])
  `linewidth` 4 `linecolor` red

circleNow lets try the next simplest example: drawing a circle. Again, the metapost code first

fill fullcircle scaled 100
withcolor 0.5green ;

draw fullcircle scaled 100
withpen pencircle scaled 3 withcolor 0.5blue ;

Now using the above flipped functions, we can draw this by (there is a minor difference in metapost and diagrams. Metapost draws a circle of  a given diameter while diagrams draws a circle of a given radius).

circle 50
  `linewidth` 3 `linecolor` lightblue `fillcolor` lightgreen

Ah, much better than the metapost version.

I need to figure out two more things to be able to use the diagrams package in my publications.

  • How to define the infix equivalents of metapost’s --, ---, .., and ... in Haskell.
  • Write a Metapost/ConTeXt backend to render text using TeX.

Here is the complete code

import Graphics.Rendering.Diagrams

linecolor  = flip lineColor
linewidth  = flip lineWidth
fillcolor  = flip fillColor

example1 =
straight (pathFromVectors [(0,0),(100,-100)]) `linewidth` 4 `linecolor` red

example2 =
circle 50 `linewidth` 3 `linecolor` lightblue `fillcolor` lightgreen

main = do
renderAs PNG "line.png" (Width 100) example1
renderAs PNG "circle.png" (Width 100) example2

Assertions in Haskell

Sometime back, I posted how to solve the Monty Hall problem with a million doors. Christophe Poucet wondered if I was making a mistake in my solution: maybe I was not ensuring that the host is not opening a door with a car. I did not think so, but then I have been wrong in the past.  So, I wanted to check my solution. First I thought of using QuickCheck to test if the host function was correct, but that would mean that I write a test property and then call it to check that everything was correct. And then make sure that there was no difference between the test and the main problem. But I am lazy, and wanted an easier solution. Enter the assert function from Control.Exception. I just made two changes in my previous program.

  1. I added a new import
    import Control.Exception (assert)
  2. I changed the randomVariable function to
    randomVariable :: (Door -> [Door] -> Door) -> MC Double
    randomVariable strategy = do
      u <- nature
      c <- contestant
      h <- host (options u c)
      let r = assert (all (\d -> open u d == Goat) h) $ reward u (strategy c h)
      return r

recompiled the code, and voila, no error messages. This means that my reasoning was correct. And now I can assert that my reasoning is correct. See GHC documentation and Haddock for more details on assertions.

Problem of Points using Monte Carlo

A Franciscan monk named Luca Paccioli wrote a book Summa de arithmetic, geometrica et proportionalitià in 1494, where is posed the now famous problem of points.

A and B are playing a fair game of balla. They agree to continue until one has won six rounds. The game actually stops when A has won five and B three. How should the stakes be divided.

This problem gave birth to modern probability theory. Read The Unfinished Game by Keith Devlin for a fascinating account of how Pascal and Fermat struggled to find a solution to this problem. In the book Devlin says

Interestingly, as far as we know, neither Pascal, Fermat, nor anyone else sought to resolve the issue empirically. If you were actually to play out the completion of the game many times— that is, imagine the game had been halted after eight tosses, with A ahead five to three, and then toss actual coins to complete the game—you would find that A wins roughly 7/8 of the time.

(I have reworded the above quote to match with the problem statement.)

Sounds like Devlin is asking us to verify the claim by Monte Carlo simulation. Lets do that.

import Control.Applicative ( (<$>) )
import Control.Monad.MC
import Text.Printf (printf)

Lets start by defining the game. We have two players

data Player = A | B deriving (Eq, Show)
players = [A,B]

Define a record to store the current state of the game.

data Game = Game { pointsA :: Int
                 , pointsB :: Int
                 , pointsTotal :: Int } deriving Show
game :: Game
game = Game 5 3 6

At any round, a player can win with equal probability. So,

play :: MC Player
play = sample 2 players

Lets create a sequence of tosses. Notice that the game must end after (pointsTotal - pointsA) + pointsTotal - pointsB) - 1 rounds.

tosses :: MC [Player]
tosses = sequence . replicate count $ play where
  count = (total - a) + (total - b) - 1
  total = pointsTotal game
  a     = pointsA     game
  b     = pointsB     game

We need to keep track of the current score of each player. I use a tuple to store the score, and use the following function to update the score.

updateScore :: (Int, Int) -> Player -> (Int, Int)
updateScore (a,b) A = (a+1,b)
updateScore (a,b) B = (a, b+1)

Next I define a function that keeps track of the score as the game progresses.

scores :: MC [(Int,Int)]
scores = scanl updateScore (a,b) <$> tosses where
  a = pointsA game
  b = pointsB game

Given a sequence of scores,  I can determine who won the game.

winner :: [(Int, Int)] -> Player
winner []         = error "Calculating winner of an empty sequence"
winner ((a,b) : xs) | a >= total  = A
                    | b >= total  = B
                    | otherwise   = winner xs
      total = pointsTotal game

Now, we need to compute the probability that A wins the game. Since, I am using a Monte Carlo simulation, I need a random variable to keep track of the number of times player A has won. This can be done by simply assigning a reward of 1 when player A wins and a reward of 0 when player 2 wins. That way, the expected value of the reward will be equal to the probability that A wins.

reward :: Player -> Double
reward A = 1
reward B = 0

And finally, the Monte Carlo simulation.

main :: IO ()
main = let
  n     = 1000000
  seed  = 101
  stats = repeatMC n (reward <$> winner <$> scores) `evalMC` mt19937 seed
  in do
      printf "Probability that A wins : %f\n" (sampleMean stats)
      printf "99%% Confidence interval : (%f, %f)\n"
               `uncurry` (sampleCI 0.99 stats)

This gives

Probability that A wins : 0.8749600000000389
99% Confidence interval : (0.8741080072900288, 0.8758119927100491)

The actual solution of the problem is 7/8=0.875. So, the problem would have been resolved much earlier had the Renascence mathematician had access to Monte Carlo simulations 🙂

The tale of two tiling WMs

I love tinkering with my system. About a year ago, I got this crazy idea of trying different window managers (WM). Until then, I had mostly used GNOME and occasionally explored KDE. Before that I was a happy Windows XP users (that means, I did not know what I WM was!). After trying XFCE and a couple of *boxes, I came across tiling window managers. The first one that I tried was wmii (actually wmii-ruby), which I found to be very nifty. At that time, I was gung-ho about ruby, and was elated to use a WM based on ruby. After a while, I found wmii-ruby to be a bit slow. The reason the wmii was slow was ruby. The bare bones wmii was amazingly fast. So, I switched to it. But wmii is that it is very poorly documented and uses an arcane language to configure (plan9!!). It is a struggle to figure out how to change anything. There are ruby, phython, and perl ports, but then you have to deal with the idiosyncrasy of the  port.

These days I am gung-ho about Haskell. What luck then that Haskell has a tiling window manager — XMonad. Whenever I feel frustrated with wmii, I check out XMonad. It is developing at an amazing pace, is well documented (well, if you know how to read Haskell documentation) and comes with an awful lot of addons called xmonad-contrib. However, everytime I try XMonad, after a while I switch back to wmii.

After repeating this process for the n-th time, I decided to write down what is wrong with XMonad.  The major inconvenience is the tags and the key bindings. In wmii, you can have as many alpha-numeric tags as you want. In XMonad, you have nine numeic tags. With alphanumeric tags, I can name the tags based on what project I have on them, and I do not need to remember it. I look at the tag name and I know which project is there. With XMonad, things are a bit different. Tags 1 and 2 are easy (mail and browser), but tags after that are a mess. What happens when I start a project on tag 3, start another on 4, and another on 5. After a week, the first project is over. I remove everything from tag 3, and reuse it. After three weeks, I need to carry the state of all my tags in my head. It is agonizing. And usually that  is the time when I switch back to wmii. I wish XMonad would implement named tags correctly. Which means, I do not have to change my config file and recompile xmoand to change the names of my tags!

My second gripe with XMonad is that its window management model does not fit my work model (rather the work model that I have gotten used to while using wmii). Most of the time, I am either writing a paper in tex, or writing code in haskell. In both cases, I have the screen split vertically. When working on a paper, I have two xterms on the left: one with vim displaying the tex file, and the other for compiling the file. On the right side,  I have three-four pdfs open (for reference), with the column in stack mode, so that I can read the title of the pdf file. The situation is similar while coding in haskell. Vim with Haskell code on one side, ghci on the other. Another xterm for compiling the code. And a couple of browser windows for checking the documentation. So, I work in a two column mode, with two main windows: the editor and the typeset pdf, or the editor and the debugger. XMonad’s philosophy is having one main window.  That is really irritating to me. There are user packages that allows you to split the screen in two columns, where I can achieve something similar to stacked mode in each column, but the key bindings are horrible (I mean, I might as well use Emacs).

I love XMonad, especially the new prompt and integration with xmobar. And I really like hacking around in Haskell to configure my WM. But, XMonad comes in the way. And after fighting with it for a while, I realize that there is nothing really wrong with wmii. And I switch back to wmii. I hope XMonad will implement proper named tags and a true two column layout sometime in the future.

Haskell on Arch Linux

I was a happy user of Ubuntu until a few months ago when GHC 6.10 was released. Ubuntu still had GHC 6.8.2. Posts on haskell-cafe told that Arch Linux had support for latest GHC, and support for over 700 Haskell libraries. Since I had started using Haskell for a few projects, I decided ot check Arch linux out. After a long weekend of struggling through the installation and learning more about linux than I intended to know, I had all my usual programs up and running on Arch. Once I discovered AUR, I had the latest versions of almost everything. Or so I thought.

The first glitch was that Arch does not install the profiling version of the libraries. I asked around on #haskell, and I was told, that I should use cabal to install the profiling version of libraries. I did not want to have conflicting versions of libraries, so I uninstalled a few libs from the Arch repositories, and manually installed them using cabal.

And now I figured out that Haskell libraries are not so up-to-date on AUR. For example, new versions of monte-carlo and gsl-random were released more than a week ago, but AUR still has the old versions. So, I decided to install them using cabal.

After having used Arch for about two months, I am not convinced that it has a good support for Haskell. The only good thing is that it comes with a version of cabal that can upgrade itself. (From what I remember, the ubuntu version of cabal could not upgrade  itself, and I had to install cabal by hand). But once cabal is updated, it is much more convenient to use cabal than the Arch repositories. Arch is a great distribution and I  like the rolling release philosophy, but its support for Haskell is mediocre at best.

Most programming languages come with their own tools for installing libraries (CPAN for perl, gem for ruby, cabal for Haskell, and now even texlive has a tool for updating packages). I do not think that it is worth the effort to port packages to distribution repositories. Simply using the language’s tool for installing libraries is much simpler and much easier to maintain.

Monty Hall with a billion doors

In order to provide intuition behind the solution of the Monty Hall problem, Antonio Cangiano says:

If there were a billion doors, you picked one, and then Monty proceeded to open up all the remaining doors but one, we’d have a situation where it would be extremely unlikely that you picked the right door at the beginning, while it would be extremely likely that the remaining door was the one that was concealing the car.

Now, what happens when there are more than three doors. In this post, I will modify the solution of my last post to work for any number of doors. It requires a little change to the program.

 import Data.List (delete, (\\) )
 import Text.Printf (printf)
 import Control.Monad
 import Control.Monad.MC

This time, instead of defining Door by listing all alternatives, I define it as an instance of Bounded class.

 data Door   = Door Int deriving (Show, Eq)
 numDoors    = 3

 instance Bounded Door where
   minBound = Door 1
   maxBound = Door numDoors

 instance Enum Door where
   toEnum      n     = Door n
   fromEnum (Door n) =    n

 doors = [minBound .. maxBound] :: [Door]

Object are the same as before.

 data Object = Car | Goat deriving (Show, Eq)
 type Universe = [Object]

There is one car; all other doors have goats. This can still be generated using the shuffle function.

 nature :: MC Universe
 nature = shuffle numDoors (Car: repeat Goat)

The contestant open one door at random.

 contestant :: MC Door
 contestant = sample numDoors doors

The options available to the host remain the same as before.

 options :: Universe -> Door -> [Door]
 options universe door = filter (\d -> Goat == open universe d) remainder
   where remainder = delete door doors

We need a generic function to see what is behind a closed door.

 open :: Universe -> Door -> Object
 open universe (Door n) = universe !! (n-1)

The host opens all but one door. We can do this by replacing the sample function with sampleSubset function.

 host :: [Door]-> MC [Door]
 host o = sampleSubset (numDoors-2) l o where l = length o

The strategies are now whether to stick or switch based on the door that the contestant chose, and the doors that the host opened.

 strategyStick :: Door -> [Door] -> Door
 strategyStick d1 d2 = d1

 strategySwitch :: Door -> [Door] -> Door
 strategySwitch d1 d2 = head (doors \\ (d1:d2))

The reward function is same as before.

 reward :: Universe -> Door -> Double
 reward u d = if Car == open u d then 1.0 else 0.0

And so is the generation of the ranom experiment.

 randomVariable :: (Door -> [Door] -> Door) -> MC Double
 randomVariable strategy = do
   u <- nature
   c <- contestant
   h <- host (options u c)
   let r = reward u (strategy c h)
   return r

And we see the result.

 main = let
   n     = 100000
   seed  = 42
   stats = repeatMC n (randomVariable strategyStick) `evalMC` mt19937 seed
   in do
      putStrLn $ printf "\nnumDoors                : %d" numDoors
      putStrLn $ printf "Mean                    : %f" (sampleMean stats)
      putStrLn $ printf "99%% Confidence interval : (%f, %f)"
                `uncurry` (sampleCI 0.99 stats)

Now lets see the results. For

numDoors                : 3
Mean                    : 0.3334800000000033
99% Confidence interval : (0.3296397390134338, 0.33732026098657275)

numDoors                : 30
Mean                    : 0.03466999999999957
99% Confidence interval : (0.03317983597888913, 0.03616016402111001)

numDoors                : 300
Mean                    : 0.0038399999999999784
99% Confidence interval : (0.0033362101507491866, 0.00434378984925077)

Well, I think that you get the idea. I am not going to run this thing for a billion doors.