The new version of the Haskell package monte-carlo by Patrick Perry adds a
repeatMC function, which is useful for some Monte Carlo experiments. In the course of implementing it, Patrick also added a
Summary data type for keeping track of basic statistical properties of Monte Carlo experiments. This data type is useful on its own, though the package offers no dedicated interface for using it independently. In this post I show how it can be used outside the MC monad.
import Control.Monad.MC
import Data.List (foldl')
import Text.Printf (printf)
Suppose we have some data in a list of Doubles that we want to analyze.
x = [1..10] :: [Double]
First we need to create a summary from the data. There is no pre-defined function for it, but we can easily define one on our own.
stats :: [Double] -> Summary
stats = foldl' update summary
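To make the fold above less of a black box, here is a minimal sketch of how a running summary of this kind could be built by hand. It is not Patrick's implementation: the RunningSummary type, its field names, and the functions emptySummary, addSample, summarize and variance are all hypothetical names I made up for illustration, and the update rule is Welford's online algorithm, which I am assuming is a reasonable stand-in for whatever the library does internally.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for the library's Summary type; the names here
-- are illustrative and are not taken from the monte-carlo package.
data RunningSummary = RunningSummary
  { rsCount :: !Int     -- number of observations seen so far
  , rsMean  :: !Double  -- running mean
  , rsM2    :: !Double  -- running sum of squared deviations from the mean
  , rsMin   :: !Double  -- smallest observation seen so far
  , rsMax   :: !Double  -- largest observation seen so far
  }

-- Summary of no data; the infinities act as identities for min and max.
emptySummary :: RunningSummary
emptySummary = RunningSummary 0 0 0 (1 / 0) (-1 / 0)

-- Fold one observation into the summary (Welford's update step).
addSample :: RunningSummary -> Double -> RunningSummary
addSample (RunningSummary n m m2 lo hi) v =
  RunningSummary n' m' m2' (min lo v) (max hi v)
  where
    n'    = n + 1
    delta = v - m
    m'    = m + delta / fromIntegral n'
    m2'   = m2 + delta * (v - m')

-- Same shape as stats above: a strict left fold over the data.
summarize :: [Double] -> RunningSummary
summarize = foldl' addSample emptySummary

-- Sample variance with an n - 1 divisor; only meaningful for two or
-- more observations.
variance :: RunningSummary -> Double
variance r = rsM2 r / fromIntegral (rsCount r - 1)
```

The point of the sketch is that the whole data set is consumed in one pass and constant memory, which is what makes a summary like this convenient for accumulating results one sample at a time.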
Now we can use this function to get the basic statistics of the data.
s = stats x
There are a few functions to extract the statistics from the summary. See the
Control.Monad.MC documentation for the complete list and descriptions. The most useful ones for me are those that give basic properties of the data:
sampleSize :: Summary -> Int
sampleMin :: Summary -> Double
sampleMax :: Summary -> Double
and those that give basic statistics of the data.
sampleMean :: Summary -> Double
sampleVar :: Summary -> Double
sampleSD :: Summary -> Double
We can print these statistics as follows:
main = do
    putStrLn $ printf "Data : %s" (show x)
    putStrLn $ printf "Length : %d" (sampleSize s)
    putStrLn $ printf "Max : %f" (sampleMax s)
    putStrLn $ printf "Min : %f" (sampleMin s)
    putStrLn $ printf "Mean : %f" (sampleMean s)
    putStrLn $ printf "Var : %f" (sampleVar s)
    putStrLn $ printf "SD : %f" (sampleSD s)
Data : [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
Length : 10
Max : 10.0
Min : 1.0
Mean : 5.5
Var : 9.166666666666666
SD : 3.0276503540974917
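As a sanity check, the printed variance can be reproduced with a plain two-pass computation that does not touch the library at all. The names below (xs, mean, sumSq, varUnbiased, varPopulation) are my own and purely illustrative. For this data the sum of squared deviations from the mean is 82.5, so dividing by n - 1 = 9 gives roughly 9.1667, which matches the output above, whereas dividing by n = 10 would give 8.25. The output therefore appears to be the sample variance with an n - 1 divisor.

```haskell
-- The same data as in the post.
xs :: [Double]
xs = [1 .. 10]

-- First pass: the arithmetic mean.
mean :: Double
mean = sum xs / fromIntegral (length xs)

-- Second pass: sum of squared deviations from the mean.
sumSq :: Double
sumSq = sum [(v - mean) ^ (2 :: Int) | v <- xs]

-- Variance with an n - 1 divisor.
varUnbiased :: Double
varUnbiased = sumSq / fromIntegral (length xs - 1)

-- Variance with an n divisor, shown only for comparison.
varPopulation :: Double
varPopulation = sumSq / fromIntegral (length xs)
```

Taking the square root of varUnbiased should likewise reproduce the SD line of the output.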
Sometime next week I will show how to use this data type in Monte Carlo simulations.