Author Archive

Icelandic government math

A good friend sent this to me just now: The Icelandic government seems to use their own rules for arithmetic. From the Letter of Intent they sent to IMF:

The collapse of the banking system has left us with considerable external financing needs. We estimate this need to be about $25 billion during the period from now and until the end of 2010. Of this, about $19 billion are composed by arrears on obligations of the three intervened private banks as well as financing earmarked for payments in relation to foreign deposits, leaving a cash financing need of $5 billion.

In case you missed it - it says that 25 - 19 = 5.

(note: this post makes no judgement of what amounts are needed, just pointing out the math, and yes - I realize it is probably a rounding error)

Generating legal subsets of a dependency graph

This post is literate Haskell

Update: Almost immediately after posting this, I started receiving helpful comments that perhaps introducing multithreading is an overkill here. While I do like the notion of asynchronous message passing via MVars, the comments do have a strong point. Instead of changing the post, and thereby doing the commenters a disservice by invalidating their comments, I just put a non-threaded version of this on gist.github. Indeed, it is simpler than what is found on this page and even a little bit faster :)

I lurk on several mailing lists, one of them being the MochiKit list. One thing recently discussed there is a tool to let users of MochiKit create their own customized version of it. This entails a GUI that allows a user to select which module he or she needs, and the webapp takes care of including all the dependencies as well.

This led me to a fun exercise. Say we have a dependency graph, which by nature must by a directed acyclic graph (DAG), otherwise you could have circular dependencies. Now, if you have N modules with no interdependencies, then there are 2^N possible ways of choosing a subset. However, if you restrict yourself only to those subsets that work, i.e. those where all dependencies of included modules are also included, then that puts a limit on this number.

For example, if you have two modules Base and Sub, and the latter depends on the former, then you would not allow the set containing just Sub. In mathematical terms, you only allow closed subsets of modules, where closed means:

A subset S of modules is closed if and only if the following holds:

  • If M is a module in S, and M depends on module P, then P is also in S

MochiKit contains around 15 modules and their dependency graph looks like this:

MochiKit dependencies diagram

I wanted to know if it was feasible to pre-generate all legal combinations of modules, and the main factor of that is exactly how many are there. Clearly there are fewer than the roughly 16.000 which there would be if there were no dependencies. Generating all dependency-closed subsets seemed like a very classical graph theoretical problem so someone must have written an algorithm for it. I asked my friend and professor, Magnús Már, for some terms to put into Google. Instead he gave me a hint on how to solve this.

The hint is this: To every closed subset of modules, there is a distinct anti-chain of modules that represents that set. An anti-chain of modules is a collection of modules such that no two depend on each other, neither directly nor indirectly.

For example, say we want the modules Logging, DOM, Style, Selector and Visual. That means we have to take all the modules in the green area below. The anti-chain for this set of modules are the blue modules.

Antichain visualization

You can see how the set of the blue modules is the smallest set of modules needed to pull in all the modules that we want.

Since there is a one-to-one correspondence between anti-chains and legal subsets, then we need only find the anti-chains and we can generate the subsets.

First some preliminaries,

> module Main where
> 
> import Control.Concurrent (forkIO)
> import Control.Concurrent.MVar
> import Data.Maybe (fromJust)
> import Data.List (nub, intersect)

We then set up our data as a simple list of pairs, where the first element is the module name and the second is a list of its dependencies. This list must be ordered in topological order, meaning that a module may not appear before any of its dependencies. This is easy to do by hand by looking at the above picture.

> dependencies = [
>  ("Base",[]),
>  ("DateTime",["Base"]),
>  ("Format",["Base"]),
>  ("Iter",["Base"]),
>  ("Async",["Base"]),
>  ("DOM",["Base"]),
>  ("Style",["Base"]),
>  ("Color",["DOM", "Style"]),
>  ("Logging",["Base"]),
>  ("LoggingPane",["Logging"]),
>  ("Selector",["DOM", "Style"]),
>  ("Signal",["DOM", "Style"]),
>  ("Visual",["Color"]),
>  ("DragAndDrop",["Iter","Signal","Visual"]),
>  ("Sortable",["DragAndDrop"])
>  ]
> 
> -- For convenience, derive the list of module names 
> modules = map fst dependencies

The first thing we need to do is to generate all anti-chains. We will do this in a dedicated thread and feed the anti-chains as we find them, one by one, on an MVar. Note that the MVar holds a Maybe value, this is so that we can indicate the end by sending Nothing.

The algorithm is simple backtracking through recursion, we start with the empty set which is always an anti-chain. At each step, we do the following.

  1. We check if the candidate element is dependent on any element in the anti-chain already. If it is not, we can add it and recurse.

  2. Since the anti-chain is still valid without the element, we also recurse on the unchanged anti-chain but discard the candidate element.

The recursion bottoms up when there are no more candidate elements to consider. Note that the topological order of candidates is crucial here.

The helper function accepts determines if a candidate is allowed in a chain. This is simply done by closing the candidate (see later) and seeing if there is an overlap with the current anti-chain.

> antichains :: MVar (Maybe [String]) -> IO ()
> antichains channel = antichains' [] modules >> putMVar channel Nothing
>     where
>       antichains' :: [String] -> [String] -> IO ()
>       antichains' ac [] = putMVar channel (Just ac) >> return ()
>       antichains' ac (candidate:rest) =
>           do if ac `accepts` candidate
>                 then antichains' (candidate:ac) rest
>                 else return ()
>              antichains' ac rest
>       chain `accepts` candidate = null $ intersect (closure [candidate]) chain

Above we mentioned closing a module, or more precisely a set of modules. This means to take a set of modules and adding all modules needed to make it closed with respect to dependencies. This is used both in the accepts helper above but also to map the anti-chains to what we ultimately want, the closed sets.

> closure :: [String] -> [String]
> closure ms = 
>     let missing = nub $ filter (not . flip elem ms) 
>                   $ concatMap (fromJust . flip lookup dependencies) ms
>     in
>       if null missing
>          then ms
>          else closure (ms ++ missing)

The anti-chain generator above feeds the anti-chains to an MVar. The following IO action is run in a second thread and it eats off the anti-chains from the MVar, closes them up and prints. We use a second MVar to notify the main thread when we are done.

> printer :: MVar () -> MVar (Maybe [String]) -> IO ()
> printer stop in_ = 
>     do v <- takeMVar in_
>        case v of
>          Just xs -> putStrLn $ show $ closure xs
>          Nothing -> putMVar stop ()
>        printer stop in_

Finally, to tie it all together, the main action creates the MVars for communication and sparks separate threads for the anti-chain generation and for the printing.

> main :: IO ()
> main = do chan <- newEmptyMVar
>           stop <- newEmptyMVar
>           forkIO $ printer stop chan
>           forkIO $ antichains chan
>           takeMVar stop
>           return ()

That’s it! The result: For the dependency graph of MochiKit above, there are only 817 legal combinations of modules. Actually - 816 if you discount the empty one :)

I’ll leave it up to someone else to answer the original question about feasibility of generating and storing that number of files.

Two handy Ubiquity commands

This blog has been very quiet, I guess our contributors are pretty busy with school work. Or at least they were. :) Since I’m pretty busy with mine, here are two ultra short tips.

Ubiquity is an ultra-handy (and geek friendly) tool for Firefox that allows for text commands. Today I created two Ubiquity commands that I find rather handy - especially for redditors :)

Automatic Markdown links

I write this blog in the Markdown markup format, which is the same format that is used for reddit comments (which I have been known to make occasionally). When writing blog posts, I often find that I mention a word or a phrase that is naturally turned into a link. This means I have to open a new tab, Google the word or phrase - open the relevant page and copy the URL into my post or comment, making sure it is in the correct Markdown format for links. Now, Ubiquity is very apt at solving these problems, so I wrote a command that allows me to do one of two things:

  1. I write a phrase, say “Reykjavik University”. I then select that phrase, invoke ubiquity (in my case it is Option+Space) and type “mdlink” and hit enter. This performs a Google search on the phrase, picks the first result and uses its URL to make the phrase a proper Markdown link, replacing the selected text: [Reykjavik University](http://www.ru.is/)

  2. Instead of typing the phrase and selecting it, I can also just invoke Ubuiquity and type “mdlink Reykjavik University” directly in the prompt - with the same result.

The command can be found and installed from Gist.github. Notice the (very cool) feature of Gist that it recognizes Ubiquity commands and inserts the appropriate markup to make FF offer you to install it.

Follow this link to see and/or install the “Markdown link” command

jQuery injection

When browsing, I often find the need to do it programmatically, executing Javascript in the context of a 3rd party webpage. For this it is very handy to have something like jQuery on hand and thus I’ve used a bookmarklet to inject jQuery into any web-page. That makes it very easy to do fancy stuff (like down-voting every comment by a reddit user) through the Firebug JS console.

I converted the bookmarklet linked above into a simple Ubiquity command, allowing for quicker access to the “Inject jQuery” operation. You can get the bookmarklet on its gist.github entry:

Follow this link to see and/or install the “Inject jQuery” command

Parsing annotated postfix operators with Haskell

I want to show you a neat trick method when constructing parsers with Haskell and Parsec.

While writing a parser for CCS, a process description language (don’t worry too much about it), I found I needed to parse constructs with a suffix operator. To provide a little context, the relevant grammar is this.

P ::= 0 | a.P | ... | P[f] | P \ S

where … indicates I’ve omitted some irrelevant stuff. Here, P is a process - the entity CCS is meant to describe. a is a so-called action label, just think of it as a generic label. a.P stands for a process that can perform an a action and then become the process specified after the dot. 0 (zero) stands for a special process that cannot do anything, the deadlocked process.

f is a mapping from such labels to labels, called a relabelling and is commonly specified with a list of pairs b/a,d/c,... which indicates that a maps to b, c to d etc. S is a set of such labels, represented in the normal manner with curly braces {a,b,c,...}. The construct P \ S is called a restriction, informally it means P cannot perform any of the actions in S.

In summary,

  • a.a.0 is a process that can do two a actions and then stop
  • P [b/a] is whatever P is, except any a performed by P will be changed to a b
  • P \ {a,b} is whatever P is, except a or b actions are not allowed

Now, [f] and \ S can be considered postfix unary operators on processes, with some additional structure, the f and the S. Since they are postfix there is no need to define their relative precedence, their order is enough. I.e.

a.0 [b/a] \ {b} [d/c,e/d] \ {e,g}

is unambiguous. Our Haskell ADT to represent the abstract syntax tree looks like this,

type Action = String
type Relabeling = Map.Map Action Action
type Restriction = Set.Set Action
 
data Process
    = Null
    | Prefix Action Process
    | ...
    | Relabel Process Relabeling
    | Restrict Process Restriction
    deriving (Eq, Show, Ord)

We want the example above to be parsed into this value (indented for clarity),

Restrict (
    Relabel (
        Restrict (
            Relabel (
                Prefix "a" Null
            ) (Map.fromList [("a","b")])
        ) (Set.fromList ["b"])
    ) (Map.fromList [("c","d"),("d","e")]
) (Set.fromList ["e", "g"])

My initial attempt at writing a parser for these constructs was ugly and complex. I was trying in one rule to parse a bunch of relabellings and restrictions and construct the appropriate value from the process that had already been parsed, cluttering the rules for other constructs with this logic. I thus went to the best documentation of Haskell, the #haskellirc channel, and asked for opinions. Christophe Poucet (vincenz) showed me the light and pointed out that a parser could well return a function,

April 13th, 2008
15:24 < vincenz> Arnar: what about...
15:24 < vincenz> process_adapter :== relabeling | restriction
15:24 < vincenz> and have those return the type
15:24 < vincenz> Process -> Process

And therein lies the trick. Instead of constructing the correct value, I simply parse a sequence of adapters, each of which can be a relabeling or a restriction, in each step returning the function that wraps a process in the relevant constructors.

relabeling :: Parser Relabeling
relabeling = do rs <- brackets $ commaSep1 rename
                return $ Map.fromList rs
                where
                  rename :: Parser (Action,Action)
                  rename = do s1 <- lexeme action
                              symbol "/"
                              s2 <- lexeme action
                              return (s2,s1)
 
restriction :: Parser Restriction
restriction = do symbol "\\"
                 ss <- braces $ commaSep1 action
                 return $ Set.fromList ss
 
process_adapter :: Parser (Process -> Process)
process_adapter = do restr <- restriction
                     return (\p -> Restrict p restr)
                  <|>
                  do rel <- relabeling
                     return (\p -> Relabel p rel)

Very simple, very clean. Now, parsing a process followed by a sequence of such adapters is trivial. We then feed the process through the adapter functions in order, i.e.

adaptedProcess :: Parser Process
adaptedProcess = do p <- simpleProcess
                    adapters <- many process_adapter
                    return $ foldr (flip (.)) id adapters p

I especially like that last line, foldr (flip (.)) id adapters p. Remember adapters has the type [Process -> Process], i.e. a list of adapters. If this list is [f1,f2,...,fN], then

foldr (flip (.)) id adapters

will give us the composition of these in the correct order, namely

\x -> fN ( ... (f2 (f1 (id x))) ...)

All that’s left to do is to apply this on the original process and out pops a correctly wrapped one.

I think this is very neat and it showcases the power of first-class functions. If you don’t see the beauty here, feel absolutely free to write me off as eccentric ;o)

You can find the complete code and a PDF writeup of the CCS project right here.

Playing with Haskell’s lazy lists

I love Haskell. I love many things about it, but one of its main attractions for me is that it is lazy. I guess I like that because I’m lazy myself.

Lazyness in Haskell means that we get infinite lists (or streams) for free - i.e. they don’t need any special handling over finite lists. Granted, other languages have infinite lists, but in Haskell any regular list has the natural potential of being infinite so we don’t need any specific syntax or constructors for them.

So, when I read about a Rutgers graduate student named Eric Rowland that recently discovered a simple prime-number generating formula, I decided to try it out in Haskell. Turns out it is extremely simple.

The generating formula is based on the sequence a(n) that you get by setting a(1) = 7 and then for n ≥ 2:

a(n) = a(n-1) + gcd(n,a(n-1))

In Haskell, running in interactive mode (GHCi), we define this by

Prelude> let a' = (1,7) : (map ( \(n,x) -> (n+1, x + (gcd (n+1) x))) a')
Prelude> take 10 a'
[(1,7),(2,8),(3,9),(4,10),(5,15),(6,18),(7,19),(8,20),(9,21),(10,22)]
Prelude> let a = map snd a'
Prelude> take 23 a
[7,8,9,10,15,18,19,20,21,22,33,36,37,38,39,40,41,42,43,44,45,46,69]

Note that each element of a’ is a tuple of the index n and the value a(n). By taking differences between successive values of this sequence, you always get either the number 1 or a prime number (proving this is Rowland’s contribution).

Prelude> let d = zipWith (-) (tail a) a
Prelude> take 23 d
[1,1,1,5,3,1,1,1,1,11,3,1,1,1,1,1,1,1,1,1,1,23,3]

So far, so good. Now in one step we remove the 1-s by using filter and get rid of duplicates by using nub (which is defined in the List module).

Prelude> :m +List
Prelude List> let primes = nub $ filter (/= 1) d
Prelude List> take 23 primes
[5,3,11,23,47,101,7,13,233,467,941,1889,3779,7559,15131,53,30323,60647,121403,242807,19,37,17]

Neat! The list primes is of course infinite and we could pluck the values off it as long as we like, and Haskell’s laziness will make sure values are only computed on demand. Rowland’s paper contains the details of the math.

Adobe Lightroom 2 beta

I know this is not programming, but one has to go out once in a while. Of course, after going outside, once I’m home I’m right back on the computer (my wife will be happy to tell you this).

I went on the Golden circle excursion with the ICALP attendees today. This is the “standard” tour given to most visitors and involves visiting the area around Geysir, Gullfoss and Þingvellir. After I got back I started selecting photos to put on the ICALP photo-webpage. I’ve been trying out Adobe’s new beta of Lightroom and I must say I like it. It has all the same features as the previous version, but feels considerably snappier.

One new feature is selective retouching, i.e. you paint a mask and can make some exposure/color changes to selected areas. Of course, performance goes out the window as soon as you do that but it is quite powerful and saves you from starting Photoshop.

While at Gullfoss, I took the following picture of Bláfell, a mountain north-east of there (if I’m incorrect and you know better, please correct me in the comments). The picture straight from the camera wasn’t that great…

… so I played with it for about a minute in Lightroom…

Edited version

While it is still not great, I think it is quite good less than 60 seconds of work. The correctios I made were exposure correction, removing the dirt from the lens with the spot tool and then I made the mountain and sky a little more pronounced by upping the saturation and using the clarity slider (which I think is just the high-pass sharpening technique).

So.. try it out yourself if you like photography.

ICALP’08, pre-conference workshops and first day

So ICALP 2008 is on. I’m fortunate enough that it is held at my school (Reykjavik University) and that I was offered the chance to sit in on the talks in return for helping a little with the organization and administration.

The activities started on Friday with the DYNAMO 2008 PhD school. I didn’t attend many of the lectures, but the ones I did were very interesting. To name one, Elias Koutsoupias gave informative talks on Game Theory and its use for example in online auctions.

Yesterday, Sunday, there were several workshops. I mostly attended SOS’08, a workshop on Structural Operational Semantics. Dale Miller, a hard-core logician, gave an excellent invited talk on formalizing SOS specifications in logic and although I didn’t understand half of it I somehow found it very interesting. His slides are available at his website.

Today was the first day of the conference proper and started off with an invited talk by Google’s S. Muthukrishnan, where he explained how Google runs auctions to decide what ads are displayed along with search results. Attendance to the talk was excellent, and while I had my hands full with other duties, I can safely say it was interesting on many levels.

After lunch I mostly attended track B, on logic, semantics and theory of programming. I’m afraid I can’t say much of merit about the talks as I had to do my best to understand them properly myself. I can say that this first day of ICALP went very smoothly in my opinion. As a bonus, the weather turned out to be very nice so in the afternoon and during the welcome reception, people could stand outside and enjoy the fresh air.

For photos from ICALP, check out our Picasa web album (updated at the end of each day). For more authoritative information on some of the talks, check out MohammadReza’s guest posts on Luca’s blog

Parsing JSON with Haskell

Note: This post is literate Haskell, you can copy the whole text to a .lhs file and compile it with GHC.

A friend of mine was showing me a very simple and elegant parser for JSON, written in Scala. As an excercise, I decided to write one in Haskell, using the Parsec parser combinator library. Here’s a quick walkthrough.

First we need some imports,

> import qualified Data.Map as Map
> import Control.Monad (liftM)
> import Text.ParserCombinators.Parsec
> import qualified Text.ParserCombinators.Parsec.Token as T
> import Text.ParserCombinators.Parsec.Language

Now, to represent a JSON value, we use an algebraic data type. For simplicity, we’ll only worry about integers and not floating point numbers. The definition of the type is simple, to be concise we use single letter constructor names (not a good idea if you want to publish a proper library).

> data JSON = S String 
>           | I Integer 
>           | B Bool 
>           | L [JSON] 
>           | O (Map.Map String JSON)
>           deriving Show

Now, to make our life somewhat easier, we’ll make use of the facilities in the ParsecToken module. This module provides us with ways to create functions for generic lexers, which deal correctly with whitespace (and comments) in a language. The first step is to create a language definition, in our case it only specifies how JSON comments are delimited. From the definition we create a token parser which we call lexer.

> ldef = emptyDef { commentStart = "/*"
>                 , commentEnd = "*/"
>                 , commentLine = "//"
>                 }
> lexer = T.makeTokenParser ldef

We can now pass lexer to the convenience functions of ParsecToken. As is common with such parsers, we rebind their names in our module to spare us referring to lexer every time we use them.

> symbol        = T.symbol        lexer
> stringLiteral = T.stringLiteral lexer
> integer       = T.integer       lexer
> squares       = T.squares       lexer
> braces        = T.braces        lexer
> commaSep      = T.commaSep      lexer

And now the parser itself. The basic element is a JSON value. From now on I’ll include the type signatures to make things easier to follow, but they are of course not necessary.

> value :: Parser JSON
> value = liftM S stringLiteral
>     <|> liftM I integer
>     <|> liftM B boolean
>     <|> liftM L list
>     <|> liftM O object

A value is simply a choice between each of the basic types, a string, an integer, a boolean, a list of values or an object (which is simply a mapping from strings to values). We defer each to its own parser, and simply lift the value constructors to the parser monad. We get parsing of strings and integers completely for free from the ParsecToken module, but the other three we need to handle ourselves. The boolean case is simple,

> boolean :: Parser Bool
> boolean = do symbol "true"
>              return True
>       <|> do symbol "false"
>              return False

Surprisingly, the list is even simpler, due to the helpers from ParsecToken.

> list :: Parser [JSON]
> list = squares $ commaSep value

Parsing objects is also fairly simple, basically just like list except that instead of parsing a comma seperated list of values, we parse a comma seperated list of key/value pairs. Such a pair is handled by the nested keyValue parser.

> object :: Parser (Map.Map String JSON)
> object = liftM Map.fromList $ braces $ commaSep keyValue
>          where
>              keyValue = do key <- stringLiteral
>                            symbol ":"
>                            val <- value
>                            return (key, val)

And that’s it! To test this easily, we finally add a main action that simply parses a JSON value from standard input.

> main = getContents >>= parseTest value

Testing the parser

A simple test, including a comment in the middle:

arnarb-macbook:tmp arnarb$ runhaskell json.hs 
[1,"two",false,{"one":1, /* a comment here */ "two":2, "list":[]}]
< here I pressed ctrl-d to insert an EOF, rest is output >
L [I 1,S "two",B False,O (fromList [("list",L []),(”one”,I 1),(”two”,I 2)])]

As an more complex test, we run it on the first example from json.org (newlines and indent added to the output for clarity).

arnarb-macbook:tmp arnarb$ runhaskell json.hs 
{
  "glossary": {
    "title": "example glossary",
    "GlossDiv": {
      "title": "S",
      "GlossList": {
        "GlossEntry": {
          "ID": "SGML",
          "SortAs": "SGML",
          "GlossTerm": "Standard Generalized Markup Language",
          "Acronym": "SGML",
          "Abbrev": "ISO 8879:1986",
          "GlossDef": {
            "para": "A meta-markup language, used to create markup languages such as DocBook.",
            "GlossSeeAlso": ["GML", "XML"]
          },
          “GlossSee”: “markup”
        }
      }
    }
  }
}
< here I pressed ctrl-d to insert an EOF, rest is output >
O (fromList [
    ("glossary",O (fromList [
        ("GlossDiv",O (fromList [
            ("GlossList",O (fromList [
                ("GlossEntry",O (fromList [
                    ("Abbrev",S "ISO 8879:1986"),
                    ("Acronym",S "SGML"),
                    ("GlossDef",O (fromList [
                        ("GlossSeeAlso",L [S "GML",S "XML"]),
                        (”para”,S “A meta-markup language, used to create markup languages such as DocBook.”)
                    ])),
                    (”GlossSee”,S “markup”),
                    (”GlossTerm”,S “Standard Generalized Markup Language”),
                    (”ID”,S “SGML”),
                    (”SortAs”,S “SGML”)
                ]))
            ])),
            (”title”,S “S”)
        ])),
        (”title”,S “example glossary”)
    ]))
])

Finally, we should note that many people have done this before :)

Edits:

  • Added link to Scala version.
  • Small revisisons, thanks to kpreid on #haskell for pointing out braces and squares.
  • Stop lugging lexer around, thanks to Paczesiowa on reddit for useful suggestions.

Visualizing gameplaying algorithms

For the impatient: Scroll down for the video.

The 2008 world championship in General Game Playing is currently on. AI bots from various universities around the world compete in playing games they’ve never seen before. The bots act as web servers to which a game master connects and uploads a game description, a first-order logic description of game rules in a special language called GDL. The games can be single- or multi-player, but must be perfect information games.

I am proud to say that the world champions of last year (2007) are my friends and colleagues at RU, Hilmar Finnsson and Yngvi Björnsson. Their team, which now includes Gylfi Þór Guðmundsson, and their bot CADIAPlayer, are quite busy these days defending the title.

Update July 18th: The RU team succeeded in defending the title and CADIA Player is the 2008 General Game Playing champion.

CADIAPlayer relies on an implementation of the UCT algorithm. I’m not qualified to explain it here but if you are interested, Hilmars M.Sc. thesis has all the gory details. This weekend, me, Hilmar and Stefán Freyr did a little experiment with a very interesting piece of software called Ubigraph. Ubigraph is basically just a application that fires up a 3D window and starts listening for XML-RPC requests over HTTP. Through its XML-RPC API, client applications can create graph nodes, edges and alter their styles and Ubigraph lays them out in 3D with a powerful multi-core algorithm and displays it in the window. The user can then zoom and move around the graph. Updates to the graph are rendered in real-time.

To see the CADIAPlayer in action, Hilmar added commands to the code to log to a file whenever it created new nodes (representing game states), when nodes were deleted and when the player made an actual move. Me and Stefán cobbled together a small Python script, which Hilmar then tweaked some more (source below), that eats this log and sends the relevant commands to Ubigraph at a fixed rate. The result is a beautiful display of how the algorithm works and shows how it explores areas of the search graph that look interesting before deciding on a move.

Of course, words are useless here so here’s a video of it in action, playing a game of Othello. The video is quite long (18 minutes) so I recommend skipping around in it, the action starts after 13 seconds and the middle and the ending are also interesting. At about 3:50 you start to see loops forming, meaning two different lines of play reach the same state, and the tree actually becomes a graph.

The nodes represent states and edges represent a legal game move. As old nodes are removed from memory, you can see how their child nodes become disconnected, they snap away from the tree (chasing all child nodes for deletion is too expensive) but they are cleaned up in subsequent deletion phases. The pink node represents the current state of the game and the growth in the tree shows where CADIAPlayer is exploring and evaluating possible moves. The numbered nodes represent the actual playout so far and the numbers are the players confidence values, the closer they are to 100, the more optimistic it is about winning the game.

The log file is quite simple, the first character of a line determines the entry type. For the main purpose we have two types, "C 123 456" means create a new node with id 123 as child (successor) of node 456, and "D 123" which simply means delete node 123. There are other commands, but those are the main ones. The Python program that feeds this to Ubigraph looks like this,

    # -*- coding: utf-8 -*-
    import sys
    import time
    import xmlrpclib
    import itertools
    import gradient
 
    NPS = 100  # nodes per second
    server_url = 'http://127.0.0.1:20738/RPC2'
 
    playercolors = itertools.cycle(['#ff00ff', '#ffd9ff'])
 
    weight_colors = [ (0.0,   (1.0, 0.0, 0.0)), # red
                      (50.0,  (0.0, 0.0, 1.0)), # yellow
                      (100.0, (0.0, 1.0, 0.0)), # green
                    ]
 
    def weight2color(w):
        c = gradient.piecewise_interpolate(weight_colors, w)
        return '#' + ''.join(map(lambda x: ("%02x" % int(255*x)), c))
 
    if __name__ == '__main__':
 
        server = xmlrpclib.Server(server_url)
        G = server.ubigraph
 
        G.clear()
 
        current = None
        interval = 1.0/NPS
 
        seen = {}
        path = set()
 
        log = file(sys.argv[1], 'r')
        for line in log:
            line = line.strip()
            if not line:
                continue
            params = line.lower().split()
            cmd = params[0]
 
            if 'c' == cmd:
                new, parent = map(hash, params[1:])
                if new in seen:
                    if parent not in seen[new]:
                        G.new_edge(parent, new)
                        seen[new].append(parent)
                else:
                    seen[new] = [parent]
                    G.new_vertex_w_id(new)
                    G.set_vertex_attribute(new, 'color', weight2color(50))
                    if params[2] == '0':
                        current = new
                        G.set_vertex_attribute(new, 'color', playercolors.next())
                        path.add(new)
                    else:
                        e = G.new_edge(parent, new)
                        G.set_edge_attribute(e, "oriented", "true")
                    time.sleep(interval)
 
            elif 'd' == cmd:
                if hash(params[1]) not in path:
                    # Uncomment this next line and comment second next one if you want
                    # deleted nodes to be coloured gray instead of being nuked.
                    # G.set_vertex_attribute(hash(params[1]), 'color', '#666666')
                    G.remove_vertex(hash(params[1]))
                    time.sleep(interval)
 
            elif 'u' == cmd:
                n, w = hash(params[1]), float(params[2])
                G.set_vertex_attribute(n, 'color', weight2color(w))
                G.set_vertex_attribute(n, 'label', "%0.2f" % w)
 
            elif 'n' == cmd:
                new = hash(params[1])
                G.set_vertex_attribute(new, 'color', playercolors.next())
                current = new
                path.add(new)

Oh, and that gradient module is one that often comes in handy, here it is for sake of completeness.

# -*- encoding: utf-8 -*-
 
def interpolate(c1, c2, t):
    """
    Interpolates between two colours c1 and c2 by the variable t.
    t should be in the range [0.0, 1.0]. interpolate(c1,c2,0) will return
    c1 and interpolate(c1,c2,1) will return c2. interpolate(c1,c2,0.5) will
    return the color in the middle.
 
    The colours c1 and c2 should simply be number triplets (for example RGB values).
 
    Example:
 
    >>> c1 = (1,2,3)
    >>> c2 = (11,12,13)
    >>> t = 0.5
    >>> interpolate(c1,c2,t)
    (6.0, 7.0, 8.0)
    """
 
    if t <= 0: return c1
    if t >= 1: return c2
 
    ipol = lambda a, b, t: (1.0-t)*a + float(t)*b
 
    return (ipol(c1[0], c2[0], t),
            ipol(c1[1], c2[1], t),
            ipol(c1[2], c2[2], t))
 
    # equivalent:
    # from itertools import starmap
    # return tuple(starmap(ipol, zip(c1,c2,(t,t,t))))
 
def piecewise_interpolate(stops, t):
    """
    Gives a piecewise linear interpolation of values in stops by t.
    stops should be a list (or an iterable) of paris of the form (value,colour)
    where colour is a numerical triplet. t should be a real number.
 
    The function returns an interpolation of the colours in stops determined
    by the parameter t. If t falls out of the range of specified values, the
    appropriate end colours will be returned
 
    Examples:
    >>> stops = [ (0.0, (1.0, 0.0, 0.0)), # red
    ...           (3.3, (1.0, 1.0, 0.0)), # yellow
    ...           (6.7, (0.0, 1.0, 0.0)), # green
    ...           (10.0, (0.0, 0.0, 1.0)),# blue
    ...         ]
    >>> piecewise_interpolate(stops, 0.0)
    (1.0, 0.0, 0.0)
    >>> piecewise_interpolate(stops, 1.0)
    (1.0, 0.30303030303030304, 0.0)
    >>> piecewise_interpolate(stops, 5)
    (0.5, 1.0, 0.0)
    >>> piecewise_interpolate(stops, 10.0)
    (0.0, 0.0, 1.0)
    >>> piecewise_interpolate(stops, 11.0)
    (0.0, 0.0, 1.0)
    >>> piecewise_interpolate(stops, -1.0)
    (1.0, 0.0, 0.0)
    """
    stops = stops[:] # copy for sorting
    stops.sort()
    values, colours = zip(*stops)
    if t <= values[0]: return colours[0]    # below range
    if t >= values[-1]: return colours[-1]  # above range
 
    # find which "bin" t falls in
    bin = 0
    while t > values[bin]: bin += 1
 
    # normalize t in the interval (0,1]
    normalized_t = (t - values[bin-1]) / (values[bin] - values[bin-1])
 
    return interpolate(colours[bin-1], colours[bin], normalized_t)
 
def _test():
    import doctest
    doctest.testmod()
 
if __name__ == "__main__":
    _test()

What’s in a phone number?

One of the most tedious and tricky things a web developer does is designing good form validation. It poses interesting challenges in UI design, but also in not being a hurdle for interested users. One of the most common thing I experience when signing up for stuff is this.

And the reason… is this

(not my actual phone number) The problem here is that Iceland doesn’t use area or city codes - there is one uniform set of seven-digit phone numbers across the whole country. However, Apple doesn’t make room for that, it assumes all phone numbers must have an area code. In this case it is easy enough to get past this, I just move the first digit to the area code.

Another common error is in validating e-mail addresses. Many websites assume an email address has to be xxx@yyy.zzz where the x, y and z must be alphabetical or numerical or perhaps a dash (-) and a period(.). However, e-mail addresses in general are not constrained to that. A useful feature of GMail is that you can add a plus (+) after your username, followed by any string you want. For example user+aha@gmail.com gets routed to user@gmail.com, but the owner of that mailbox can filter on the +aha part if she wants. This is very handy when registering for something, you can indicate where you are registering in the address and set up filters to route messages to that address directly to a tag (such as the trash). The problem is that many (most?) websites don’t allow you to register an address containing the +.

Perhaps the silliest example was the one that this guy fell victim to, a checkbox that said “uncheck this box if you don’t want us to email you”, but then the form wouldn’t validate unless the box was checked. Ingenious.