Posts tagged ‘Visualization’

Visualizing gameplaying algorithms

For the impatient: Scroll down for the video.

The 2008 world championship in General Game Playing is currently on. AI bots from various universities around the world compete in playing games they’ve never seen before. The bots act as web servers to which a game master connects and uploads a game description, a first-order logic description of game rules in a special language called GDL. The games can be single- or multi-player, but must be perfect information games.

I am proud to say that the world champions of last year (2007) are my friends and colleagues at RU, Hilmar Finnsson and Yngvi Björnsson. Their team, which now includes Gylfi Þór Guðmundsson, and their bot CADIAPlayer, are quite busy these days defending the title.

Update July 18th: The RU team succeeded in defending the title and CADIA Player is the 2008 General Game Playing champion.

CADIAPlayer relies on an implementation of the UCT algorithm. I’m not qualified to explain it here but if you are interested, Hilmars M.Sc. thesis has all the gory details. This weekend, me, Hilmar and Stefán Freyr did a little experiment with a very interesting piece of software called Ubigraph. Ubigraph is basically just a application that fires up a 3D window and starts listening for XML-RPC requests over HTTP. Through its XML-RPC API, client applications can create graph nodes, edges and alter their styles and Ubigraph lays them out in 3D with a powerful multi-core algorithm and displays it in the window. The user can then zoom and move around the graph. Updates to the graph are rendered in real-time.

To see the CADIAPlayer in action, Hilmar added commands to the code to log to a file whenever it created new nodes (representing game states), when nodes were deleted and when the player made an actual move. Me and Stefán cobbled together a small Python script, which Hilmar then tweaked some more (source below), that eats this log and sends the relevant commands to Ubigraph at a fixed rate. The result is a beautiful display of how the algorithm works and shows how it explores areas of the search graph that look interesting before deciding on a move.

Of course, words are useless here so here’s a video of it in action, playing a game of Othello. The video is quite long (18 minutes) so I recommend skipping around in it, the action starts after 13 seconds and the middle and the ending are also interesting. At about 3:50 you start to see loops forming, meaning two different lines of play reach the same state, and the tree actually becomes a graph.

The nodes represent states and edges represent a legal game move. As old nodes are removed from memory, you can see how their child nodes become disconnected, they snap away from the tree (chasing all child nodes for deletion is too expensive) but they are cleaned up in subsequent deletion phases. The pink node represents the current state of the game and the growth in the tree shows where CADIAPlayer is exploring and evaluating possible moves. The numbered nodes represent the actual playout so far and the numbers are the players confidence values, the closer they are to 100, the more optimistic it is about winning the game.

The log file is quite simple, the first character of a line determines the entry type. For the main purpose we have two types, "C 123 456" means create a new node with id 123 as child (successor) of node 456, and "D 123" which simply means delete node 123. There are other commands, but those are the main ones. The Python program that feeds this to Ubigraph looks like this,

    # -*- coding: utf-8 -*-
    import sys
    import time
    import xmlrpclib
    import itertools
    import gradient
 
    NPS = 100  # nodes per second
    server_url = 'http://127.0.0.1:20738/RPC2'
 
    playercolors = itertools.cycle(['#ff00ff', '#ffd9ff'])
 
    weight_colors = [ (0.0,   (1.0, 0.0, 0.0)), # red
                      (50.0,  (0.0, 0.0, 1.0)), # yellow
                      (100.0, (0.0, 1.0, 0.0)), # green
                    ]
 
    def weight2color(w):
        c = gradient.piecewise_interpolate(weight_colors, w)
        return '#' + ''.join(map(lambda x: ("%02x" % int(255*x)), c))
 
    if __name__ == '__main__':
 
        server = xmlrpclib.Server(server_url)
        G = server.ubigraph
 
        G.clear()
 
        current = None
        interval = 1.0/NPS
 
        seen = {}
        path = set()
 
        log = file(sys.argv[1], 'r')
        for line in log:
            line = line.strip()
            if not line:
                continue
            params = line.lower().split()
            cmd = params[0]
 
            if 'c' == cmd:
                new, parent = map(hash, params[1:])
                if new in seen:
                    if parent not in seen[new]:
                        G.new_edge(parent, new)
                        seen[new].append(parent)
                else:
                    seen[new] = [parent]
                    G.new_vertex_w_id(new)
                    G.set_vertex_attribute(new, 'color', weight2color(50))
                    if params[2] == '0':
                        current = new
                        G.set_vertex_attribute(new, 'color', playercolors.next())
                        path.add(new)
                    else:
                        e = G.new_edge(parent, new)
                        G.set_edge_attribute(e, "oriented", "true")
                    time.sleep(interval)
 
            elif 'd' == cmd:
                if hash(params[1]) not in path:
                    # Uncomment this next line and comment second next one if you want
                    # deleted nodes to be coloured gray instead of being nuked.
                    # G.set_vertex_attribute(hash(params[1]), 'color', '#666666')
                    G.remove_vertex(hash(params[1]))
                    time.sleep(interval)
 
            elif 'u' == cmd:
                n, w = hash(params[1]), float(params[2])
                G.set_vertex_attribute(n, 'color', weight2color(w))
                G.set_vertex_attribute(n, 'label', "%0.2f" % w)
 
            elif 'n' == cmd:
                new = hash(params[1])
                G.set_vertex_attribute(new, 'color', playercolors.next())
                current = new
                path.add(new)

Oh, and that gradient module is one that often comes in handy, here it is for sake of completeness.

# -*- encoding: utf-8 -*-
 
def interpolate(c1, c2, t):
    """
    Interpolates between two colours c1 and c2 by the variable t.
    t should be in the range [0.0, 1.0]. interpolate(c1,c2,0) will return
    c1 and interpolate(c1,c2,1) will return c2. interpolate(c1,c2,0.5) will
    return the color in the middle.
 
    The colours c1 and c2 should simply be number triplets (for example RGB values).
 
    Example:
 
    >>> c1 = (1,2,3)
    >>> c2 = (11,12,13)
    >>> t = 0.5
    >>> interpolate(c1,c2,t)
    (6.0, 7.0, 8.0)
    """
 
    if t <= 0: return c1
    if t >= 1: return c2
 
    ipol = lambda a, b, t: (1.0-t)*a + float(t)*b
 
    return (ipol(c1[0], c2[0], t),
            ipol(c1[1], c2[1], t),
            ipol(c1[2], c2[2], t))
 
    # equivalent:
    # from itertools import starmap
    # return tuple(starmap(ipol, zip(c1,c2,(t,t,t))))
 
def piecewise_interpolate(stops, t):
    """
    Gives a piecewise linear interpolation of values in stops by t.
    stops should be a list (or an iterable) of paris of the form (value,colour)
    where colour is a numerical triplet. t should be a real number.
 
    The function returns an interpolation of the colours in stops determined
    by the parameter t. If t falls out of the range of specified values, the
    appropriate end colours will be returned
 
    Examples:
    >>> stops = [ (0.0, (1.0, 0.0, 0.0)), # red
    ...           (3.3, (1.0, 1.0, 0.0)), # yellow
    ...           (6.7, (0.0, 1.0, 0.0)), # green
    ...           (10.0, (0.0, 0.0, 1.0)),# blue
    ...         ]
    >>> piecewise_interpolate(stops, 0.0)
    (1.0, 0.0, 0.0)
    >>> piecewise_interpolate(stops, 1.0)
    (1.0, 0.30303030303030304, 0.0)
    >>> piecewise_interpolate(stops, 5)
    (0.5, 1.0, 0.0)
    >>> piecewise_interpolate(stops, 10.0)
    (0.0, 0.0, 1.0)
    >>> piecewise_interpolate(stops, 11.0)
    (0.0, 0.0, 1.0)
    >>> piecewise_interpolate(stops, -1.0)
    (1.0, 0.0, 0.0)
    """
    stops = stops[:] # copy for sorting
    stops.sort()
    values, colours = zip(*stops)
    if t <= values[0]: return colours[0]    # below range
    if t >= values[-1]: return colours[-1]  # above range
 
    # find which "bin" t falls in
    bin = 0
    while t > values[bin]: bin += 1
 
    # normalize t in the interval (0,1]
    normalized_t = (t - values[bin-1]) / (values[bin] - values[bin-1])
 
    return interpolate(colours[bin-1], colours[bin], normalized_t)
 
def _test():
    import doctest
    doctest.testmod()
 
if __name__ == "__main__":
    _test()