Visualizing gameplaying algorithms
For the impatient: Scroll down for the video.
The 2008 world championship in General Game Playing is currently on. AI bots from various universities around the world compete in playing games they’ve never seen before. The bots act as web servers to which a game master connects and uploads a game description, a first-order logic description of game rules in a special language called GDL. The games can be single- or multi-player, but must be perfect information games.
I am proud to say that the world champions of last year (2007) are my friends and colleagues at RU, Hilmar Finnsson and Yngvi Björnsson. Their team, which now includes Gylfi Þór Guðmundsson, and their bot CADIAPlayer, are quite busy these days defending the title.
Update July 18th: The RU team succeeded in defending the title and CADIA Player is the 2008 General Game Playing champion.
CADIAPlayer relies on an implementation of the UCT algorithm. I’m not qualified to explain it here but if you are interested, Hilmars M.Sc. thesis has all the gory details. This weekend, me, Hilmar and Stefán Freyr did a little experiment with a very interesting piece of software called Ubigraph. Ubigraph is basically just a application that fires up a 3D window and starts listening for XML-RPC requests over HTTP. Through its XML-RPC API, client applications can create graph nodes, edges and alter their styles and Ubigraph lays them out in 3D with a powerful multi-core algorithm and displays it in the window. The user can then zoom and move around the graph. Updates to the graph are rendered in real-time.
To see the CADIAPlayer in action, Hilmar added commands to the code to log to a file whenever it created new nodes (representing game states), when nodes were deleted and when the player made an actual move. Me and Stefán cobbled together a small Python script, which Hilmar then tweaked some more (source below), that eats this log and sends the relevant commands to Ubigraph at a fixed rate. The result is a beautiful display of how the algorithm works and shows how it explores areas of the search graph that look interesting before deciding on a move.
Of course, words are useless here so here’s a video of it in action, playing a game of Othello. The video is quite long (18 minutes) so I recommend skipping around in it, the action starts after 13 seconds and the middle and the ending are also interesting. At about 3:50 you start to see loops forming, meaning two different lines of play reach the same state, and the tree actually becomes a graph.
The nodes represent states and edges represent a legal game move. As old nodes are removed from memory, you can see how their child nodes become disconnected, they snap away from the tree (chasing all child nodes for deletion is too expensive) but they are cleaned up in subsequent deletion phases. The pink node represents the current state of the game and the growth in the tree shows where CADIAPlayer is exploring and evaluating possible moves. The numbered nodes represent the actual playout so far and the numbers are the players confidence values, the closer they are to 100, the more optimistic it is about winning the game.
The log file is quite simple, the first character of a line determines the entry type. For the main purpose we
have two types, "C 123 456" means create a new node with id 123 as child (successor) of node 456, and
"D 123" which simply means delete node 123. There are other commands, but those are the main ones.
The Python program that feeds this to Ubigraph looks like this,
# -*- coding: utf-8 -*- import sys import time import xmlrpclib import itertools import gradient NPS = 100 # nodes per second server_url = 'http://127.0.0.1:20738/RPC2' playercolors = itertools.cycle(['#ff00ff', '#ffd9ff']) weight_colors = [ (0.0, (1.0, 0.0, 0.0)), # red (50.0, (0.0, 0.0, 1.0)), # yellow (100.0, (0.0, 1.0, 0.0)), # green ] def weight2color(w): c = gradient.piecewise_interpolate(weight_colors, w) return '#' + ''.join(map(lambda x: ("%02x" % int(255*x)), c)) if __name__ == '__main__': server = xmlrpclib.Server(server_url) G = server.ubigraph G.clear() current = None interval = 1.0/NPS seen = {} path = set() log = file(sys.argv[1], 'r') for line in log: line = line.strip() if not line: continue params = line.lower().split() cmd = params[0] if 'c' == cmd: new, parent = map(hash, params[1:]) if new in seen: if parent not in seen[new]: G.new_edge(parent, new) seen[new].append(parent) else: seen[new] = [parent] G.new_vertex_w_id(new) G.set_vertex_attribute(new, 'color', weight2color(50)) if params[2] == '0': current = new G.set_vertex_attribute(new, 'color', playercolors.next()) path.add(new) else: e = G.new_edge(parent, new) G.set_edge_attribute(e, "oriented", "true") time.sleep(interval) elif 'd' == cmd: if hash(params[1]) not in path: # Uncomment this next line and comment second next one if you want # deleted nodes to be coloured gray instead of being nuked. # G.set_vertex_attribute(hash(params[1]), 'color', '#666666') G.remove_vertex(hash(params[1])) time.sleep(interval) elif 'u' == cmd: n, w = hash(params[1]), float(params[2]) G.set_vertex_attribute(n, 'color', weight2color(w)) G.set_vertex_attribute(n, 'label', "%0.2f" % w) elif 'n' == cmd: new = hash(params[1]) G.set_vertex_attribute(new, 'color', playercolors.next()) current = new path.add(new)
Oh, and that gradient module is one that often comes in handy, here it is for sake of completeness.
# -*- encoding: utf-8 -*- def interpolate(c1, c2, t): """ Interpolates between two colours c1 and c2 by the variable t. t should be in the range [0.0, 1.0]. interpolate(c1,c2,0) will return c1 and interpolate(c1,c2,1) will return c2. interpolate(c1,c2,0.5) will return the color in the middle. The colours c1 and c2 should simply be number triplets (for example RGB values). Example: >>> c1 = (1,2,3) >>> c2 = (11,12,13) >>> t = 0.5 >>> interpolate(c1,c2,t) (6.0, 7.0, 8.0) """ if t <= 0: return c1 if t >= 1: return c2 ipol = lambda a, b, t: (1.0-t)*a + float(t)*b return (ipol(c1[0], c2[0], t), ipol(c1[1], c2[1], t), ipol(c1[2], c2[2], t)) # equivalent: # from itertools import starmap # return tuple(starmap(ipol, zip(c1,c2,(t,t,t)))) def piecewise_interpolate(stops, t): """ Gives a piecewise linear interpolation of values in stops by t. stops should be a list (or an iterable) of paris of the form (value,colour) where colour is a numerical triplet. t should be a real number. The function returns an interpolation of the colours in stops determined by the parameter t. If t falls out of the range of specified values, the appropriate end colours will be returned Examples: >>> stops = [ (0.0, (1.0, 0.0, 0.0)), # red ... (3.3, (1.0, 1.0, 0.0)), # yellow ... (6.7, (0.0, 1.0, 0.0)), # green ... (10.0, (0.0, 0.0, 1.0)),# blue ... ] >>> piecewise_interpolate(stops, 0.0) (1.0, 0.0, 0.0) >>> piecewise_interpolate(stops, 1.0) (1.0, 0.30303030303030304, 0.0) >>> piecewise_interpolate(stops, 5) (0.5, 1.0, 0.0) >>> piecewise_interpolate(stops, 10.0) (0.0, 0.0, 1.0) >>> piecewise_interpolate(stops, 11.0) (0.0, 0.0, 1.0) >>> piecewise_interpolate(stops, -1.0) (1.0, 0.0, 0.0) """ stops = stops[:] # copy for sorting stops.sort() values, colours = zip(*stops) if t <= values[0]: return colours[0] # below range if t >= values[-1]: return colours[-1] # above range # find which "bin" t falls in bin = 0 while t > values[bin]: bin += 1 # normalize t in the interval (0,1] normalized_t = (t - values[bin-1]) / (values[bin] - values[bin-1]) return interpolate(colours[bin-1], colours[bin], normalized_t) def _test(): import doctest doctest.testmod() if __name__ == "__main__": _test()
Árni Hermann:
The visualization of the algorithm is just flat out awesome and crazy. Mad props to you guys.
11 June 2008, 4:17 pmweb design company:
This would be an awesome way to debug time/space leaks in lazily evaluated languages like Haskell. Each node represents a thunk or value, hover the mouse over one to see what it is and/or where it came from. Look for long chains to find your leaks!
11 June 2008, 5:27 pmIngo:
Really nice. I was looking for a similar way of representing the game tree while playing in Palamedes IDE. If you dont mind, I would like to integrate this.
12 June 2008, 8:13 amArnar:
@Ingo: Sure, I don’t mind and I don’t think the others would.
12 June 2008, 9:08 amBjarki:
Hellacool!
12 June 2008, 12:06 pmHrafn:
Hat tip for an awesome visualization! Nice work.
12 June 2008, 12:18 pmrascunho » Blog Archive » links for 2008-06-12:
[...] Visualizing gameplaying algorithms (tags: http://www.hvergi.net 2008 mes5 dia12 at_tecp games gameai visualization blog_post GDL) [...]
12 June 2008, 8:57 pmtrees:
[...] [...]
13 June 2008, 8:59 amGame AI Roundup Week #24 2008: 13 Stories, 4 Quotes, 1 Video, Job — AiGameDev.com:
[...] Visualizing gameplaying algorithms [...]
16 June 2008, 12:17 amTodd V:
Very nifty! Sorry about the autozoom jitter. I have a fix that should make it into version 0.2.5.
best Todd
17 June 2008, 4:40 pmVisualizing a game-playing algorithm:
[...] Arnar Birgisson, Hilmar Finnsson and Stefán Freyr from Reykjavik University put together this visualization of the general game-playing system CADIA-player. For more information, see their post. [...]
17 June 2008, 8:23 pmMoran Atias:
Hi…Thanks for the nice read, keep up the interesting posts about marcia moran..what a nice Thursday .
19 June 2008, 2:04 pmFreyr:
Spectacular. Very informative and revealing way to visualize the game tree in a temporal sense. It might be interesting to try to represent feature characteristics of individual nodes using the node color. Would have to extend the log then I guess. But very cool indeed as is.
24 June 2008, 2:00 pmChris:
101 question
How did you save the visualisation as a movie? I can only save single frames out as tiffs, manually from the mac version.
30 June 2008, 11:19 pmArnar:
Hey Chris,
I used SnapzXPro for the screengrab-video.
8 July 2008, 12:33 amliger:
hey…
everything dynamic and very positively…
2 January 2009, 1:02 am