Parsing annotated postfix operators with Haskell
I want to show you a neat trick method when constructing parsers with Haskell and Parsec.
While writing a parser for CCS, a process description language (don’t worry too much about it), I found I needed to parse constructs with a suffix operator. To provide a little context, the relevant grammar is this.
P ::= 0 | a.P | ... | P[f] | P \ S
where … indicates I’ve omitted some irrelevant stuff. Here, P is a process - the entity CCS is meant to describe. a is a so-called action label, just think of it as a generic label. a.P stands for a process that can perform an a action and then become the process specified after the dot. 0 (zero) stands for a special process that cannot do anything, the deadlocked process.
f is a mapping from such labels to labels, called a relabelling and is commonly specified with a list of pairs b/a,d/c,... which indicates that a maps to b, c to d etc. S is a set of such labels, represented in the normal manner with curly braces {a,b,c,...}. The construct P \ S is called a restriction, informally it means P cannot perform any of the actions in S.
In summary,
a.a.0is a process that can do twoaactions and then stopP [b/a]is whateverPis, except anyaperformed byPwill be changed to abP \ {a,b}is whateverPis, exceptaorbactions are not allowed
Now, [f] and \ S can be considered postfix unary operators on processes, with some additional structure, the f and the S. Since they are postfix there is no need to define their relative precedence, their order is enough. I.e.
a.0 [b/a] \ {b} [d/c,e/d] \ {e,g}
is unambiguous. Our Haskell ADT to represent the abstract syntax tree looks like this,
type Action = String type Relabeling = Map.Map Action Action type Restriction = Set.Set Action data Process = Null | Prefix Action Process | ... | Relabel Process Relabeling | Restrict Process Restriction deriving (Eq, Show, Ord)
We want the example above to be parsed into this value (indented for clarity),
Restrict ( Relabel ( Restrict ( Relabel ( Prefix "a" Null ) (Map.fromList [("a","b")]) ) (Set.fromList ["b"]) ) (Map.fromList [("c","d"),("d","e")] ) (Set.fromList ["e", "g"])
My initial attempt at writing a parser for these constructs was ugly and complex. I was trying in one rule to parse a bunch of relabellings and restrictions and construct the appropriate value from the process that had already been parsed, cluttering the rules for other constructs with this logic. I thus went to the best documentation of Haskell, the #haskellirc channel, and asked for opinions. Christophe Poucet (vincenz) showed me the light and pointed out that a parser could well return a function,
April 13th, 2008 15:24 < vincenz> Arnar: what about... 15:24 < vincenz> process_adapter :== relabeling | restriction 15:24 < vincenz> and have those return the type 15:24 < vincenz> Process -> Process
And therein lies the trick. Instead of constructing the correct value, I simply parse a sequence of adapters, each of which can be a relabeling or a restriction, in each step returning the function that wraps a process in the relevant constructors.
relabeling :: Parser Relabeling relabeling = do rs <- brackets $ commaSep1 rename return $ Map.fromList rs where rename :: Parser (Action,Action) rename = do s1 <- lexeme action symbol "/" s2 <- lexeme action return (s2,s1) restriction :: Parser Restriction restriction = do symbol "\\" ss <- braces $ commaSep1 action return $ Set.fromList ss process_adapter :: Parser (Process -> Process) process_adapter = do restr <- restriction return (\p -> Restrict p restr) <|> do rel <- relabeling return (\p -> Relabel p rel)
Very simple, very clean. Now, parsing a process followed by a sequence of such adapters is trivial. We then feed the process through the adapter functions in order, i.e.
adaptedProcess :: Parser Process adaptedProcess = do p <- simpleProcess adapters <- many process_adapter return $ foldr (flip (.)) id adapters p
I especially like that last line, foldr (flip (.)) id adapters p. Remember adapters has the type [Process -> Process], i.e. a list of adapters. If this list is [f1,f2,...,fN], then
foldr (flip (.)) id adapters
will give us the composition of these in the correct order, namely
\x -> fN ( ... (f2 (f1 (id x))) ...)
All that’s left to do is to apply this on the original process and out pops a correctly wrapped one.
I think this is very neat and it showcases the power of first-class functions. If you don’t see the beauty here, feel absolutely free to write me off as eccentric ;o)
You can find the complete code and a PDF writeup of the CCS project right here.
Amicia Pyn:
Why isn’t there a name for
24 July 2008, 8:12 amfoldr (flip (.)) id? Or is there?Arnar:
Hey Amicia,
Again, with some helps from the folks at #haskell: There is no name for it, but this turns out to be
mconcatfor a monoid calledEndo a. Conceptually that is the monoid fora -> a, i.e. it has unitidand operator(.). Well, almost…mconcat [Endo f, Endo g] `appEndo` xgives you
f (g x)so it is the reverse order of what we want. That is solved by usingDual (Endo a)instead:(appEndo . getDual . mconcat . map (Dual . Endo)) [f, g] xwill give you
g (f x)which is what we want.Examples are courtesy of EvilTerran
24 July 2008, 11:03 ammisterbeebee:
You could call it “stackApply”, since it applies the listed functions in a stack on top of the argument.
24 July 2008, 5:57 pmDaniel Lyons:
Micro-optimization: you could use
24 July 2008, 6:00 pmfoldr1instead, and omitid. :)Arnar:
Daniel, thanks, but that wouldn’t work here as
24 July 2008, 6:54 pmadaptersmay be the empty list.Websites tagged "parsing" on Postsaver:
[...] - Parsing annotated postfix operators with Haskell saved by rideonjp2008-10-16 - When the Passion for Search Technology meets the Logic of Inquiry. [...]
16 October 2008, 9:47 pmRecent URLs tagged Parsers - Urlrecorder:
[...] recorded first by Gaarafan220 on 2008-12-16→ Parsing annotated postfix operators with Haskell [...]
28 December 2008, 12:16 amimobi:
greatings…
Great job. But not enought info. Where can i read more?…
5 January 2009, 5:18 am