I decided to pull out Haskell for my Linguistics 1101 Phrase Structure Rules Assignment. It seemed like a good opportunity to play with these monadic parser combinator things, which sound impressive if nothing else. The result was pleasing, although I’m not sure if my tutor will appreciate it.
It was fun revisiting Haskell, and writing parsers directly using Parsec is certainly a novel alternative to using a Bison-style compiler-compiler. Spirit was similar, but C++ can become so syntactically clunky that some of the joy is lost.
I’m not sure whether it was something specific to the Parsec paradigm, my abuse of Parsec, or my ignorance of Haskell and monadic programming in general, but I kept finding myself on the wrong side of Monads and do-expressions. It seems you have to use liftM a lot.
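To make the liftM point concrete, here is a minimal sketch of the pattern. It uses ReadP from base rather than Parsec so that it runs without extra packages; the parsers (number, sumExpr) and the parseAll helper are made up for illustration and are not from the assignment.

```haskell
import Control.Monad (liftM, liftM2)
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, eof, munch1, readP_to_S)

-- Parse one or more digits and convert them to an Int.
-- liftM applies the pure function `read` to the parser's result.
number :: ReadP Int
number = liftM read (munch1 isDigit)

-- The same parser written out as a do-expression.
number' :: ReadP Int
number' = do
  ds <- munch1 isDigit
  return (read ds)

-- Parse "a+b" and add the two numbers.
-- liftM2 combines the results of two parsers with a pure function.
sumExpr :: ReadP Int
sumExpr = liftM2 (+) number (char '+' >> number)

-- Hypothetical helper: succeed only if the parser consumes the whole input
-- and yields exactly one result.
parseAll :: ReadP a -> String -> Maybe a
parseAll p s = case readP_to_S (p <* eof) s of
  [(x, "")] -> Just x
  _         -> Nothing
```

With these definitions, parseAll sumExpr "12+30" gives Just 42. The liftM forms are just shorthand for the do-expression version, which is why they turn up so often when a parser's result only needs a pure function applied to it.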
In looking for a generalisation of the liftMn functions I came up with:
    foldMLr :: (Monad m) => (t -> a -> a) -> a -> [m t] -> m a
    -- foldMLr f u xs binds the monads in xs headfirst, and folds their results
    -- from the right using f and u as the rightmost.
    foldMLr _ u [] = return u
    foldMLr f u (x:xs) = do { a <- x ; b <- foldMLr f u xs ; return (f a b) }
    -- equivalently:
    -- foldMLr f u (x:xs) = liftM2 f x (foldMLr f u xs)
which is not the same as foldM but is a generalisation of sequence, which can be defined as foldMLr (:) []. I didn’t end up using it in the final parser.
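A small sketch of how foldMLr relates to sequence and differs from foldM, using the Maybe monad so it needs nothing beyond base. The names sequence', sumAll and safeSum are hypothetical, chosen only for this illustration.

```haskell
import Control.Monad (foldM, liftM2)

-- The definition from above, in its liftM2 form.
foldMLr :: Monad m => (t -> a -> a) -> a -> [m t] -> m a
foldMLr _ u []     = return u
foldMLr f u (x:xs) = liftM2 f x (foldMLr f u xs)

-- With (:) and [] it collects the results in order, like sequence.
sequence' :: Monad m => [m a] -> m [a]
sequence' = foldMLr (:) []

-- With another pure function it folds the results instead of collecting them.
sumAll :: [Maybe Int] -> Maybe Int
sumAll = foldMLr (+) 0

-- foldM, by contrast, takes a list of plain values and a monadic step
-- function, and folds from the left.
safeSum :: [Int] -> Maybe Int
safeSum = foldM step 0
  where
    step acc n = if n < 0 then Nothing else Just (acc + n)
```

Here sequence' [Just 1, Just 2] gives Just [1,2], and a single Nothing anywhere in the list makes the whole result Nothing. The same function can also be written as foldr (liftM2 f) (return u), which makes the "fold from the right" reading explicit.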
Another issue was that constructing a parse tree (using Data.Tree types) was actually somewhat tedious. I guess Parsec assumes that you want to fold up the result within the parsers.
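For a feel of what building the tree by hand looks like, here is a rough sketch with a toy grammar. It is not the assignment's grammar or code: it uses ReadP from base instead of Parsec, and the word and phrase helpers, the category names and the vocabulary are all invented for the example.

```haskell
import Data.Tree (Tree(..), drawTree)
import Text.ParserCombinators.ReadP (ReadP, choice, eof, readP_to_S, skipSpaces, string)

-- Hypothetical helper: match one of the given words and wrap it in a
-- node labelled with its category, e.g. Node "N" [Node "dog" []].
word :: String -> [String] -> ReadP (Tree String)
word cat ws = do
  skipSpaces
  w <- choice (map string ws)
  return (Node cat [Node w []])

-- Hypothetical helper: run the sub-parsers in order and hang their
-- trees under a node labelled with the phrase category.
phrase :: String -> [ReadP (Tree String)] -> ReadP (Tree String)
phrase cat ps = do
  kids <- sequence ps
  return (Node cat kids)

-- Toy phrase structure rules: S -> NP VP, NP -> Det N, VP -> V NP.
det, noun, verb, np, vp, sentence :: ReadP (Tree String)
det      = word "Det" ["the", "a"]
noun     = word "N" ["dog", "cat"]
verb     = word "V" ["sees", "chases"]
np       = phrase "NP" [det, noun]
vp       = phrase "VP" [verb, np]
sentence = phrase "S" [np, vp]

-- Succeed only on a single parse that consumes the whole input.
parseSentence :: String -> Maybe (Tree String)
parseSentence s = case readP_to_S (sentence <* skipSpaces <* eof) s of
  [(t, "")] -> Just t
  _         -> Nothing

-- Render the tree with drawTree, or report failure.
showParse :: String -> String
showParse = maybe "no parse" drawTree . parseSentence
```

Running putStr (showParse "the dog sees a cat") prints the tree as indented text. Every rule has to rebuild a Node explicitly, which is the tedium in question; pulling the Node construction into helpers like word and phrase is one way to keep it in one place.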
Also watch out for the change in showErrorMessages: in ghc it takes some extra initial string arguments that weren’t there in the standalone release.