All of the parsers in Text.Parsec.Token politely use lexeme to eat whitespace after a token. Unfortunately for me, that whitespace includes newlines, which I do not want consumed. Is it possible to stop lexeme from eating them?
No, it is not. Here is the relevant code.
From Text.Parsec.Token:
lexeme p
    = do{ x <- p; whiteSpace; return x }

--whiteSpace
whiteSpace
    | noLine && noMulti = skipMany (simpleSpace <?> "")
    | noLine            = skipMany (simpleSpace <|> multiLineComment <?> "")
    | noMulti           = skipMany (simpleSpace <|> oneLineComment <?> "")
    | otherwise         = skipMany (simpleSpace <|> oneLineComment <|> multiLineComment <?> "")
    where
      noLine  = null (commentLine languageDef)
      noMulti = null (commentStart languageDef)
One will notice in the where clause of whiteSpace that the only options considered concern comments: whether the language defines line comments or block comments. There is no hook for excluding newlines from simpleSpace. The lexeme function uses whiteSpace, and lexeme is used liberally throughout Text.Parsec.Token.
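If all you need is whitespace-but-not-newlines, one workaround is to sidestep Text.Parsec.Token entirely and define your own lexeme that skips only horizontal whitespace. A minimal sketch, where hspace, lexeme', and symbol' are my own names, not part of Parsec:

```haskell
module Main where

import Text.Parsec
import Text.Parsec.String (Parser)

-- Skip spaces and tabs only; never consume a newline.
hspace :: Parser ()
hspace = skipMany (oneOf " \t")

-- A newline-preserving replacement for Token.lexeme.
lexeme' :: Parser a -> Parser a
lexeme' p = do { x <- p; hspace; return x }

symbol' :: String -> Parser String
symbol' = lexeme' . string

main :: IO ()
main = do
  -- Horizontal whitespace is eaten between tokens...
  print (parse (symbol' "foo" >> symbol' "bar") "" "foo   bar")
  -- ...but a newline is left in the input for the grammar to handle.
  print (parse (symbol' "foo" >> newline >> symbol' "bar") "" "foo\nbar")
```

This works for small grammars, but every token parser from Text.Parsec.Token (identifier, integer, parens, and so on) would have to be rebuilt by hand, which is part of why I eventually gave up on this route.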
The ultimate solution for me was to use a proper lexical analyser (Alex). Parsec does a very good job as a parsing library, and it is a credit to its design that it can be mangled into doing lexical analysis, but for all but small and simple projects this quickly becomes unwieldy. I now use Alex to produce a linear stream of tokens, which Parsec then turns into an AST.
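For the second stage, Parsec can consume a token list instead of a String via tokenPrim. A rough sketch of the glue, assuming the lexer hands back position-annotated tokens; the Tok type and helpers here are illustrative, not actual Alex output:

```haskell
module Main where

import Text.Parsec
import Text.Parsec.Pos (SourcePos, initialPos)

-- Illustrative token type; a real Alex lexer would produce something similar.
data Tok = TIdent String | TNum Int | TNewline
  deriving (Show, Eq)

-- Parsec works over any Stream instance; a plain list of tokens qualifies.
type TokParser a = Parsec [(SourcePos, Tok)] () a

-- Accept a token when the predicate yields a value.
satisfyTok :: (Tok -> Maybe a) -> TokParser a
satisfyTok f = tokenPrim showTok nextPos testTok
  where
    showTok (_, t)        = show t
    testTok (_, t)        = f t
    -- Report the position of the next token, or stay put at end of input.
    nextPos _ (pos, _) ts = case ts of
      ((pos', _) : _) -> pos'
      []              -> pos

ident :: TokParser String
ident = satisfyTok $ \t -> case t of TIdent s -> Just s; _ -> Nothing

main :: IO ()
main = do
  let toks = [(initialPos "<input>", TIdent "x")]
  print (parse ident "<input>" toks)
```

Because newlines arrive as ordinary tokens (TNewline above), the grammar can match or ignore them explicitly instead of having them silently eaten by lexeme.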