Haskell Parsec combinator 'many' is applied to a parser that accepts an empty string

Submitted by 无人久伴 on 2019-12-22 07:06:13

Question


import Text.ParserCombinators.Parsec

delimiter :: Parser ()
delimiter = do char '|'
               return ()
          <?> "delimiter"


eol :: Parser ()
eol = do oneOf "\n\r"
         return ()
    <?> "end of line"

item :: Parser String
item = do entry <- manyTill anyChar (try eol <|> try delimiter <|> eof)
          return entry

items :: Parser [String]
items = do result <- many item
           return result

When I run parseTest items "a|b|c" with the code above I get the following error:

*** Exception: Text.ParserCombinators.Parsec.Prim.many: 
combinator 'many' is applied to a parser that accepts an empty string.

I believe it has something to do with eof and many item. If I remove eof, I can get it to work, but only as long as the input does not end right after the last item, which makes it kind of useless.

I realize I could just use sepBy but what I am interested in is why this code does not work and how to make it work.


Answer 1:


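A combinator like many cannot be applied to a parser that accepts the empty string, because this makes the grammar ambiguous: how many times should the empty string be parsed? Different choices can lead to different parse results, and a parser that keeps succeeding without consuming input would never let many terminate, so Parsec raises this error instead.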
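The check can be reproduced without the code from the question. The following is a minimal sketch, assuming the parsec package is installed; nested and raisesError are hypothetical names introduced only for this illustration.

```haskell
import qualified Control.Exception as E
import Text.ParserCombinators.Parsec

-- The inner 'many (char 'a')' succeeds without consuming input whenever
-- the next character is not 'a', so the outer 'many' is applied to a
-- parser that accepts the empty string.
nested :: Parser [String]
nested = many (many (char 'a'))

-- Hypothetical helper: run the parser and report whether an 'error'
-- call escaped instead of an ordinary parse result.
raisesError :: String -> IO Bool
raisesError input = do
  outcome <- E.try (E.evaluate (parse nested "" input))
               :: IO (Either E.ErrorCall (Either ParseError [String]))
  return $ case outcome of
    Left _  -> True   -- the runtime check inside 'many' fired
    Right _ -> False  -- the parser returned normally
```

Calling raisesError "b" in GHCi should return True, since the inner parser succeeds on no input as soon as it meets the 'b'.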

You are right to assume that many item is the problematic combination. An item is defined in terms of manyTill. (As an aside, you can simplify the definition of item to

item :: Parser String
item = manyTill anyChar (eol <|> delimiter <|> eof)

No need for the do or the return, and no need for try, because each of the three parsers expects a different first token.) The parser manyTill thus parses an arbitrary number of characters, followed by either an eol, a delimiter, or an eof. Now, eol and delimiter each consume exactly one character when they succeed, but eof consumes nothing. The parser eof succeeds at the end of the input, and it can be applied any number of times there. For example,

ghci> parseTest (do { eof; eof }) ""
()

Because eof does not consume any input, item can succeed on the empty string (at the end of your input), and that is what causes the ambiguity.
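The same effect can be observed on item itself. The sketch below reuses the simplified item from above; itemOnEmpty and threeItemsOnEmpty are hypothetical names added for illustration.

```haskell
import Text.ParserCombinators.Parsec

delimiter :: Parser ()
delimiter = (char '|' >> return ()) <?> "delimiter"

eol :: Parser ()
eol = (oneOf "\n\r" >> return ()) <?> "end of line"

-- The simplified item: any characters up to eol, delimiter, or eof.
item :: Parser String
item = manyTill anyChar (eol <|> delimiter <|> eof)

-- On empty input, eof matches immediately, so item succeeds with an
-- empty result and consumes nothing.
itemOnEmpty :: Either ParseError String
itemOnEmpty = parse item "" ""

-- Since nothing is consumed, item can succeed again and again at the
-- end of input; here it is run three times in a row.
threeItemsOnEmpty :: Either ParseError [String]
threeItemsOnEmpty = parse (count 3 item) "" ""
```

Here itemOnEmpty evaluates to Right "" and threeItemsOnEmpty to Right ["","",""]. Nothing bounds how often item could succeed at that position, which is the situation many refuses to handle.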

To fix this, you can indeed rewrite your grammar and move to something like sepBy, or you can try to distinguish normal items (where eof isn't allowed as end-marker) from the final item (where eof is allowed).
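Both options can be sketched as follows. This is one possible reading of the suggestion, not the only way to structure it; terminatedItem, finalItem, itemsSplit, field and itemsSepBy are hypothetical names chosen for this example.

```haskell
import Text.ParserCombinators.Parsec

delimiter :: Parser ()
delimiter = (char '|' >> return ()) <?> "delimiter"

eol :: Parser ()
eol = (oneOf "\n\r" >> return ()) <?> "end of line"

-- Option 1: split the grammar. A terminated item always consumes its
-- terminator, so 'many' never sees a parser that succeeds on no input.
terminatedItem :: Parser String
terminatedItem = manyTill anyChar (eol <|> delimiter)

-- The final item runs to the end of input and is parsed exactly once.
finalItem :: Parser String
finalItem = manyTill anyChar eof

-- 'try' undoes the characters read by a terminatedItem that reaches the
-- end of input before finding a terminator, leaving them for finalItem.
itemsSplit :: Parser [String]
itemsSplit = do
  initial <- many (try terminatedItem)
  final   <- finalItem
  return (initial ++ [final])

-- Option 2: sepBy. A field may be empty, because the separator between
-- two fields always consumes a character.
field :: Parser String
field = many (noneOf "|\n\r")

itemsSepBy :: Parser [String]
itemsSepBy = do
  fields <- field `sepBy` (eol <|> delimiter)
  eof
  return fields
```

With these definitions, parseTest itemsSplit "a|b|c" and parseTest itemsSepBy "a|b|c" both print ["a","b","c"]. In this sketch a trailing delimiter, as in "a|b|", produces an empty last item in both variants.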




Answer 2:


That's because there are infinitely many ways to parse an empty string as many emptyString.



Source: https://stackoverflow.com/questions/19899931/haskell-parsec-combinator-many-is-applied-to-a-parser-that-accepts-an-empty-st
