“undefined value, reference not allowed” workaround

Question

I'm looking for some clarification on the compiler error message "The value of xyz is undefined here, so reference is not allowed." when it comes up in do-notation. I could not reduce the problem to a more general example, so all I can give is the concrete case where I stumbled upon this behaviour. Sorry for that.

Using purescript-parsing, I want to write a parser which accepts nested multiline comments. To simplify the example, each comment starts with "(", ends with ")" and can contain either an "a" or another comment. For example, "(a)" and "((a))" are accepted, while "()", "(a" and "foo" are rejected.

The following code results in the error "The value of comment is undefined here, so reference is not allowed." on the line content <- string "a" <|> comment:

comment :: Parser String String
comment = do
  open <- string "("
  content <- commentContent
  close <- string ")"
  return $ open ++ content ++ close

commentContent :: Parser String String
commentContent = do
  content <- string "a" <|> comment
  return content

I can get rid of the error by inserting a line above content <- string "a" <|> comment which, as far as I understand, does not change the resulting parser at all:

commentContent :: Parser String String
commentContent = do
  optional (fail "")
  content <- string "a" <|> comment
  return content

The questions are:

What is happening here? Why does the extra line help?
What is a non-hacky way to get the code to compile?

Answer 1:

The reason the second case works becomes more apparent if you desugar the do manually:

commentContent :: Parser String String
commentContent =
  optional (fail "") >>= \_ ->
    string "a" <|> comment >>= \content ->
      return content

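Defined this way, the reference to comment sits inside a lambda, so it is not evaluated while commentContent itself is being defined. In the original version the do block desugars to (string "a" <|> comment) >>= \content -> return content, where comment appears outside any lambda and would have to be evaluated immediately, possibly before it has been initialized. That is what the compiler rejects.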

As for a non-hacky solution, I imagine it would involve fix, which lets you define recursive parsers like this:

myParser = fix \p -> do
   ... parser definition ....

Here p is a reference to myParser that you can use inside its own definition. For the case here, where the parsers are mutually recursive, I'm not sure how best to apply fix. I can think of a few options, but none is particularly elegant. Perhaps something like this:

parens :: Parser String String -> Parser String String
parens p = do
  open <- string "("
  content <- p
  close <- string ")"
  return $ open ++ content ++ close

comment :: Parser String String
comment = parens commentContent

commentContent :: Parser String String
commentContent = fix \p -> do
  content <- string "a" <|> parens p
  return content

It might be easier to use a trick similar to the odd do case above: put a Unit -> in front of one of the parsers, so that the recursive reference is delayed until the Unit value is supplied. Something like:

comment :: Parser String String
comment = do
  open <- string "("
  content <- commentContent unit
  close <- string ")"
  return $ open ++ content ++ close

commentContent :: Unit -> Parser String String
commentContent _ = do
  content <- string "a" <|> comment
  return content
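
The same delay can also be written with defer from Control.Lazy, which keeps commentContent a plain parser instead of a function. The following is only a sketch under assumptions that are not part of the question: it assumes a recent release of the parsing package (modules Parsing and Parsing.String; older releases used Text.Parsing.Parser), the module name CommentParser is made up, and it uses pure and <> where the code above uses return and ++.

```purescript
-- Hypothetical module name, chosen only for this sketch.
module CommentParser where

import Prelude

import Control.Alt ((<|>))
import Control.Lazy (defer)
-- Assumed module names from a recent release of the parsing package.
import Parsing (Parser)
import Parsing.String (string)

-- A comment is "(", then its content, then ")".
comment :: Parser String String
comment = do
  open <- string "("
  content <- commentContent
  close <- string ")"
  pure (open <> content <> close)

-- defer wraps the body in a Unit -> function, so the reference to
-- comment is only evaluated when the parser actually runs.
commentContent :: Parser String String
commentContent = defer \_ -> string "a" <|> comment
```

Since fix relies on the same Lazy instance that provides defer, this should work wherever the fix version does. Running comment with runParser on inputs such as (a) or ((a)) is a quick way to check that nesting is handled.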

Source: https://stackoverflow.com/questions/36984245/undefined-value-reference-not-allowed-workaround

Tags

purescript