How to further improve error messages in Scala parser-combinator based parsers?

Submitted by 江枫思渺然 on 2019-12-22 05:07:00

Question


I've coded a parser based on Scala parser combinators:

class SxmlParser extends RegexParsers with ImplicitConversions with PackratParsers {
    [...]
    lazy val document: PackratParser[AstNodeDocument] =
        ((procinst | element | comment | cdata | whitespace | text)*) ^^ {
            AstNodeDocument(_)
        }
    [...]
}
object SxmlParser {
    def parse(text: String): AstNodeDocument = {
        var ast = AstNodeDocument()
        val parser = new SxmlParser()
        val result = parser.parseAll(parser.document, new CharArrayReader(text.toArray))
        result match {
            case parser.Success(x, _) => ast = x
            case parser.NoSuccess(err, next) => {
                tool.die("failed to parse SXML input " +
                    "(line " + next.pos.line + ", column " + next.pos.column + "):\n" +
                    err + "\n" +
                    next.pos.longString)
            }
        }
        ast
    }
}

Usually the resulting parse error messages are rather nice. But sometimes the message is just

sxml: ERROR: failed to parse SXML input (line 32, column 1):
`"' expected but `' found
^

This happens when a quote character is not closed and the parser reaches the end of the text (EOT). What I would like to see here is (1) which production the parser was in when it expected the '"' (I have several) and (2) where in the input this production started parsing (which indicates where the opening quote is).

Does anybody know how I can improve the error messages and include more information about the internal parsing state at the time of the error, perhaps something like a stack trace of production rules, or whatever can reasonably be given to better identify the error location? BTW, the "line 32, column 1" above is the EOT position and hence of no use here.


Answer 1:


I don't know yet how to deal with (1), but I was also looking for (2) when I found this webpage:

https://wiki.scala-lang.org/plugins/viewsource/viewpagesrc.action?pageId=917624

I'm just copying the information:

A useful enhancement is to record the input position (line number and column number) of the significant tokens. To do this, you must do three things:

  • Make each output type extend scala.util.parsing.input.Positional
  • Invoke the Parsers.positioned() combinator
  • Use a text source that records line and column positions

and

Finally, ensure that the source tracks positions. For streams, you can simply use scala.util.parsing.input.StreamReader; for Strings, use scala.util.parsing.input.CharArrayReader.

I'm currently playing with it, so I'll try to add a simple example later.
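
The three steps quoted above can be sketched roughly as follows. This is only a minimal illustration, not the original poster's grammar: the QuotedString node, the PositionedDemo object and the tiny grammar are made-up names, and it assumes the scala-parser-combinators module is on the classpath (it has been a separate library since Scala 2.11).

```scala
import scala.util.parsing.combinator.RegexParsers
import scala.util.parsing.input.{CharArrayReader, Positional}

// Hypothetical AST node. Step 1: extending Positional adds a `pos` field.
case class QuotedString(value: String) extends Positional

object PositionedDemo extends RegexParsers {
  // Treat whitespace as significant (assumption: handled explicitly by the grammar).
  override val skipWhitespace = false

  // Step 2: positioned() records on the result the position where parsing started.
  lazy val quoted: Parser[QuotedString] =
    positioned("\"" ~> """[^"]*""".r <~ "\"" ^^ { s => QuotedString(s) })

  // Optional leading whitespace, then one quoted string.
  lazy val item: Parser[QuotedString] = """\s*""".r ~> quoted

  def main(args: Array[String]): Unit = {
    val input = "  \"hello\""
    // Step 3: a reader that tracks line and column positions.
    parseAll(item, new CharArrayReader(input.toCharArray)) match {
      case Success(node, _) =>
        println(s"${node.value} starts at line ${node.pos.line}, column ${node.pos.column}")
      case ns: NoSuccess =>
        println(s"failed at ${ns.next.pos}: ${ns.msg}")
    }
  }
}
```

Note that positioned() only helps once a node has been parsed successfully; by itself it does not change what a failed parse reports.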




Answer 2:


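In such cases you may use err, failure and ~! with production rules designed specifically to match the error. failure(msg) fails with a custom message but still allows backtracking, err(msg) produces an error that stops backtracking, and ~! is sequential composition that commits, so a failure on its right-hand side becomes an error.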
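
As a rough sketch of that idea (the QuoteErrors object, the productions and the message texts are all invented for illustration, and the scala-parser-combinators module is assumed to be available), a production can remember where it started and use err to report that position when the closing quote is missing:

```scala
import scala.util.parsing.combinator.RegexParsers

object QuoteErrors extends RegexParsers {
  // Treat whitespace as significant (assumption for this sketch).
  override val skipWhitespace = false

  // Hypothetical production for a double-quoted value.
  lazy val quoted: Parser[String] = Parser { in =>
    val start = in.pos // where this production began, i.e. the opening quote
    // Either the closing quote, or an error naming the production and its start.
    val closing = literal("\"") |
      err(s"unterminated quoted value opened at line ${start.line}, column ${start.column}")
    // `~!` commits after the opening quote: no backtracking into other alternatives.
    val p = literal("\"") ~! ("""[^"]*""".r <~ closing) ^^ { case _ ~ body => body }
    p(in)
  }

  lazy val word: Parser[String] = """[a-z]+""".r

  // failure() replaces the generic message when no alternative matches.
  lazy val value: Parser[String] =
    quoted | word | failure("quoted value or word expected")

  def main(args: Array[String]): Unit = {
    for (input <- Seq("\"ok\"", "\"oops", "123"))
      parseAll(value, input) match {
        case Success(v, _) => println(s"parsed: $v")
        case ns: NoSuccess => println(s"${ns.msg} (reported at ${ns.next.pos})")
      }
  }
}
```

The error is still reported at the end of the input, but the message now names the production and the position of the opening quote, which covers points (1) and (2) of the question for this one rule. Each production that should report this way needs its own such message.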



Source: https://stackoverflow.com/questions/2906674/how-to-further-improve-error-messages-in-scala-parser-combinator-based-parsers
