Scala PackratParsers does not backtrack as it should?

Submitted by ぃ、小莉子 on 2019-12-13 00:35:10

Question


I have the following code for a simple parser of logical expressions:

import scala.util.parsing.combinator.RegexParsers
import scala.util.parsing.combinator.PackratParsers


object Parsers extends RegexParsers with PackratParsers

// Entities definition
sealed trait LogicalUnit
case class Variable(name: String) extends LogicalUnit
case class Not(arg: LogicalUnit) extends LogicalUnit
case class And(arg1: LogicalUnit, arg2: LogicalUnit) extends LogicalUnit


import Parsers._

// In order of descending priority
lazy val pattern: PackratParser[LogicalUnit] =
  ((variable) | (not) | (and))

lazy val variable: PackratParser[Variable] =
  "[a-zA-Z]".r ^^ { n => Variable(n) }

lazy val not: PackratParser[Not] =
  ("!" ~> pattern) ^^ { x => Not(x) }

lazy val and: PackratParser[And] =
  ((pattern <~ "&") ~ pattern) ^^ { case a ~ b => And(a, b) }


// Execution
println(Parsers.parseAll(pattern, "!a & !b"))

Trying to parse the string !a & !b fails with:

[1.4] failure: string matching regex `\z' expected but `&' found

!a & !b
   ^

It seems that the root parser parses the input as pattern -> not -> variable and doesn't backtrack when it discovers that !a is not the whole string, so pattern -> and isn't even tried. I thought that using PackratParsers would solve that, but it didn't.

What am I doing wrong?


Answer 1:


I don't think there is any way to make one of these parsers backtrack once it has successfully accepted something. If an alternative succeeds, no other alternatives are tried. This behaviour is intrinsic to the packrat parsing method for Parsing Expression Grammars that these combinators implement (as opposed to Context-Free Grammars, where the order of alternatives is not relevant and backtracking behaviour depends on the parsing method). That is why alternatives that may match longer input should be listed first.
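The ordered-choice behaviour can be seen in isolation with a much smaller grammar. The following is only a minimal sketch: the object name OrderedChoiceDemo and the parsers shortFirst and longFirst are made up for illustration, and it assumes the scala-parser-combinators module is on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical demo object, not part of the question's code.
object OrderedChoiceDemo extends RegexParsers {
  // "a" succeeds first, so "ab" is never tried, even though
  // parseAll then finds a leftover "b".
  val shortFirst: Parser[String] = "a" | "ab"

  // Listing the longer alternative first lets it consume the whole input.
  val longFirst: Parser[String] = "ab" | "a"

  def main(args: Array[String]): Unit = {
    println(parseAll(shortFirst, "ab").successful) // false
    println(parseAll(longFirst, "ab").successful)  // true
  }
}
```

The same thing happens in the question: variable and not are listed before and, so once not has accepted !a the and alternative is never attempted.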

Regarding the precedence of not versus and, the standard approach is to encode the precedence and associativity of operators in the grammar rules as you would for Context-Free Grammars. Most books on parsing will describe how to do this. You can see one version in the following notes starting at slide 24: http://www.sci.usq.edu.au/courses/CSC3403/lect/syntax-1up.pdf.
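As a rough illustration of that approach (not taken from the linked notes), the grammar can be split into one rule per precedence level. The names LayeredParsers, expr, unary and atom are assumptions made for this sketch, and the left-associative chain of & is written with rep instead of a left-recursive rule, so plain RegexParsers is enough here.

```scala
import scala.util.parsing.combinator.RegexParsers

sealed trait LogicalUnit
case class Variable(name: String) extends LogicalUnit
case class Not(arg: LogicalUnit) extends LogicalUnit
case class And(arg1: LogicalUnit, arg2: LogicalUnit) extends LogicalUnit

// Hypothetical layered grammar: each rule handles one precedence level.
object LayeredParsers extends RegexParsers {
  // Lowest precedence: one or more unary terms joined by "&",
  // folded to the left so that a & b & c becomes And(And(a, b), c).
  lazy val expr: Parser[LogicalUnit] =
    unary ~ rep("&" ~> unary) ^^ {
      case first ~ rest => rest.foldLeft(first)(And(_, _))
    }

  // "!" binds tighter than "&" because it can only wrap another unary term.
  lazy val unary: Parser[LogicalUnit] =
    ("!" ~> unary) ^^ { x => Not(x) } | atom

  // Highest precedence: a variable or a parenthesised expression.
  lazy val atom: Parser[LogicalUnit] =
    variable | ("(" ~> expr <~ ")")

  lazy val variable: Parser[Variable] =
    "[a-zA-Z]".r ^^ { n => Variable(n) }

  def main(args: Array[String]): Unit = {
    // prints And(Not(Variable(a)),Not(Variable(b)))
    println(parseAll(expr, "!a & !b").get)
    // prints Not(And(Variable(a),Variable(b)))
    println(parseAll(expr, "!(a & b)").get)
  }
}
```

With this layering the order of alternatives no longer decides precedence: ! can never swallow an & unless the operand is parenthesised.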




Answer 2:


I don't know the specific reason, but whenever I have run into this kind of problem with Parsers, I have ordered the alternatives from the most complex to the simplest.

In your case it would be

lazy val pattern: PackratParser[LogicalUnit] =
  ((and) | (not) | (variable))

This makes your example parse.

The result, however, is Not(And(Variable(a),Not(Variable(b)))), which might not be what you want.

The reason is that a & !b is a valid pattern, so !a & !b can be parsed starting from not.

To change that, you can introduce parentheses. This is one simple possibility:

lazy val not: PackratParser[Not] =
  ("!" ~> term) ^^ { x => Not(x) }

lazy val term: PackratParser[LogicalUnit] = 
  variable | "(" ~> and <~ ")" 

Now the result is And(Not(Variable(a)),Not(Variable(b))).



Source: https://stackoverflow.com/questions/26453870/scala-packratparsers-does-not-backtrack-as-it-should
