Can parser combinators be made efficient?

北恋 2020-12-22 22:35

Around 6 years ago, I benchmarked my own parser combinators in OCaml and found that they were ~5× slower than the parser generators on offer at the time. I recently rev

4 answers
  •  感动是毒
    2020-12-22 23:12

    I'm currently working on the next version of FParsec (v. 0.9), which will in many situations improve performance by up to a factor of 2 relative to the current version.

    [Update: FParsec 0.9 has been released, see http://www.quanttec.com/fparsec ]

    I've tested Jon's F# parser implementation against two FParsec implementations. The first FParsec parser is a direct translation of djahandarie's parser. The second one uses FParsec's embeddable operator precedence component. As the input I used a string generated with Jon's OCaml script with parameter 10, which gives me an input size of about 2.66MB. All parsers were compiled in release mode and were run on the 32-bit .NET 4 CLR. I only measured the pure parsing time and didn't include startup time or the time needed for constructing the input string (for the FParsec parsers) or the char list (Jon's parser).
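
    A minimal sketch of how "pure parsing time" might be isolated, assuming a Stopwatch-based harness. The names parseStub and timeParse are hypothetical, the stub merely stands in for a real parser, and the warm-up run is an assumption about how startup cost could be excluded, not a description of the harness actually used for the numbers in this answer:

```fsharp
open System.Diagnostics

// Hypothetical stand-in for the parser under test: it just sums the digits.
// Swap in the real parse function when benchmarking an actual parser.
let parseStub (input: string) : int =
    let mutable acc = 0
    for c in input do
        if System.Char.IsDigit c then acc <- acc + int c - int '0'
    acc

// Times a single call to parse. The input is built by the caller beforehand,
// so input construction is never inside the timed region.
let timeParse (parse: string -> 'a) (input: string) : 'a * float =
    parse input |> ignore              // warm-up run (assumption): keeps JIT cost out of the timing
    let sw = Stopwatch.StartNew()
    let result = parse input
    sw.Stop()
    result, sw.Elapsed.TotalMilliseconds

// Build the input first, then measure only the parse.
let input = String.replicate 100000 "1+2*(3-4)/5"
let result, ms = timeParse parseStub input
printfn "result = %d, parse time = %.1f ms" result ms
```

    Repeating the timed call several times and taking the median would make such a measurement less sensitive to garbage collection pauses.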

    I measured the following numbers (updated numbers for v. 0.9 in parens):

    • Jon's hand-rolled parser: ~230ms
    • FParsec parser #1: ~270ms (~235ms)
    • FParsec parser #2: ~110ms (~102ms)

    In light of these numbers, I'd say that parser combinators can definitely offer competitive performance, at least for this particular problem, especially if you take into account that FParsec

    • automatically generates highly readable error messages,
    • supports very large files as input (with arbitrary backtracking), and
    • comes with a declarative, runtime-configurable operator-precedence parser module.

    Here's the code for the two FParsec implementations:

    Parser #1 (Translation of djahandarie's parser):

    open FParsec
    
    let str s = pstring s
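    // Forward reference, so that fact can use expr before expr is defined
    let expr, exprRef = createParserForwardedToRef()
    
    // fact: an integer literal or a parenthesized expression
    let fact = pint32 <|> between (str "(") (str ")") expr
    // term: left-associative chain of facts joined by multiplication or division
    let term = chainl1 fact ((str "*" >>% (*)) <|> (str "/" >>% (/)))
    // expr: left-associative chain of terms joined by addition or subtraction
    do exprRef := chainl1 term ((str "+" >>% (+)) <|> (str "-" >>% (-)))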
    
    let parse str = run expr str
    

    Parser #2 (Idiomatic FParsec implementation):

    open FParsec
    
    let opp = new OperatorPrecedenceParser<_,_,_>()
    type Assoc = Associativity
    
    let str s = pstring s
    let noWS = preturn () // dummy whitespace parser
    
    // Additive operators get precedence 1, multiplicative operators precedence 2;
    // all four are left-associative
    opp.AddOperator(InfixOperator("-", noWS, 1, Assoc.Left, (-)))
    opp.AddOperator(InfixOperator("+", noWS, 1, Assoc.Left, (+)))
    opp.AddOperator(InfixOperator("*", noWS, 2, Assoc.Left, (*)))
    opp.AddOperator(InfixOperator("/", noWS, 2, Assoc.Left, (/)))
    
    let expr = opp.ExpressionParser
    // A term is an integer literal or a parenthesized expression
    let term = pint32 <|> between (str "(") (str ")") expr
    opp.TermParser <- term
    
    let parse str = run expr str
    
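    Neither listing shows how the value returned by parse is consumed. A minimal sketch of one way to do it, assuming an F# script that pulls FParsec from NuGet via #r. The helper name describe is hypothetical; the parser is the same as Parser #1, written with exprRef.Value and explicit type arguments so the script stands alone:

```fsharp
#r "nuget: FParsec"
open FParsec

let str s = pstring s
// Explicit type arguments (int32 result, unit user state) avoid value-restriction issues
let expr, exprRef = createParserForwardedToRef<int32, unit>()
let fact = pint32 <|> between (str "(") (str ")") expr
let term = chainl1 fact ((str "*" >>% (*)) <|> (str "/" >>% (/)))
exprRef.Value <- chainl1 term ((str "+" >>% (+)) <|> (str "-" >>% (-)))

// Hypothetical helper: turn FParsec's ParserResult into a printable string
let describe (input: string) : string =
    match run expr input with
    | Success (value, _, _) -> sprintf "ok: %d" value
    | Failure (message, _, _) -> sprintf "error: %s" message

printfn "%s" (describe "1+2*(3-4)")
printfn "%s" (describe "1+*2")   // malformed input: takes the Failure branch
```

    The Failure case carries the generated error message as its first field, which is where the readable error messages mentioned above surface.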
