Recursive descent parser implementation

礼貌的吻别 2020-12-14 10:11

I am looking to write some pseudo-code of a recursive descent parser. Now, I have no experience with this type of coding. I have read some examples online but they only work

3 answers
  •  萌比男神i
    2020-12-14 10:34

    This is not the easiest grammar to start with, because your first production rule needs an unbounded amount of lookahead:

    S -> if E then S | if E then S else S |  begin S L | print E
    

    Consider

    if 5 then begin begin begin begin ...
    

    When do we find out whether an else follows?

    Also consider

    if 5 then if 4 then if 3 then if 2 then print 2 else ...
    

    Now, was that else supposed to bind to the if 5 then fragment? If not, that's actually cool, but be explicit.

    You can rewrite your grammar (depending on how you want else to bind) equivalently as:

    S -> if E then S (else S)? | begin S L | print E
    L -> end | ; S L
    E -> i
    

    Which may or may not be what you want. But the pseudocode sort of jumps out from this.

    define S() {
      if (peek()=="if") {
        consume("if")
        E()
        consume("then")
        S()
        if (peek()=="else") {
          consume("else")
          S()
        }
      } else if (peek()=="begin") {
        consume("begin")
        S()
        L()
      } else if (peek()=="print") {
        consume("print")
        E()
      } else {
        throw error()
      }
    }
    
    define L() {
      if (peek()=="end") {
        consume("end")
      } else if (peek()==";") {
        consume(";")
        S()
        L()
      } else {
        throw error()
      }
    }
    
    define E() {
      consume_token_i()
    }
    

    For each alternate, I created an if statement that looked at the unique prefix. The final else on any match attempt is always an error. I consume keywords and call the functions corresponding to production rules as I encounter them.
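
    The pseudocode takes peek() and consume() for granted. A minimal Python sketch of those two helpers might look like the following; the TokenStream class name and the error wording are assumptions made for illustration, not part of the question or of any library.

```python
class TokenStream:
    """One-token-lookahead cursor over a list of tokens (hypothetical helper)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        # Return the next token without consuming it, or None at end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def consume(self, expected):
        # Advance past `expected`, or fail if the next token is something else.
        actual = self.peek()
        if actual != expected:
            raise SyntaxError(
                f"expected {expected!r} but found {actual!r} at token {self.pos}"
            )
        self.pos += 1
        return actual


stream = TokenStream("print i".split())
first = stream.peek()      # looks at "print" without moving the cursor
stream.consume("print")    # moves past "print"
second = stream.peek()     # the cursor now sits on "i"
```

    Because peek() never advances the cursor, each production function can inspect the next token, pick an alternative, and only then commit by calling consume().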

    Translating from pseudocode to real code isn't too complicated, but it isn't trivial. Those peeks and consumes probably don't actually operate on strings. It's far easier to operate on tokens. And simply walking a sentence and consuming it isn't quite the same as parsing it. You'll want to do something as you consume the pieces, possibly building up a parse tree (which means these functions probably return something). And throwing an error might be correct at a high level, but you'd want to put some meaningful information into the error. Also, things get more complex if you do require lookahead.
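
    As a rough illustration of those points, here is one way the pseudocode could be turned into Python that returns a parse tree and reports where it failed. This is a sketch under assumptions: tokens are produced by splitting on whitespace, E accepts only the literal token i (as in the grammar above), and the tuple-based tree shape and the ParseError class are invented for the example.

```python
class ParseError(Exception):
    """Carries the token position so the caller can report where parsing failed."""


class Parser:
    def __init__(self, text):
        # Assumption: tokens are whitespace-separated, so ";" needs spaces around it.
        self.tokens = text.split()
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def consume(self, expected):
        if self.peek() != expected:
            raise ParseError(
                f"token {self.pos}: expected {expected!r}, found {self.peek()!r}"
            )
        self.pos += 1

    def parse(self):
        tree = self.S()
        if self.peek() is not None:
            raise ParseError(f"token {self.pos}: unexpected trailing {self.peek()!r}")
        return tree

    def S(self):
        token = self.peek()
        if token == "if":
            self.consume("if")
            cond = self.E()
            self.consume("then")
            then_branch = self.S()
            else_branch = None
            if self.peek() == "else":  # greedy: binds to the nearest "if"
                self.consume("else")
                else_branch = self.S()
            return ("if", cond, then_branch, else_branch)
        if token == "begin":
            self.consume("begin")
            # Flatten "S L" into a single list of statements.
            return ("block", [self.S()] + self.L())
        if token == "print":
            self.consume("print")
            return ("print", self.E())
        raise ParseError(f"token {self.pos}: expected a statement, found {token!r}")

    def L(self):
        if self.peek() == "end":
            self.consume("end")
            return []
        if self.peek() == ";":
            self.consume(";")
            stmt = self.S()
            return [stmt] + self.L()
        raise ParseError(
            f"token {self.pos}: expected 'end' or ';', found {self.peek()!r}"
        )

    def E(self):
        self.consume("i")
        return ("i",)


tree = Parser("if i then if i then print i else print i").parse()
block = Parser("begin print i ; print i end").parse()
```

    In tree, the else branch ends up inside the inner if node and the outer if has None for its else branch, which is the "nearest if" binding that the (else S)? rewrite implies. If you need a different binding, the grammar has to change, not just the code.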

    I would recommend Language Implementation Patterns by Terence Parr (the author of ANTLR, a parser generator that produces recursive descent parsers) when looking at these kinds of problems. The Dragon Book (Aho et al., recommended in a comment) is good, too; it is probably still the dominant textbook in compiler courses.
