Grammar: difference between a top down and bottom up?

此生再无相见时 提交于 2019-12-06 02:19:17

问题


What is the difference between a top down and bottom up grammar? An example would be awesome.


回答1:


First of all, the grammar itself isn't top-down or bottom-up, the parser is (though there are grammars that can be parsed by one but not the other).

From a practical viewpoint, the main difference is that most hand-written parsers are top-down, while a much larger percentage of machine-generated parsers are bottom-up (though, of course, the reverse is certainly possible).

A top-down parser typically uses recursive descent, which typically means a structure something like this (using typical mathematical expressions as an example):

expression() { term() [-+] expression }
term() { factor() [*/] term() }
factor() { operand() | '(' expression() ')' }

A bottom-up parser work in the reverse direction -- where a recursive descent parser starts from the full expression, and breaks it down into smaller and smaller pieces until it reaches the level of individual tokens, a bottom-up parser starts from the individual tokens, and uses tables of rules about how those tokens fit together into higher and higher levels of the expression hierarchy until it reaches the top level (what's represented as "expression" above).

Edit: To clarify, perhaps it would make sense to add a really trivial parser. In this case, I'll just do the old classic of converting a simplified version of a typical mathematical expression from infix to postfix:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void expression(void);

void show(int ch) { 
    putchar(ch);
    putchar(' ');
}

int token() { 
    int ch;
    while (isspace(ch=getchar()))
        ;
    return ch;
}

void factor() { 
    int ch = token();
    if (ch == '(') {
        expression();
        ch = token();
        if (ch != ')') {
            fprintf(stderr, "Syntax error. Expected close paren, found: %c\n", ch);
            exit(EXIT_FAILURE);
        }
    }
    else
        show(ch);
}

void term() { 
    int ch;
    factor();
    ch = token();
    if (ch == '*' || ch == '/') {
        term();
        show(ch);
    }
    else
        ungetc(ch, stdin);
}

void expression() {
    int ch;
    term();
    ch = token();
    if (ch == '-' || ch=='+') {
        expression();
        show(ch);
    }
    else 
        ungetc(ch, stdin);
}

int main(int argc, char **argv) {
    expression();
    return 0;
}

Note that the lexing here is pretty stupid (it basically just accepts a single character as a token) and the expressions allowed are quite limited (only +-*/). OTOH, it's good enough to handle an input like:

1+2*(3+4*(5/6))

from which it does produce what I believe is correct output:

1 2 3 4 5 6 / * + * +




回答2:


Afaik it doesn't make any difference for the grammar itself, but it does for the parser.

Wikipedia has a quite lengthy explanation of both bottom-up and top-down parsing.

Generally the (imho) more intuitive way is top-down. You start with the start symbol and apply the transformation rules that fit, while with bottom-up you need to apply transformation rules backwards (which usually created quite a headache for me).



来源:https://stackoverflow.com/questions/3181960/grammar-difference-between-a-top-down-and-bottom-up

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!