What does syntax directed translation mean?

庸人自扰 2020-12-30 10:18

Can anyone explain, in simple terms, what "Syntax Directed Translation" means? I started to read about the topic in the Dragon Book but couldn't understand it. The Wiki

2 answers
  •  鱼传尺愫
    2020-12-30 10:59

    Actually, no. Historically, there were syntax directed compilers before the Dragon Book. Attending ACM SIGPLAN meetings in the late 1960s, I learned of several types of directed translation; tree directed and graph directed translation were also discussed. I think these got muddled together in the Dragon Book, though I have never owned it. My favorite book was Programming Systems and Languages by Saul Rosen, a collection of papers on compilers, operating systems and computer systems. I'll try to explain the early syntax directed compiler parser programming languages. The later ones, which produced trees, were combined with tree directed code generating languages.

    Early syntax directed compilers translated source directly to stack machine code. The Burroughs B5000 ALGOL compiler is an example.

    A*(B+C) -> A,B,C,ADD,MPY
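
    To make that translation concrete, here is a minimal Python sketch of a translator that emits stack code while it parses, with no tree in between. The grammar, the single-letter operands and the function names are assumptions made for illustration; this is not how the B5000 compiler was written.

```python
# Sketch only: a tiny syntax-directed translator for single-letter operands.
# Grammar and names are illustrative assumptions, not the B5000 ALGOL compiler.

def translate(src):
    """Translate an infix expression to stack-machine code while parsing it."""
    toks = [c for c in src if not c.isspace()]
    pos = 0
    out = []

    def peek():
        return toks[pos] if pos < len(toks) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():                      # expr = term $('+' term | '-' term)
        term()
        while peek() in ('+', '-'):
            op = peek()
            advance()
            term()
            out.append('ADD' if op == '+' else 'SUB')   # emitted after the right operand

    def term():                      # term = factor $('*' factor | '/' factor)
        factor()
        while peek() in ('*', '/'):
            op = peek()
            advance()
            factor()
            out.append('MPY' if op == '*' else 'DIV')

    def factor():                    # factor = letter | '(' expr ')'
        t = peek()
        if t == '(':
            advance()
            expr()
            if peek() != ')':
                raise SyntaxError("expected ')'")
            advance()
        elif t is not None and t.isalpha():
            out.append(t)            # operand: emit a push of the variable
            advance()
        else:
            raise SyntaxError(f"unexpected token {t!r}")

    expr()
    if pos != len(toks):
        raise SyntaxError("trailing input")
    return out

print(",".join(translate("A*(B+C)")))   # A,B,C,ADD,MPY
```

    The output order falls out of where the emit calls sit in the parsing functions, which is the sense in which the syntax directs the translation.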

    Schorre's META II, a domain specific parser programming language (a compiler compiler) developed in the 1960s, is an example of a syntax directed compiler. You can find the original META II paper in the ACM archive. META II avoids left recursion by using the $ zero-or-more sequence operator and ( ) grouping.

    EXPR = TERM $('+' TERM .OUT 'ADD'|'-' TERM .OUT 'SUB');
    

    Later Schorre based metalanguage compilers translated to trees using stack based tree transformation operators :<node name> and !<number>.

    EXPR = TERM $(('+':ADD|'-':SUB) TERM!2);
    

    TREEMETA was the exception: it used [<number>] instead of !<number>. The EXPR formula above is basically the same as the META II EXPR, except that recognition of the + and - operators has been factored out, creating the corresponding node and pushing it onto the node stack. Then, on recognizing the right TERM, the tree constructor !2 builds a tree by popping the top 2 parse stack entries (the <TERM>s) and the top node from the node stack:

        ADD    or    SUB
       /   \        /   \
    TERM   TERM  TERM   TERM
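
    The two-stack mechanism can be modeled in a few lines of Python. This is only a sketch: the function names are made up for illustration, and TERM is simplified to a single pre-split token.

```python
# Sketch only: models the node stack and parse stack described above.
# Function names are illustrative assumptions; TERM is a single token here.

parse_stack = []   # recognized operands and already-built trees
node_stack = []    # node names pushed by :NAME

def push_node(name):
    """Models  :ADD  or  :SUB  (push a node name)."""
    node_stack.append(name)

def make_tree(n):
    """Models  !n  (n > 0): pop n parse stack entries and one node, push the tree."""
    branches = parse_stack[-n:]
    del parse_stack[-n:]
    parse_stack.append((node_stack.pop(), *branches))

def parse_expr(tokens):
    """EXPR = TERM $(('+':ADD | '-':SUB) TERM !2);"""
    it = iter(tokens)
    parse_stack.append(next(it))                  # first TERM
    for op in it:                                 # the $ loop
        push_node({'+': 'ADD', '-': 'SUB'}[op])
        parse_stack.append(next(it))              # right TERM
        make_tree(2)
    return parse_stack.pop()

print(parse_expr(['a', '+', 'b', '-', 'c']))
# ('SUB', ('ADD', 'a', 'b'), 'c')
```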
    

    Tokens were recognized by the supplied recognizers .ID, .NUMBER and .STRING. In CWIC these were later replaced by token ".." and character class ":" formulas:

    id .. let $(let|dgt|+'_');
    

    Tree directed compiler languages were combined with the syntax directed compilers to generate code. The CWIC compiler compiler, developed at System Development Corporation, included a LISP 2 based tree directed generator language. A short paper on CWIC can be found in the ACM archives.

    In the parser programming languages you are programming a type of recursive descent parser. By the time you get to CWIC, all the problems that today are attributed to recursive descent parsers had been eliminated. There is no left recursion problem, as the $ zero or more construct and programmed tree construction eliminate the need for left recursion. You control the tree construction: a loop construct is used to produce a left handed tree, and tail recursion a right handed tree. Parsing formulas may also generate no tree at all:

    program = $declarations;
    

    In the above, the $ zero or more loop operator preceding declarations specifies that declarations is to be called repeatedly for as long as it returns success. The input source code being compiled is made up of any number of declarations. The declarations formula would then define the types of declarations. You might need external linkage declarations, data declarations, and function or procedure code declarations.

    declarations = linkage_decl | data_decl | code_decl;
    

    Each type of declaration is a separate formula. The syntax language controls when semantic processing and code generation occur. The program and declarations formulas above do not produce trees; they simply control when and what language structures are parsed. These are neither LL nor LR parsers. They provide unlimited (limited only by available memory) programmed backtracking, and they provide programmed look ahead and peek ahead tests.
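
    As a rough model of the program and declarations formulas, each formula can be written as a Python function that returns True on success and False on failure. The three declaration forms here are hypothetical stand-ins keyed on a leading keyword ('extern', 'data', 'proc'); CWIC's real declaration syntax and its backtracking are not modeled.

```python
# Sketch only: each formula is a function returning True (success) or False (failure).
# The keyword-based declaration forms are hypothetical, chosen just for illustration.

def parse_program(lines):
    pos = 0
    seen = []

    def keyword_decl(keyword, kind):
        def rule():
            nonlocal pos
            if pos < len(lines) and lines[pos].startswith(keyword):
                seen.append(kind)
                pos += 1              # consume the declaration
                return True
            return False              # fail without consuming input
        return rule

    linkage_decl = keyword_decl('extern', 'linkage')
    data_decl = keyword_decl('data', 'data')
    code_decl = keyword_decl('proc', 'code')

    def declarations():       # declarations = linkage_decl | data_decl | code_decl;
        return linkage_decl() or data_decl() or code_decl()

    def program():            # program = $declarations;
        while declarations():
            pass
        return True           # zero or more: the loop itself always succeeds

    program()
    return seen, pos

print(parse_program(['extern f', 'data x', 'proc main', '???']))
# (['linkage', 'data', 'code'], 3)
```

    No tree is built anywhere in this sketch; the functions only decide which structure is parsed next, and parsing stops at the first line no alternative accepts.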

    As a last example, the following formulas, including token and character class formulas, illustrate producing both left and right handed trees. Exponentiation in particular uses tail recursion to build a right handed tree.

    assign = id '=' expr ';' :ASSIGN!2 arith_gen[*1];
    expr   = term $(('+':ADD | '-':SUB) term !2);
    term   = factor $(('*':MPY | '//' :REM | '/':DIV) factor!2);
    factor = ( id ('(' +[ arg $(',' arg) ]+ ')' :CALL!2 | .EMPTY)
             | number 
             | '(' expr ')'
             )  ('^' factor:EXP!2 | .EMPTY);
    
    bin: '0'|'1';
    oct: bin|'2'|'3'|'4'|'5'|'6'|'7';
    dgt: oct|'8'|'9';
    hex: dgt|'A'|'B'|'C'|'D'|'E'|'F'|'a'|'b'|'c'|'d'|'e'|'f';
    upr: 'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|
         'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z';
    lwr: 'a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'| 
         'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z';
    alpha:    upr|lwr;
    alphanum: alpha|dgt;
    
    number .. dgt $dgt MAKENUM[];
    id .. alpha $(alphanum|+'_');
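
    The difference between the loop in expr and the recursion in factor can be seen in a small Python sketch. It assumes pre-split tokens and plain names as operands, and the two function names are invented for illustration.

```python
# Sketch only: tokens are pre-split strings and operands are plain names.

def parse_sub(tokens):
    """Loop form, like  expr = term $('-':SUB term !2);  gives a left handed tree."""
    tree = tokens[0]
    i = 1
    while i + 1 < len(tokens) and tokens[i] == '-':
        tree = ('SUB', tree, tokens[i + 1])   # the tree so far becomes the left branch
        i += 2
    return tree

def parse_exp(tokens, i=0):
    """Recursive form, like  factor = id ('^' factor :EXP!2 | .EMPTY);  right handed."""
    left = tokens[i]
    if i + 1 < len(tokens) and tokens[i + 1] == '^':
        return ('EXP', left, parse_exp(tokens, i + 2))   # the rest becomes the right branch
    return left

print(parse_sub(['a', '-', 'b', '-', 'c']))   # ('SUB', ('SUB', 'a', 'b'), 'c')
print(parse_exp(['a', '^', 'b', '^', 'c']))   # ('EXP', 'a', ('EXP', 'b', 'c'))
```

    So a - b - c groups as (a - b) - c, while a ^ b ^ c groups as a ^ (b ^ c), purely as a consequence of how the formula is written.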
    
