Compiler Construction: Explicit Parse Trees

Question

How can a compiler do without constructing an explicit parse tree? What are the benefits and drawbacks of explicit parse tree construction?

I know that a compiler can avoid building an explicit parse tree by using syntax-directed translation (SDT) and running the associated semantic actions during parsing. But I want to know the benefits and drawbacks of explicit parse tree construction.

Answer 1:

I'm a bit of a noob, so please be patient with me... thanks...

But to answer your question, recursive-descent compilation (without a parse tree) can only be done in the simplest cases, where there are no forward references and a symbol is only valid from the point of its declaration rather than throughout its entire scope.
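To make the "no parse tree" case concrete, here is a minimal sketch of a one-pass recursive-descent translator for arithmetic expressions. It emits stack-machine instructions from inside the parsing methods and never builds a tree. The class name OnePassTranslator, the instruction names (PUSH, ADD, and so on) and the tiny grammar are all made up for illustration; they are not from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical one-pass translator: every semantic action runs during parsing.
class OnePassTranslator {
    private final String src;
    private int pos = 0;
    private final List<String> code = new ArrayList<>();

    private OnePassTranslator(String src) {
        this.src = src.replace(" ", "");   // this toy scanner just drops spaces
    }

    // Translates an infix expression into postfix stack-machine code.
    static List<String> translate(String src) {
        OnePassTranslator t = new OnePassTranslator(src);
        t.expr();
        if (t.pos != t.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + t.pos);
        }
        return t.code;
    }

    // expr ::= term { ("+" | "-") term }
    private void expr() {
        term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            term();
            code.add(op == '+' ? "ADD" : "SUB");   // emitted immediately, no node built
        }
    }

    // term ::= factor { ("*" | "/") factor }
    private void term() {
        factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            factor();
            code.add(op == '*' ? "MUL" : "DIV");
        }
    }

    // factor ::= number | "(" expr ")"
    private void factor() {
        if (peek() == '(') {
            pos++;
            expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ')' at " + pos);
            }
            pos++;
        } else {
            int start = pos;
            while (Character.isDigit(peek())) pos++;
            if (start == pos) {
                throw new IllegalArgumentException("expected number at " + pos);
            }
            code.add("PUSH " + src.substring(start, pos));
        }
    }

    // Returns the current character, or '\0' at end of input.
    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }
}
```

Calling OnePassTranslator.translate("1+2*3") returns [PUSH 1, PUSH 2, PUSH 3, MUL, ADD]. This works only because each operand is fully known by the time its operator is emitted; nothing in this grammar refers to something declared later.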

Obviously it won't work with a language like Java. If there are forward references, then at least two passes are required, and three passes will be needed if there are overloaded functions on top of that, as there are in Java (or if you know how to do it in fewer than three, then please enlighten us). To this end we build a parse tree, which keeps the whole program in memory so that later passes can revisit it.
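As a rough illustration of why forward references push you toward a tree, here is a sketch with two passes over a toy tree: pass 1 records every declared function name, and pass 2 checks the calls. All of the names here (Node, FuncDecl, Call, TwoPass) are hypothetical and the "language" has only function declarations and calls, so treat it as a sketch and not as how a real Java compiler is organised.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical tree node with one method per pass.
abstract class Node {
    abstract void declare(Set<String> symbols);                        // pass 1
    abstract void resolve(Set<String> symbols, List<String> errors);   // pass 2
}

// A function declaration whose body is a list of child nodes.
class FuncDecl extends Node {
    final String name;
    final List<Node> body = new ArrayList<>();

    FuncDecl(String name, Node... body) {
        this.name = name;
        this.body.addAll(Arrays.asList(body));
    }

    @Override
    void declare(Set<String> symbols) {
        symbols.add(name);
    }

    @Override
    void resolve(Set<String> symbols, List<String> errors) {
        for (Node n : body) n.resolve(symbols, errors);
    }
}

// A call to a function that may be declared later in the source.
class Call extends Node {
    final String target;

    Call(String target) {
        this.target = target;
    }

    @Override
    void declare(Set<String> symbols) {
        // a call declares nothing
    }

    @Override
    void resolve(Set<String> symbols, List<String> errors) {
        if (!symbols.contains(target)) errors.add("undefined: " + target);
    }
}

class TwoPass {
    // Runs both passes over the same tree and returns the errors found.
    static List<String> check(List<Node> program) {
        Set<String> symbols = new HashSet<>();
        List<String> errors = new ArrayList<>();
        for (Node n : program) n.declare(symbols);           // pass 1: collect names
        for (Node n : program) n.resolve(symbols, errors);   // pass 2: check uses
        return errors;
    }
}
```

If main calls helper and helper is declared after main, the call still resolves, because pass 2 only starts once pass 1 has seen every declaration. A one-pass translator would have had to reject that call or patch it up later.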

The simplest parse tree node might look something like this (disclaimer: this is not real code).

package compiler;
import java.util.ArrayList;
import scanner.Token;
import scanner.TokenSet;

// Declared abstract because the build methods below have no bodies here.
abstract class Production
{
   Token leading;      // the first token in the production
   int productionID;   // a unique integer that identifies the production
   ArrayList<Production> childNodes;   // the child nodes of this production
   Production mother;  // mother node (may be null)

   public Production (Token leading, int productionID)
   {
      this.leading      = leading;
      this.productionID = productionID;
      childNodes        = new ArrayList<Production>();
   }

   public void append (Production child)   // add a new child node
   {
      childNodes.add(child);
      child.mother = this;
   }

   public abstract void build1 (TokenSet follow, TokenSet anchor);   // implements pass 1
   // public abstract void build2 ....   (one such method per later pass)

}

But a much stronger approach is to derive a new subclass for each production and declare the child nodes as fields. We can then eliminate the productionID and use instanceof checks instead. You can even implement a symbol interface on any node subclass that defines a symbol and insert the node directly into the symbol table; a production that defines a nested scope can have its own symbol table too (I won't do that here). The point is that this way both the syntactic and the semantic analysis can be integrated into the parse tree structure, and even the final translation too. The only downside is those horrible Java interfaces :lol:

For example, we could declare the node for a Modula-2 header file (a definition module) as:

// I won't bother with imports since this isn't real code

class DefinitionModule extends Production
{
   Identifier name;

   ArrayList<ImportClause> importClauses;
   ArrayList<ExportClause> exportClauses;
   ArrayList<Production>   itemList;   // CONST-, TYPE-, & VAR- declarators & function headers

   public DefinitionModule()   // no more productionID
   {
      super(lastTokenRead());   // always sits on DEFINITION
      importClauses = new ArrayList<ImportClause>();
      exportClauses = new ArrayList<ExportClause>();
      itemList      = new ArrayList<Production>();
   }


   // build()
   //
   // DefinitionModule ::= DEFINITION MODULE Identifier ";" {ImportClause}{ExportClause}{HeaderItem} END Identifier
   //
   // where HeaderItem ::= ConstDeclarator | TypeDeclarator | VarDeclarator | ProcedureHeader.
   // Identifier, ImportClause, & ExportClause below are all derived from
   // Production, above. Passing anchor through to the children unchanged
   // is an assumption; the original calls omitted it.

   public void build (TokenSet follow, TokenSet anchor)
   {
      Scanner.getToken();   // skip the DEFINITION
      Scanner.expectToken(Token.ID_MODULE);    // make sure MODULE is there & then skip it
      name = new Identifier(lastTokenRead());  // create the node before building it
      name.build(new TokenSet(Token.ID_SEMICOLON), anchor);
      Scanner.expectToken(Token.ID_SEMICOLON);
      while (lastTokenRead()==Token.ID_IMPORT || lastTokenRead()==Token.ID_FROM)
      {
         ImportClause ic = new ImportClause(lastTokenRead());
         ic.build(new TokenSet(Token.ID_SEMICOLON), anchor);
         importClauses.add(ic);
         Scanner.expectToken(Token.ID_SEMICOLON);
      }

      while (lastTokenRead()==Token.ID_EXPORT)
      {
         ExportClause xc = new ExportClause(lastTokenRead());
         xc.build(new TokenSet(Token.ID_SEMICOLON), anchor);
         exportClauses.add(xc);
         Scanner.expectToken(Token.ID_SEMICOLON);
      }

      // etc, etc, etc
   }
}

If you do it like that, the compiler ends up structured around the features of the language rather than around the traditional passes of a compiler.
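The symbol-interface idea mentioned earlier (tree nodes that double as symbol table entries, with a separate table per nested scope) could be sketched roughly as follows. Symbol, Scope, VarDeclaration and ProcedureDeclaration are hypothetical names, and the parsing side is left out entirely, so this only shows the shape of the data structure under those assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface implemented by every tree node that defines a name.
interface Symbol {
    String symbolName();
}

// One symbol table per scope, chained to the enclosing scope.
class Scope {
    private final Scope parent;   // null for the outermost scope
    private final Map<String, Symbol> table = new HashMap<>();

    Scope(Scope parent) {
        this.parent = parent;
    }

    // Returns false if the name is already defined in this scope.
    boolean define(Symbol s) {
        return table.putIfAbsent(s.symbolName(), s) == null;
    }

    // Searches this scope first, then each enclosing scope in turn.
    Symbol lookup(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            Symbol found = s.table.get(name);
            if (found != null) return found;
        }
        return null;
    }
}

// A tree node for a variable declaration; the node itself is the symbol.
class VarDeclaration implements Symbol {
    final String name;
    final String typeName;

    VarDeclaration(String name, String typeName) {
        this.name = name;
        this.typeName = typeName;
    }

    @Override
    public String symbolName() {
        return name;
    }
}

// A tree node that is a symbol and also opens a nested scope of its own.
class ProcedureDeclaration implements Symbol {
    final String name;
    final Scope locals;

    ProcedureDeclaration(String name, Scope enclosing) {
        this.name = name;
        this.locals = new Scope(enclosing);
    }

    @Override
    public String symbolName() {
        return name;
    }
}
```

Because the table stores the nodes themselves, a lookup hands back the declaration with everything the parser recorded about it, and later passes can use instanceof to tell a variable from a procedure.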

Good luck...

Source: https://stackoverflow.com/questions/47830814/compiler-construction-explicit-parse-trees

Tags

compiler-construction

parse-tree