How many passes over the code does gcc use?

Question

Specifically for C and C++, how many passes are used by default?
Does this number change depending on the level of optimization used? (it should)
Can it be changed directly?

I looked for this information on http://gcc.gnu.org/, but a Google search restricted to site:http://gcc.gnu.org/ did not turn up anything.

Any pointers to any documentation about this will also be helpful.

By "pass" I mean a pass over the original representation of the source code only, not the multi-pass definition suggested by Wikipedia.

Answer 1:

The "Passes and Files of the Compiler" chapter of the GCC Internals manual might be the closest thing to what you're looking for.

Answer 2:

As others have pointed out, modern compilers make only a single pass at the parsing stage and then multiple passes at later stages over an internal representation (usually trees or some other in-memory graph-like data structure).

Concretely, GCC uses this approach. See: https://gcc.gnu.org/onlinedocs/gccint/Parsing-pass.html#Parsing-pass
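To make the "parse once, then work on the internal representation" idea concrete, here is a minimal sketch in C++. It is not GCC code: the `Node`, `Parser`, `foldConstants` and `render` names and the tiny expression grammar (`+`, `*`, integer literals, single-letter variables) are all made up for illustration. The parser is the only part that touches the text, and its position only moves forward; the two later "passes" walk the tree and never look at the text again.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Toy tree node: a number, a variable, or a binary operator.
struct Node {
    char kind = 'n';                 // 'n' number, 'v' variable, '+' or '*'
    long value = 0;                  // meaningful when kind == 'n'
    char name = 0;                   // meaningful when kind == 'v'
    std::unique_ptr<Node> lhs, rhs;  // meaningful for '+' and '*'
};
using NodePtr = std::unique_ptr<Node>;

// The only code that reads the source text; pos_ only ever moves forward.
class Parser {
public:
    explicit Parser(std::string text) : src_(std::move(text)) {}

    NodePtr parse() {
        NodePtr tree = sum();
        skipSpaces();
        assert(pos_ == src_.size() && "unexpected trailing input");
        return tree;
    }

private:
    std::string src_;
    std::size_t pos_ = 0;

    void skipSpaces() {
        while (pos_ < src_.size() && src_[pos_] == ' ') ++pos_;
    }
    bool eat(char c) {
        skipSpaces();
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    static bool isDigit(char c) {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    }
    static NodePtr binary(char op, NodePtr l, NodePtr r) {
        auto n = std::make_unique<Node>();
        n->kind = op;
        n->lhs = std::move(l);
        n->rhs = std::move(r);
        return n;
    }
    NodePtr atom() {  // integer literal or single-letter variable
        skipSpaces();
        assert(pos_ < src_.size() && "expected a number or a variable");
        auto n = std::make_unique<Node>();
        if (isDigit(src_[pos_])) {
            while (pos_ < src_.size() && isDigit(src_[pos_]))
                n->value = n->value * 10 + (src_[pos_++] - '0');
        } else {
            n->kind = 'v';
            n->name = src_[pos_++];
        }
        return n;
    }
    NodePtr product() {  // atom ('*' atom)*
        NodePtr l = atom();
        while (eat('*')) { NodePtr r = atom(); l = binary('*', std::move(l), std::move(r)); }
        return l;
    }
    NodePtr sum() {  // product ('+' product)*
        NodePtr l = product();
        while (eat('+')) { NodePtr r = product(); l = binary('+', std::move(l), std::move(r)); }
        return l;
    }
};

// Tree pass 1: replace operators whose operands are both constants.
void foldConstants(Node& n) {
    if (n.kind != '+' && n.kind != '*') return;
    foldConstants(*n.lhs);
    foldConstants(*n.rhs);
    if (n.lhs->kind == 'n' && n.rhs->kind == 'n') {
        n.value = (n.kind == '+') ? n.lhs->value + n.rhs->value
                                  : n.lhs->value * n.rhs->value;
        n.kind = 'n';
        n.lhs.reset();
        n.rhs.reset();
    }
}

// Tree pass 2: print the tree back out, fully parenthesized.
std::string render(const Node& n) {
    if (n.kind == 'n') return std::to_string(n.value);
    if (n.kind == 'v') return std::string(1, n.name);
    return "(" + render(*n.lhs) + " " + n.kind + " " + render(*n.rhs) + ")";
}
```

For example, parsing "2 * 3 + x * 4" with `Parser("2 * 3 + x * 4").parse()`, then calling `foldConstants(*tree)` and `render(*tree)`, gives "(6 + (x * 4))". The text was read once; everything after that operated on the tree.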

Answer 3:

I've never heard of a compiler making multiple passes over the textual representation (except if you count the preprocessor as one pass). Even when compilers had multiple passes communicating through files, the files contained an intermediate representation (serialized AST + symbol table).

Assemblers, on the other hand, routinely made two (or more) passes over the source code. Their preprocessors often let you do things specifically on one pass, which allowed some more or less dirty tricks.
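The classic reason for an assembler's second pass is forward jumps: a label may be used before its address is known. Here is a minimal sketch of that idea in C++, assuming a made-up toy syntax ("label:", "jmp <label>", or any other one-word instruction, one per line, each instruction taking one address slot). The `assemble` function and the syntax are hypothetical and do not correspond to any real assembler.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Returns true for lines of the form "name:".
static bool isLabel(const std::string& line) {
    return !line.empty() && line.back() == ':';
}

// Two-pass assembly of the toy syntax described above.
std::vector<std::string> assemble(const std::vector<std::string>& lines) {
    // Pass 1: read the source once just to learn each label's address.
    std::map<std::string, int> labelAddress;
    int address = 0;
    for (const std::string& line : lines) {
        if (line.empty()) continue;
        if (isLabel(line))
            labelAddress[line.substr(0, line.size() - 1)] = address;
        else
            ++address;  // every instruction occupies one slot
    }

    // Pass 2: read the same source again; forward jumps can now be resolved.
    std::vector<std::string> out;
    for (const std::string& line : lines) {
        if (line.empty() || isLabel(line)) continue;
        if (line.rfind("jmp ", 0) == 0) {  // line starts with "jmp "
            auto it = labelAddress.find(line.substr(4));
            assert(it != labelAddress.end() && "jump to undefined label");
            out.push_back("jmp " + std::to_string(it->second));
        } else {
            out.push_back(line);
        }
    }
    return out;
}
```

With the input {"jmp end", "nop", "end:", "nop"}, the first pass records that "end" is at address 2, and the second pass emits "jmp 2", "nop", "nop". Unlike the compiler case, both passes here really do read the original source lines.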

Answer 4:

In gcc there are basically two types of passes: GIMPLE and RTL. gcc 4.6.2 has 207 unique passes in total. Yes, the total number of passes run on a given program depends on the optimization level, and some of these passes are run more than once. If you want to go through these passes, look at the passes.c file in the gcc source code. Path to passes.c in gcc 4.6.2: gcc source -> gcc -> passes.c

Yes, you can change the number of passes by adding your own passes to gcc as a dynamic plugin.
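As a rough picture of why the pass count depends on the optimization level and why one pass can show up more than once, here is a small C++ sketch of a pass schedule. It is only an illustration: the `Pass` struct, the `schedule` function, the pass names and the level thresholds are all invented and are not taken from passes.c.

```cpp
#include <string>
#include <vector>

// One entry in a (hypothetical) fixed pipeline of passes.
struct Pass {
    std::string name;
    int minOptLevel;  // lowest optimization level at which the pass runs
};

// Returns the names of the passes that would run at the given level,
// in pipeline order. A pass listed twice in the pipeline runs twice.
std::vector<std::string> schedule(int optLevel) {
    // Illustrative pipeline only; NOT GCC's real pass list.
    static const std::vector<Pass> pipeline = {
        {"lower", 0},
        {"dead-code", 1},
        {"const-prop", 1},
        {"inline", 2},
        {"dead-code", 1},  // cleanup pass repeated after other transforms
        {"emit", 0},
    };
    std::vector<std::string> run;
    for (const Pass& p : pipeline)
        if (optLevel >= p.minOptLevel) run.push_back(p.name);
    return run;
}
```

In this sketch `schedule(0)` runs 2 passes, `schedule(1)` runs 5 (with "dead-code" appearing twice) and `schedule(2)` runs all 6. A plugin that registers an extra pass would correspond to adding another entry to the pipeline.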

Answer 5:

From what somebody in my compiler design class told me, gcc does a single pass, whereas other compilers, such as the default ones used by Visual Studio, use two passes. This is why you must forward-declare classes in C++ if you use them in a circular fashion.

class B;  // forward declaration: A refers to B before B is defined

class A {
   B* b;
};

class B {
   A* a;
};

C# and other languages do not require this, since the first pass builds the references and the second pass compiles.

But then again, I'm no expert in compilers.

Answer 6:

Exactly one. I don't see any meaningful reason for any modern compiler to make more than one pass over the source code, if by "code" you mean the original textual representation of program's source. The whole point of that single pass is to convert the source code into some internal representation, which will be used for further analysis. That internal representation is no longer required to have any linear structure and/or no longer required to be restricted to sequential processing only, which means that the notion of "making a pass" over it is simply no longer applicable.

If this answer doesn't satisfy you, you should probably provide a more precise explanation of what you define as a "pass" over the source code.

Answer 7:

Your definition of multi-pass seems to be the old one, stemming from the times when (a representation of) the whole program source just didn't fit into the available memory. Those times are gone, and I don't know of a single current multi-pass (old definition) compiler.

In the German Wikipedia entry for Compiler, both definitions are given: http://de.wikipedia.org/wiki/Compiler

Multi-pass compiler (translated from German)

With this type of compiler, the source code is translated into the target code in several steps. In the early days of compiler construction, the translation process was split into several passes mainly because computers often did not have enough capacity to hold the complete compiler and the program being translated in main memory at the same time. Nowadays a multi-pass compiler serves mainly to resolve forward references (declaration of an identifier after its first use) and to carry out expensive optimizations.

Answer 8:

Do you mean passes over the source code? Just once. That's called the "tokenization" or "lexical analysis" phase, or, more broadly, "parsing".

Do you mean phases in the compiler? There are several. The term "pass" is really more of an old assembler concept than a compiler concept these days, and even then it is only used roughly. The term "pass" doesn't have a single definition.

Compilers are broken down into "phases". Read the intro to any compiler textbook. It will explain the phases (there are about a dozen logical phases), and GCC follows the textbook models pretty faithfully. Some phases are typically combined into a single "pass", others are separate "passes".

The pass concept is really not as useful in terms of discussing compilers as the phase concept is.

Source: https://stackoverflow.com/questions/3111859/how-many-passes-over-the-code-does-gcc-use

Tags

c++

gcc

compiler-construction