libclang: missing some statements in the AST?

☆樱花仙子☆ submitted on 2020-01-03 17:09:13

Question


I wrote a test program (parse_ast.c) that parses a C source file (tt.c) to see how libclang works. Its output is the hierarchical structure of the AST.

Here is the test file:

/* tt.c */                                    // line 1
#include <unistd.h>
#include <stdio.h>

typedef ssize_t (*write_fn_t)(int, const void *, size_t);

void indirect_write(write_fn_t write_fn) {    // line 7
    (*write_fn)(1, "indirect call\n", 14);
}

void direct_write() {                         // line 11
    write(1, "direct call\n", 12);            // line 12 missing in the ast?
}

int main() {                                  // line 15
    direct_write();
    indirect_write(write);                    // line 17 missing in the ast?

    return 0;
}

The output shows something like this:

 ...
 ...
 inclusion directive at tt.c (2, 1) to (2, 20)
 inclusion directive at tt.c (3, 1) to (3, 19)
 TypedefDecl at tt.c (5, 1) to (5, 57)
 TypeRef at tt.c (5, 9) to (5, 16)
 ParmDecl at tt.c (5, 31) to (5, 35)
 ParmDecl at tt.c (5, 36) to (5, 49)
 ParmDecl at tt.c (5, 50) to (5, 56)
 FunctionDecl at tt.c (7, 1) to (9, 2)
 ParmDecl at tt.c (7, 21) to (7, 40)
  TypeRef at tt.c (7, 21) to (7, 31)
 CompoundStmt at tt.c (7, 42) to (9, 2)
  CallExpr at tt.c (8, 5) to (8, 42)
   UnexposedExpr at tt.c (8, 5) to (8, 16)
    ParenExpr at tt.c (8, 5) to (8, 16)
     UnaryOperator at tt.c (8, 6) to (8, 15)
      UnexposedExpr at tt.c (8, 7) to (8, 15)
       DeclRefExpr at tt.c (8, 7) to (8, 15)
   IntegerLiteral at tt.c (8, 17) to (8, 18)
   UnexposedExpr at tt.c (8, 20) to (8, 37)
    UnexposedExpr at tt.c (8, 20) to (8, 37)
     StringLiteral at tt.c (8, 20) to (8, 37)
   IntegerLiteral at tt.c (8, 39) to (8, 41)
 FunctionDecl at tt.c (11, 1) to (13, 2)
 CompoundStmt at tt.c (11, 21) to (13, 2)        <- XXX no line 12?
 FunctionDecl at tt.c (15, 1) to (20, 2)
 CompoundStmt at tt.c (15, 12) to (20, 2)
  CallExpr at tt.c (16, 5) to (16, 19)
   UnexposedExpr at tt.c (16, 5) to (16, 17)
    DeclRefExpr at tt.c (16, 5) to (16, 17)      <- XXX no line 17?
  ReturnStmt at tt.c (19, 5) to (19, 13)
   IntegerLiteral at tt.c (19, 12) to (19, 13)

We can see that the three functions (indirect_write at line 7, direct_write at line 11, main at line 15) are there, and most statements can be found in the AST, but I cannot find anything that represents the statements on line 12 and line 17. Does anyone know the reason?

I'm on Debian squeeze (kernel 2.6.32), and I tested with both clang 3.1 and 3.2 (compiled from source).

Here is the program parse_ast.c:

#include <stddef.h>
#include <stdio.h>
#include <clang-c/Index.h>

enum CXChildVisitResult visit_fn(CXCursor cr, CXCursor parent,
        CXClientData client_data) {

    unsigned depth;
    unsigned line, column, offset;
    enum CXCursorKind kind;
    CXSourceRange extent;
    CXSourceLocation start, end;
    CXString kind_spelling, filename;
    CXFile file;

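    // client_data carries the depth; cast via size_t so the pointer is not
    // truncated directly to unsigned on 64-bit targets
    depth = (unsigned)(size_t)client_data;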

    // print cursor kind
    kind = clang_getCursorKind(cr);
    kind_spelling = clang_getCursorKindSpelling(kind);
    fprintf(stdout, "%*s%s at", (int)depth, " ", clang_getCString(kind_spelling));
    clang_disposeString(kind_spelling);

    // get extent
    extent = clang_getCursorExtent(cr);
    start = clang_getRangeStart(extent);
    end = clang_getRangeEnd(extent);

    // print start position
    clang_getExpansionLocation(start, &file, &line, &column, &offset);
    filename = clang_getFileName(file);
    fprintf(stdout, " %s (%u, %u) to", clang_getCString(filename), line,
            column);
    clang_disposeString(filename);

    // print end position
    clang_getExpansionLocation(end, &file, &line, &column, &offset);
    fprintf(stdout, " (%u, %u)\n", line, column);

    // recursive
    clang_visitChildren(cr, visit_fn, (CXClientData)(size_t)(depth + 1));

    return CXChildVisit_Continue;

}

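int main(int argc, const char * const *argv) {
    CXIndex Index = clang_createIndex(0, 0);
    CXTranslationUnit TU = clang_parseTranslationUnit(Index, NULL,
            argv, argc, 0, 0, CXTranslationUnit_DetailedPreprocessingRecord);

    // parsing can fail outright, in which case there is no AST to visit
    if (TU == NULL) {
        fprintf(stderr, "failed to parse translation unit\n");
        clang_disposeIndex(Index);
        return 1;
    }

    clang_visitChildren(clang_getTranslationUnitCursor(TU),
            visit_fn, 0);
    clang_disposeTranslationUnit(TU);
    clang_disposeIndex(Index);

    return 0;
}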
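
The visitor above threads the nesting depth through the opaque client-data pointer. Below is a minimal, libclang-free sketch of that pattern. It assumes a hypothetical `struct node` tree and hypothetical helpers `depth_from_client_data`, `client_data_from_depth` and `walk`, none of which are part of libclang. It casts through `uintptr_t` so that the integer-to-pointer round trip stays clean on 64-bit targets:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for an AST node; this is NOT a libclang type. */
struct node {
    const char *kind;
    const struct node *children;
    size_t child_count;
};

/* Recover the depth that was stored in the opaque pointer. */
static unsigned depth_from_client_data(void *client_data) {
    return (unsigned)(uintptr_t)client_data;
}

/* Store a small integer in an opaque pointer without allocating. */
static void *client_data_from_depth(unsigned depth) {
    return (void *)(uintptr_t)depth;
}

/* Prints one line per node, indented by one space per level,
 * and returns the deepest level reached below (and including) n. */
static unsigned walk(const struct node *n, void *client_data) {
    assert(n != NULL);
    unsigned depth = depth_from_client_data(client_data);
    unsigned deepest = depth;

    printf("%*s%s\n", (int)depth, "", n->kind);
    for (size_t i = 0; i < n->child_count; ++i) {
        unsigned d = walk(&n->children[i], client_data_from_depth(depth + 1));
        if (d > deepest)
            deepest = d;
    }
    return deepest;
}
```

Calling `walk(&root, client_data_from_depth(0))` on a tree built from `struct node` values prints each node kind indented by its depth, much as the libclang visitor does for cursors.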

Update:

The problem was caused by a missing header file, stddef.h. It was answered on the libclang mailing list: http://clang-developers.42468.n3.nabble.com/libclang-missing-some-statements-in-the-AST-td4029641.html


Answer 1:


Check the diagnostics generated by clang_parseTranslationUnit() - even if errors are encountered, an AST is generated, but clearly it can't be guaranteed to be meaningful.

I found that commenting out the #include lines resulted in compilation errors, but an AST was generated which resembled yours (specifically, line 17 was missing).

Replacing the #include lines with typedefs for size_t and ssize_t (as int) resulted in a compilation warning about the implicit declaration of write(), but the AST included line 17.

Hence I assume there's a problem in your header files, which the diagnostics should reveal. Diagnostics can be retrieved with, for example:

for (unsigned I = 0, N = clang_getNumDiagnostics(TU); I != N; ++I) {
    CXDiagnostic Diag = clang_getDiagnostic(TU, I);
    CXString String = clang_formatDiagnostic(Diag, clang_defaultDiagnosticDisplayOptions());
    fprintf(stderr, "%s\n", clang_getCString(String));
    clang_disposeString(String);
    clang_disposeDiagnostic(Diag);  // release the diagnostic once it has been printed
}



Answer 2:


I'm using libclang to parse and optimize C code, but when I parse C source files with your code I can't see the CXCursor_BinaryOperator inside a CompoundStmt.

For example

void OCTS_C_TimerMiliseconds_reset_Timers(OCTS_outC_C_TimerMiliseconds_Timers *outC)
{
    outC->init = kcg_true;
    /* 1 */ OCTS_Sign_INT_reset_Math(&outC->_1_Context_1);
    /* 2 */ OCTS_Sign_INT_reset_Math(&outC->Context_2);
    /* 1 */ OCTS_FallingEdge_reset_Edge(&outC->Context_1);
} 

The result is:

FunctionDecl at s.cpp (10, 1) to (17, 2) OCTS_C_TimerMiliseconds_reset_Timers 
ParmDecl at s.cpp (11, 3) to (11, 44) outC 
 TypeRef at s.cpp (11, 3) to (11, 38) OCTS_outC_C_TimerMiliseconds_Timers 
CompoundStmt at s.cpp (12, 1) to (17, 2)  


Source: https://stackoverflow.com/questions/14250754/libclang-missing-some-statements-in-the-ast
