Pycparser not working on preprocessed code

喜欢而已 submitted on 2019-12-04 19:52:33

The error you're getting is:

pycparser.plyparser.ParseError: /usr/lib/gcc/x86_64-linux-gnu/4.8/include/stdarg.h:40:27: before: __gnuc_va_list

The line indicated as causing the error (stdarg.h:40):

typedef __builtin_va_list __gnuc_va_list;

In gcc, __builtin_va_list is, as its name indicates, built in to the compiler. Consequently, no declaration of that type is necessary (or allowed).

It's pretty common for C compilers to use a symbol-table-based technique to parse typenames, since there are a number of ambiguities in the grammar if you cannot distinguish a typename from another identifier. Such a parser will assume that an undeclared identifier is not a typename, and if __builtin_va_list is not a typename, that typedef is a syntax error.
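To make the symbol-table idea concrete, here is a toy sketch in Python. It is not pycparser's implementation; the function name, the set of base types, and the restriction to the simplest "typedef OLD NEW;" form are all assumptions made for illustration.

```python
# Toy illustration of symbol-table-based typename resolution.
# NOT pycparser's actual code; every name here is hypothetical.

BASE_TYPES = {"void", "char", "short", "int", "long", "float", "double"}


def check_typedefs(declarations, known_types=BASE_TYPES):
    """Process simple 'typedef OLD NEW;' lines in order.

    Returns (types, errors): the resulting set of type names and the list
    of declarations rejected because OLD was not a known type name.
    """
    types = set(known_types)
    errors = []
    for decl in declarations:
        tokens = decl.rstrip(";").split()
        if len(tokens) != 3 or tokens[0] != "typedef":
            continue  # only the simplest typedef form is handled here
        old, new = tokens[1], tokens[2]
        if old in types:
            types.add(new)       # NEW is a typename from this point on
        else:
            errors.append(decl)  # OLD is an ordinary identifier: syntax error
    return types, errors


decls = ["typedef __builtin_va_list __gnuc_va_list;"]

# Without a prior declaration, the typedef from stdarg.h is rejected.
_, errors = check_typedefs(decls)
print(errors)

# Once __builtin_va_list is in the symbol table, the same line is accepted.
_, errors_fixed = check_typedefs(["typedef int __builtin_va_list;"] + decls)
print(errors_fixed)
```

The second call succeeds only because an earlier declaration put __builtin_va_list into the table, which is the situation a parser without knowledge of gcc builtins never reaches on its own.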

So I suppose that the pycparser grammar you're using doesn't know about gcc builtin types (and why should it?).

Your fakelib seems to be including the same header file. That's not surprising since it is hard to fake stdarg.h; although technically a library header, it is part of the small set of headers which must be provided by the compiler even in a freestanding (no standard library) implementation: <float.h>, <iso646.h>, <limits.h>, <stdalign.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, <stdint.h>, and <stdnoreturn.h> (C11 standard, clause 4, paragraph 6). These must be implemented by the compiler because there is no way an external library can know enough about the nature of the compiled code to properly define them.

Depending on what you require from the parsed output, you may be able to work around this for pycparser by including a definition of __builtin_va_list, such as:

typedef struct __builtin_va_list { } __builtin_va_list;

__builtin_va_list is not the only builtin gcc datatype, although you may not run into the other ones. So you might have to iterate this solution a few times until you achieve whatever it is you are trying to achieve.
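One way to iterate on this workaround is to keep a list of builtin names and prepend a placeholder typedef for each one to the preprocessed text before parsing. The sketch below assumes that approach; the helper name with_fake_builtins is hypothetical, and the only builtin listed is the one from the error above. Add others only as the parser actually complains about them.

```python
def with_fake_builtins(source, builtin_names):
    """Return `source` with a placeholder typedef prepended for each name.

    The placeholder uses the same empty-struct form as the workaround above.
    """
    prelude = "".join(
        "typedef struct {0} {{ }} {0};\n".format(name) for name in builtin_names
    )
    return prelude + source


# Stand-in for the contents of a preprocessed file.
preprocessed = "typedef __builtin_va_list __gnuc_va_list;\n"

# Grow this list one name at a time as new parse errors appear.
builtins_seen = ["__builtin_va_list"]

patched = with_fake_builtins(preprocessed, builtins_seen)
print(patched)
```

The patched text can then be handed to the parser in place of the original file contents. Whether an empty placeholder struct is adequate depends on what you need from the resulting AST.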

@rici has already explained the cause of the error, so I'll focus on how to solve it. My answer is taken from the pycparser author's blog: http://eli.thegreenplace.net/2015/on-parsing-c-type-declarations-and-fake-headers

The idea is that pycparser needs to know what anyheader.h contains so that it can properly parse the code. Because actually parsing anyheader.h and all the other headers it transitively includes could be very time-consuming, and is perhaps not required for your task, fake headers can be used instead. A fake anyheader.h contains only the parts of the original that are necessary for parsing: the #defines and the typedefs.

gcc -nostdinc -E -I/home/rg/pycparser-master/utils/fake_libc_include test.c > testPP.c

The above command preprocesses test.c, which includes <stdio.h>, using the fake headers provided with the pycparser package. The -nostdinc flag stops gcc from searching its default system header directories. Now, parsing the preprocessed file, for example with the code below,

import pycparser
pycparser.parse_file('testPP.c')

should work in most cases. If it doesn't, make sure you provide all the dependencies needed for preprocessing. If no fake is provided for some header, you can fake the typedef that causes the error with a #define. For example, to resolve an error caused by __builtin_va_list, you can try defining it as a plain type. Note that an object-like macro is needed here: a function-like definition such as -D'__builtin_va_list(x)=' is never expanded, because the name is used without parentheses.

gcc -nostdinc -E -D__builtin_va_list=int -I/home/rg/pycparser-master/utils/fake_libc_include test.c > testPP.c
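The preprocessing and parsing steps can also be combined from Python, since pycparser.parse_file accepts use_cpp, cpp_path and cpp_args arguments. The following is a minimal sketch, assuming gcc is on the PATH and that the fake headers live at the path used in the commands above (adjust it to your own checkout). The helper names build_cpp_args and parse_with_fake_headers are hypothetical.

```python
def build_cpp_args(fake_include_dir):
    """Build the preprocessor arguments used in the gcc commands above."""
    return ["-E", "-nostdinc", "-I" + fake_include_dir]


def parse_with_fake_headers(c_file, fake_include_dir):
    """Preprocess `c_file` against the fake headers and return pycparser's AST."""
    import pycparser  # third-party package; imported here so the helper above stays usable without it

    return pycparser.parse_file(
        c_file,
        use_cpp=True,      # let pycparser run the preprocessor itself
        cpp_path="gcc",    # assumption: gcc is available on the PATH
        cpp_args=build_cpp_args(fake_include_dir),
    )


# Path taken from the commands above; replace it with your own location.
FAKE_LIBC = "/home/rg/pycparser-master/utils/fake_libc_include"
print(build_cpp_args(FAKE_LIBC))
```

Calling parse_with_fake_headers("test.c", FAKE_LIBC) would then replace the separate gcc step and the intermediate testPP.c file.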
