Easiest way to work with intermediate format

安稳与你 submitted on 2019-12-08 09:56:44

Question


A tool I'm working on needs to take the intermediate format generated by the compiler, add some code to it and then give that modified intermediate code to the backend of the compiler to generate the final code.

After a little research on gcc, I found that the GIMPLE format is easy to understand, but I'm not sure how complex it is to modify GIMPLE code, and I don't know of any way to restart compilation from there other than writing a plugin that adds its own pass. People have also warned me that documentation is scarce and that it is hard to get unstuck when working with gcc.

Another option is to use the LLVM intermediate representation (bitcode). But I have never worked with LLVM, so I don't know how complex my task would be with it. There may be even better options I'm not aware of. Therefore, I just want to know the best option. My preferences are the following.

  • Platform independence
  • Easy to use
  • Well documented
  • More people using it, so more help available

Answer 1:


As you probably already know, MELT is a high-level domain-specific language for extending GCC. With it you can work on GIMPLE and GCC's other internal representations quite easily, including modifying them.

However, extending GCC still takes some work, because the GIMPLE and Tree representations (along with others, e.g. edges) are complex.




Answer 2:


According to your description, LLVM fits the bill perfectly. One of its main aims is to serve as a flexible library and framework for manipulation of IR code. The countless optimization, transformation and analysis "passes" it comes with serve both as a proof and as great examples. IMO LLVM also answers the 4 points you list in your question very well:

  • Platform independence: LLVM runs on the major platforms (Linux, Mac and Windows) and knows how to generate code for many CPU types.
  • Easy to use: IR and compiler backends are a difficult area to hack, but as far as such things go LLVM is a good candidate since it's a relatively new project, well documented, with a very clean code base.
  • Well documented: the project's documentation is extensive, so knock yourself out.
  • More people using it: very active development and usage, with some corporations already heavily invested in it (most notably Apple and Google).
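To make the "take the IR, add code, hand it to the backend" workflow concrete, here is a minimal sketch of how the three stages might be chained with the LLVM command-line tools. It only builds and prints the command lines. The plugin path ./MyPass.so and the pass name my-pass are hypothetical placeholders for a pass you would write yourself, and the opt flags assume a recent LLVM with the new pass manager (older releases load passes differently).

```python
import shlex


def llvm_pipeline(source, plugin, pass_name):
    """Return three command lines: C -> IR, IR -> modified IR, IR -> assembly."""
    stem = source.rsplit(".", 1)[0]
    ir = stem + ".ll"            # textual IR emitted by the front end
    modified = stem + ".mod.ll"  # IR after the custom pass has run
    asm = stem + ".s"            # assembly produced by the back end
    return [
        # Front end: stop after producing human-readable LLVM IR.
        ["clang", "-S", "-emit-llvm", source, "-o", ir],
        # Middle: run a custom pass loaded from a plugin (hypothetical names).
        ["opt", "-load-pass-plugin=" + plugin, "-passes=" + pass_name,
         "-S", ir, "-o", modified],
        # Back end: lower the modified IR to target assembly.
        ["llc", modified, "-o", asm],
    ]


commands = llvm_pipeline("tstamp.c", "./MyPass.so", "my-pass")
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run once the tools and the plugin actually exist; keeping the stages as separate commands makes it easy to inspect the .ll files before and after the pass.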



Answer 3:


This may not be helpful at all, but I wondered about the processing passes of gcc. The abridged (pared down mostly to exec/fork calls) output from strace -f -o gcc.strace gcc -c tstamp.c:

7141  execve("/usr/bin/gcc", ["gcc", "-c", "tstamp.c"], [/* 52 vars */]) = 0
7141  open("/tmp/ccqzaCI4.s", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 3
7141  close(3)                          = 0
7141  vfork( <unfinished ...>
7142  execve("/usr/libexec/gcc/i686-redhat-linux/4.6.1/cc1", ["/usr/libexec/gcc/i686-redhat-lin"..., "-quiet", "tstamp.c", "-quiet", "-dumpbase", "tstamp.c", "-mtune=generic", "-march=i686", "-auxbase", "tstamp", "-o", "/tmp/ccqzaCI4.s"], [/* 55 vars */] <unfinished ...>
7141  <... vfork resumed> )             = 7142
7141  waitpid(7142,  <unfinished ...>
7142  <... execve resumed> )            = 0
7142  open("tstamp.c", O_RDONLY|O_NOCTTY|O_LARGEFILE) = 3
7142  close(3)                          = 0
7142  open("/tmp/ccqzaCI4.s", O_RDWR|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
7142  open("/usr/include/stdio.h", O_RDONLY|O_NOCTTY|O_LARGEFILE) = 4
... (opens and closes every include file)
7142  close(4)                          = 0
7142  close(3)                          = 0
7142  exit_group(0)                     = ?
7141  <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 7142
7141  vfork( <unfinished ...>
7143  execve("/usr/bin/as", ["as", "--32", "-o", "tstamp.o", "/tmp/ccqzaCI4.s"], [/* 55 vars */] <unfinished ...>
7141  <... vfork resumed> )             = 7143
7141  waitpid(7143,  <unfinished ...>
7143  <... execve resumed> )            = 0
7143  unlink("tstamp.o")                = 0
7143  open("tstamp.o", O_RDWR|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
7143  open("/tmp/ccqzaCI4.s", O_RDONLY|O_LARGEFILE) = 4
7143  close(4)                          = 0
7143  close(3)                          = 0
7143  exit_group(0)                     = ?
7141  <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 7143
7141  unlink("/tmp/ccqzaCI4.s")         = 0
7141  exit_group(0)                     = ?

cc1 contains all the actual compilation logic; the gcc driver only launches cc1 and then as. I imagine cc1 is a complex program, especially after seeing the output of:

/usr/libexec/gcc/i686-redhat-linux/4.6.1/cc1 --help

and

/usr/libexec/gcc/i686-redhat-linux/4.6.1/cc1 --help=C
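
The sequence of programs in a trace like the one above can also be pulled out mechanically. The following is a rough sketch, not a general strace parser: it assumes the log format shown here (pid, then the call) and only looks at execve lines. The sample lines are copied from the trace; the function name exec_stages is made up for illustration.

```python
import re

# A few lines copied from the strace output above.
SAMPLE_TRACE = '''\
7141  execve("/usr/bin/gcc", ["gcc", "-c", "tstamp.c"], [/* 52 vars */]) = 0
7141  vfork( <unfinished ...>
7142  execve("/usr/libexec/gcc/i686-redhat-linux/4.6.1/cc1", ["/usr/libexec/gcc/i686-redhat-lin"..., "-quiet", "tstamp.c", "-quiet", "-dumpbase", "tstamp.c", "-mtune=generic", "-march=i686", "-auxbase", "tstamp", "-o", "/tmp/ccqzaCI4.s"], [/* 55 vars */] <unfinished ...>
7141  <... vfork resumed> )             = 7142
7143  execve("/usr/bin/as", ["as", "--32", "-o", "tstamp.o", "/tmp/ccqzaCI4.s"], [/* 55 vars */] <unfinished ...>
7141  exit_group(0)                     = ?
'''

# Match "<pid>  execve("<path>"" at the start of a line.
EXECVE_RE = re.compile(r'^(\d+)\s+execve\("([^"]+)"', re.MULTILINE)


def exec_stages(trace_text):
    """Return (pid, program path) for every execve call, in log order."""
    return [(int(pid), path) for pid, path in EXECVE_RE.findall(trace_text)]


stages = exec_stages(SAMPLE_TRACE)
for pid, path in stages:
    # Print just the program name to show the driver -> compiler -> assembler order.
    print(pid, path.rsplit("/", 1)[-1])
```

On a full log written with strace -f -o gcc.strace, the same function can be applied to the contents of gcc.strace to list every stage the driver spawns.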


Source: https://stackoverflow.com/questions/9117287/easiest-way-to-work-with-intermediate-format
