Mixing PIC and non-PIC objects in a shared library

Posted by 喜你入骨 on 2019-12-03 14:25:21

Forgot I even wrote this question.

Some explanations are in order first:

  • Non-PIC code can be loaded by the OS at any address in memory on [most?] modern OSs. After everything is loaded, the loader runs a fix-up (relocation) pass over the text segment (where the executable code ends up) so that it addresses global variables correctly; to pull this off, the text segment must be writable.
  • PIC code can be loaded once by the OS and shared across multiple users/processes. For the OS to do this, however, the text segment must be read-only, which means no fix-ups. The code is instead compiled to address globals through a Global Offset Table (GOT), which eliminates the need for fix-ups in the text segment.
  • Building a shared object with PIC is strongly encouraged, but it doesn't appear to be strictly necessary. If the OS has to fix up the text segment, it is forced to load it into memory that's marked read-write, which prevents sharing across processes/users.
  • If an executable binary is built /with/ PIC, I don't know what goes wrong under the hood, but I've witnessed a few tools become unstable (mysterious crashes and the like).
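
To make the fix-up discussion above a bit more concrete, here is a minimal sketch of the kind of code it applies to. The file name and the names `call_count` and `bump` are made up for illustration, and the comments describe what GCC typically does on ELF/Linux targets as far as I understand it, so treat them as assumptions rather than a guarantee for every platform.

```cpp
// pic_demo.cc (hypothetical file name)

// A global with external linkage: any code that touches it needs its address.
int call_count = 0;

// Built without -fPIC, the machine code for bump() may contain the address of
// call_count directly. If that object lands in a shared library, the loader
// has to patch the text segment at load time (a text relocation), and some
// targets (x86-64, for example) usually make the linker refuse outright and
// ask you to recompile with -fPIC.
//
// Built with -fPIC, the address is fetched through the GOT instead, so the
// text segment can stay read-only and be shared between processes.
int bump() {
    return ++call_count;
}
```

The source is identical in both cases; only the compiler flags decide whether the access goes through the GOT. Calling `bump()` twice from a small `main` returns 1 and then 2 either way.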

The answers:

  • Mixing PIC/non-PIC, or using PIC in executables, can cause instabilities that are hard to predict and hard to track down. I don't have a technical explanation for why.
    • These include segfaults, bus errors, stack corruption, and probably more besides.
  • Non-PIC in shared objects is probably not going to cause any serious problems, though it can increase RAM use if the library is loaded many times across processes and/or users.

update (4/17)

I've since discovered the cause of some of the crashes I had seen previously. To illustrate:

/*header.h*/
#include <map>
#include <string>
typedef std::map<std::string,std::string> StringMap;
StringMap asdf;  // the offending line: a global *defined* in a header

/*file1.cc*/
#include "header.h"

/*file2.cc*/
#include "header.h"

int main( int argc, char** argv ) {
  for( int ii = 0; ii < argc; ++ii ) {
    asdf[argv[ii]] = argv[ii];
  }

  return 0;
}

... then:

$ g++ file1.cc -shared -fPIC -o libblah1.so
$ g++ file1.cc -shared -fPIC -o libblah2.so
$ g++ file1.cc -shared -fPIC -o libblah3.so
$ g++ file1.cc -shared -fPIC -o libblah4.so
$ g++ file1.cc -shared -fPIC -o libblah5.so

$ g++ -zmuldefs file2.cc -Wl,-{L,R}$(pwd) -lblah{1..5} -o fdsa
#     ^^^^^^^^^
#     This is the evil that made it possible
$ args=(this is the song that never ends);
$ eval ./fdsa $(for i in {1..100}; do echo -n ${args[*]}; done)

That particular example may not end up crashing, but it's basically the situation that existed in that group's code. If it does crash, it will likely be in the destructor, usually with a double-free error.

Many years earlier they had added -zmuldefs to their build to get rid of multiply-defined-symbol errors. The compiler emits code that runs constructors/destructors for global objects. -zmuldefs forces all the definitions to live at the same location in memory, but the constructors/destructors still run once for the exe and once for each library that included the offending header, hence the double-free.
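
The sequence described above can be sketched as a toy model. This is not what the real loader or `std::map` does internally; `FakeMap`, `Report`, and `simulate_modules` are hypothetical names invented for this sketch, and a boolean flag stands in for the map's heap allocation so that the demo counts the bad events instead of actually corrupting the heap.

```cpp
// Toy model: several modules each run a constructor and a destructor
// on ONE shared object, as happens when every definition of `asdf`
// ends up at the same address.

// Hypothetical stand-in for the global map. The flag plays the role of
// "this object currently owns heap memory".
struct FakeMap {
    bool owns_buffer = false;
};

// What the simulation observed.
struct Report {
    int constructions;  // constructor runs on the shared object
    int overwrites;     // constructions that hit an already-live object
    int double_frees;   // destructor runs after the buffer was released
};

// module_count = the executable plus every library that included the header.
Report simulate_modules(int module_count) {
    FakeMap asdf;            // the single merged symbol
    Report r{0, 0, 0};

    // Startup: each module's static initializer constructs `asdf` again.
    for (int i = 0; i < module_count; ++i) {
        if (asdf.owns_buffer) ++r.overwrites;
        asdf.owns_buffer = true;
        ++r.constructions;
    }

    // Exit: each module's registered destructor tears `asdf` down again.
    for (int i = 0; i < module_count; ++i) {
        if (!asdf.owns_buffer) ++r.double_frees;  // a real map would free twice here
        asdf.owns_buffer = false;
    }
    return r;
}
```

For the example build (one executable plus five libraries), `simulate_modules(6)` reports 6 constructions, 5 of them on top of a live object, and 5 destructor runs on an already-released object. The conventional remedy, which the original post does not spell out, is to put only `extern StringMap asdf;` in the header and define `asdf` in exactly one .cc file, which also removes the need for -zmuldefs.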
