Segfault on declaring a variable of type vector<shared_ptr<int>>

Posted by 浪子不回头ぞ on 2019-12-03 04:17:36
sbabbi

Given the point of crash, and the fact that preloading libpthread seems to fix it, I believe that the execution of the two cases diverges at locale_init.cc:315. Here is an extract of the code:

  void
  locale::_S_initialize()
  {
#ifdef __GTHREADS
    if (__gthread_active_p())
      __gthread_once(&_S_once, _S_initialize_once);
#endif
    if (!_S_classic)
      _S_initialize_once();
  }

__gthread_active_p() returns true if your program is linked against pthread; specifically, it checks whether pthread_key_create is available. On my system, __gthread_active_p is defined in "/usr/include/c++/7.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h" as static inline, hence it is a potential source of ODR violation.

Notice that LD_PRELOAD=libpthread.so will always cause __gthread_active_p() to return true.

__gthread_once is another inlined symbol that should always forward to pthread_once.

It's hard to guess what's going on without debugging, but I suspect that you are hitting the true branch of __gthread_active_p() even when it shouldn't, and the program then crashes because there is no pthread_once to call.
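The suspected failure mode can be modelled in isolation. This is a minimal sketch, not libstdc++ code: Outcome, once_fn, fake_once, g_classic and initialize are hypothetical names, and the weak pthread_once reference is replaced by a plain function pointer so that the null case can be reported instead of crashing.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these names exist in libstdc++.
using init_fn = void (*)();
using once_fn = int (*)(init_fn);

enum class Outcome { UsedOnce, UsedFallback, WouldCrash };

static bool g_classic = false;  // plays the role of _S_classic
static int g_init_calls = 0;

void initialize_once() {
    g_classic = true;
    ++g_init_calls;
}

// A working "pthread_once" stand-in: it simply runs the routine.
int fake_once(init_fn routine) {
    routine();
    return 0;
}

// Same shape as locale::_S_initialize() in the extract above. `active`
// models __gthread_active_p(), `once` models the weak pthread_once.
Outcome initialize(bool active, once_fn once) {
    if (active) {
        if (once == nullptr) {
            // The real code has no such check and would jump to address 0.
            return Outcome::WouldCrash;
        }
        once(initialize_once);
    }
    if (!g_classic) {
        initialize_once();
        return Outcome::UsedFallback;
    }
    return Outcome::UsedOnce;
}
```

Only the combination active == true with a null once pointer is dangerous; the single-threaded fallback and the fully linked case both complete normally.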

EDIT: I did some experiments, and the only way I see to get a crash in std::locale::_S_initialize is if __gthread_active_p returns true but pthread_once is not linked in.

libstdc++ does not link directly against pthread, but it imports many of the pthread_xx symbols as weak symbols, which means they can be undefined without causing a linker error.

Obviously linking pthread will make the crash disappear, but if I am right, the main issue is that your libstdc++ thinks that it is inside a multi-threaded executable even if we did not link pthread in.

Now, __gthread_active_p uses __pthread_key_create to decide whether we have threads or not. This is defined in your executable as a weak object (it can be nullptr and still be fine). I am 99% sure that the symbol is there because of shared_ptr (remove it and check nm again to be sure). So, somehow __pthread_key_create gets bound to a valid address, maybe because of that last -lpthread in your linker flags. You can verify this theory by putting a breakpoint at locale_init.cc:315 and checking which branch you take.
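As a complement to the breakpoint, the binding of the weak symbol can also be inspected from inside the program. The following is a rough sketch that assumes GCC or Clang on an ELF platform; binding_state and pthread_key_create_state are hypothetical helper names, and the hand-written declaration of __pthread_key_create is an assumption about its signature, taken to match pthread_key_create.

```cpp
#include <cassert>
#include <cstring>
#include <pthread.h>

// Weak reference: if no loaded library defines the symbol, its address is
// null at run time instead of the link failing. The declaration is written
// by hand here and assumed to match pthread_key_create.
extern "C" int __pthread_key_create(pthread_key_t*, void (*)(void*))
    __attribute__((weak));

// Hypothetical helper: turns a symbol address into a readable label.
inline const char* binding_state(const void* address) {
    return address != nullptr ? "bound" : "unbound";
}

// Reports how the weak reference was resolved in the current process.
inline const char* pthread_key_create_state() {
    return binding_state(
        reinterpret_cast<const void*>(&__pthread_key_create));
}
```

Printing pthread_key_create_state() at startup shows which way the thread check would go in that process. The result depends on the C library and on the link line, so no particular output is claimed here.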

EDIT2:

Summary of the comments: the issue is only reproducible if all of the following hold:

  1. Using ld.gold instead of ld.bfd.
  2. Passing --as-needed.
  3. Forcing a weak definition of __pthread_key_create, in this case via instantiation of std::shared_ptr.
  4. Not linking to pthread, or linking pthread after --as-needed.
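A translation unit that meets condition 3 might look like the sketch below. It is a hypothetical reproducer, not the code from the question: count_shared_ints is an invented name, the build line in the comment is an assumption that covers conditions 1, 2 and 4, and whether it actually crashes depends on the libstdc++ and linker versions in use.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>  // assumed: static stream setup is what reaches locale init
#include <memory>
#include <vector>

// Assumed build line (not taken from the question):
//   g++ -fuse-ld=gold -Wl,--as-needed repro.cpp
// No -pthread or -lpthread is passed, which corresponds to condition 4.

// Instantiating std::shared_ptr is what pulls the weak pthread
// references into the object file (condition 3).
std::size_t count_shared_ints() {
    std::vector<std::shared_ptr<int>> values;
    values.push_back(std::make_shared<int>(42));
    assert(*values.front() == 42);
    return values.size();
}
```

Running nm on the resulting object file should show whether the weak pthread symbols are present, as suggested earlier.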

To answer the questions in the comments:

Why does it use gold by default?

By default it uses /usr/bin/ld, which on most distros is a symlink to either /usr/bin/ld.bfd or /usr/bin/ld.gold. This default can be changed using update-alternatives. I am not sure why in your case it is ld.gold; as far as I understand, RHEL5 ships with ld.bfd as default.

And why does gold not add pthread.so dependency to the binary if it is needed?

Because the definition of what is needed is somewhat murky. man ld says:

--as-needed

--no-as-needed

This option affects ELF DT_NEEDED tags for dynamic libraries mentioned on the command line after the --as-needed option. Normally the linker will add a DT_NEEDED tag for each dynamic library mentioned on the command line, regardless of whether the library is actually needed or not. --as-needed causes a DT_NEEDED tag to only be emitted for a library that at that point in the link satisfies a non-weak undefined symbol reference from a regular object file or, if the library is not found in the DT_NEEDED lists of other needed libraries, a non-weak undefined symbol reference from another needed dynamic library. Object files or libraries appearing on the command line after the library in question do not affect whether the library is seen as needed. This is similar to the rules for extraction of object files from archives. --no-as-needed restores the default behaviour.

Now, according to this bug report, gold is honoring the "non-weak undefined symbol" part, while ld.bfd sees weak symbols as needed. TBH I do not have a full understanding of this, and there is some discussion on that link as to whether this is to be considered an ld.gold bug or a libstdc++ bug.

Why do I need to mention both -pthread and -lpthread? (-pthread is passed by default by our build system, and I've passed -lpthread to make it work when gold is used.)

-pthread and -lpthread do different things (see pthread vs lpthread). It is my understanding that the former should imply the latter.

Regardless, you can probably pass -lpthread only once, but you need to do it before --as-needed, or use --no-as-needed after the last library and before -lpthread.

It is also worth mentioning that I was not able to reproduce this issue on my system (GCC 7.2), even using the gold linker. So I suspect that it has been fixed in a more recent version of libstdc++, which might also explain why it does not segfault if you use the system standard library.

This is likely a problem caused by subtle mismatches between libstdc++ ABIs. GCC 4.9 is not the system compiler on Red Hat Enterprise Linux 5, so it's not quite clear what you are using there (DTS 3?).

The locale implementation is known to be quite sensitive to ABI mismatches. See this thread on the gcc-help list:

Your best bet is to figure out which bits of libstdc++ were linked where, and somehow achieve consistency (either by hiding symbols, or by recompiling things so that they are compatible).

It may also be useful to investigate the hybrid linkage model used for libstdc++ in Red Hat's Developer Toolset (where newer bits are linked statically, but the bulk of the C++ standard library uses the existing system DSO), but the system libstdc++ in Red Hat Enterprise Linux 5 might be too old for that if you need support for current language features.
