Passing reference to STL vector over dll boundary

一个人的身影 2020-12-01 08:02

I have a nice library for managing files that needs to return specific lists of strings. Since the only code I'm ever going to use it with is going to be C++ (and Java but

7 answers
  • 2020-12-01 08:44

    You are probably running into binary compatibility issues. On Windows, if you want to use C++ interfaces between DLLs, you have to make sure that a lot of things are in order, for example:

    • All DLLs involved must be built with the same version of the Visual Studio compiler
    • All DLLs have to link against the same version of the C++ runtime (in most versions of VS this is the Runtime Library setting under Configuration Properties -> C/C++ -> Code Generation in the project properties)
    • Iterator debugging settings have to be the same for all builds (this is part of the reason you can't mix Release and Debug DLLs)

    That's not an exhaustive list by any stretch unfortunately :(
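
    One way to catch such mismatches early is to have each module report the settings it was built with and compare them at startup. The following is only a rough sketch: build_fingerprint and same_build_settings are hypothetical helper names, and the macros checked (_MSC_VER, _ITERATOR_DEBUG_LEVEL, _DEBUG, _DLL) cover just the items in the list above, not every possible incompatibility.

```cpp
#include <string>

// Hypothetical helper: summarizes the build settings that most often have
// to match between an EXE and a DLL. Compile it into both modules (for
// example from a shared header) and compare the results at startup.
inline std::string build_fingerprint() {
    std::string fp;
#if defined(_MSC_VER)
    fp += "msvc-" + std::to_string(_MSC_VER);
#elif defined(__clang__)
    fp += "clang-" + std::to_string(__clang_major__);
#elif defined(__GNUC__)
    fp += "gcc-" + std::to_string(__GNUC__);
#else
    fp += "unknown-compiler";
#endif
#if defined(_ITERATOR_DEBUG_LEVEL)
    // Set by the MSVC standard library headers.
    fp += ";idl=" + std::to_string(_ITERATOR_DEBUG_LEVEL);
#endif
#if defined(_DEBUG)
    fp += ";crt=debug";
#else
    fp += ";crt=release";
#endif
#if defined(_DLL)
    // Defined by MSVC when the dynamic runtime (/MD or /MDd) is selected.
    fp += ";runtime=dynamic";
#endif
    return fp;
}

// Returns true when the fingerprint reported by the other module matches
// the one this translation unit was compiled with.
inline bool same_build_settings(const std::string& other_fingerprint) {
    return other_fingerprint == build_fingerprint();
}
```

    In a real project the DLL would have to hand its fingerprint across the boundary as a plain const char* rather than a std::string, for the same reasons discussed in this thread.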

  • 2020-12-01 08:47

    The vector there uses the default std::allocator, which uses ::operator new for its allocation.

    The problem is, when the vector is used in the DLL's context, it is compiled with that DLL's vector code, which knows about the ::operator new provided by that DLL.

    The code in the EXE will try to use the EXE's ::operator new.

    I bet the reason this works on Mac/Linux and not on Windows is that Windows requires all symbols to be resolved at link time.

    For example, you may have seen Visual Studio give an error like "unresolved external symbol". It means: "You told me a function named foo() exists, but I can't find it anywhere."

    Mac/Linux behaves differently: symbols only have to be resolved at load time. This means you can build a .so with a missing ::operator new, and your program can load that .so and provide its own ::operator new to it, allowing the symbol to be resolved. By default GCC exports all symbols, so ::operator new will be exported by the program and potentially picked up by your .so.

    There is an interesting thing here, where Mac/Linux allows circular dependencies. The program could rely on a symbol that is provided by the .so, and that same .so might rely on a symbol provided by the program. Circular dependencies are a terrible thing and so I really like that the Windows method forces you to not do this.

    But, that said, the real problem is that you are trying to use C++ objects across boundaries. That is definitely a mistake. It will ONLY work if the compiler used in the DLL and the EXE is the same, with the same settings. The 'extern "C"' may attempt to prevent name mangling (not sure what it does for non-C-types like std::vector). But it doesn't change the fact that the other side may have a totally different implementation of std::vector.

    Generally speaking, anything passed across boundaries like that should be a plain old C type. For ints and other simple types, things aren't so difficult. In your case, you probably want to pass an array of char*, which means you still need to be careful about memory management.

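    The DLL/.so should manage its own memory. So the calling code might look like this:

    Foo *bar = nullptr;
    int barCount = 0;
    getFoos(&bar, &barCount);  // the library allocates the array and fills in both outputs
    // use your foos
    releaseFoos(bar);          // the library frees what it allocated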
    

    The drawback is that you will have extra code to convert things to C-sharable types at the boundaries. And sometimes this leaks into your implementation in order to speed up the implementation.

    But the benefit is now people can use any language and any compiler version and any settings to write a DLL for you. And you are more careful about proper memory management and dependencies.

    I know it is extra work. But that is the proper way to do things across boundaries.
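
    For a list of file names, the C-style boundary described in this answer could look roughly like the sketch below. The names get_files, release_files and library_files are made up for illustration, the sample data is invented, and the export macro a real DLL needs is left out. The point is only that the library both allocates and frees the memory, and that nothing but plain C types crosses the boundary.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical data living inside the library.
static const std::vector<std::string>& library_files() {
    static const std::vector<std::string> files = {"a.txt", "b.txt"};
    return files;
}

extern "C" {

// Allocates an array of C strings inside the library and hands it out.
// Returns 0 on success, -1 on bad arguments or allocation failure.
int get_files(char*** out_files, std::size_t* out_count) {
    if (!out_files || !out_count) return -1;
    const std::vector<std::string>& src = library_files();
    // One extra slot keeps the allocation non-empty even for zero files.
    char** arr = static_cast<char**>(std::calloc(src.size() + 1, sizeof(char*)));
    if (!arr) return -1;
    for (std::size_t i = 0; i < src.size(); ++i) {
        arr[i] = static_cast<char*>(std::malloc(src[i].size() + 1));
        if (!arr[i]) {
            // Undo the partial allocation before reporting failure.
            for (std::size_t j = 0; j < i; ++j) std::free(arr[j]);
            std::free(arr);
            return -1;
        }
        std::memcpy(arr[i], src[i].c_str(), src[i].size() + 1);
    }
    *out_files = arr;
    *out_count = src.size();
    return 0;
}

// Frees what get_files returned. Callers use this instead of free() or
// delete so the memory goes back to the heap it came from.
void release_files(char** files, std::size_t count) {
    if (!files) return;
    for (std::size_t i = 0; i < count; ++i) std::free(files[i]);
    std::free(files);
}

}  // extern "C"
```

    The caller copies the strings into its own std::vector<std::string> if it wants one, then calls release_files.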

  • 2020-12-01 08:55

    Everybody seems to be hung up on the infamous DLL-compiler-incompatibility issue here, but I think you are right about this being related to the heap allocations. I suspect what is happening is that the vector (allocated in main exe's heap space) contains strings allocated in the DLL's heap space. When the vector goes out of scope and is deallocated, it's also attempting to deallocate the strings - and all this is happening on the .exe side, which causes the crash.

    I have two instinctive suggestions:

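    1. Wrap each string in a std::unique_ptr. A unique_ptr carries a 'deleter' that deallocates its contents when the unique_ptr goes out of scope. The idea is that when the unique_ptr is created on the DLL side, so is its deleter, so when the vector goes out of scope and the destructors of its contents are called, the strings are deallocated by their DLL-bound deleters and no heap conflict occurs. Note that this only holds if the deleter is a function defined in the DLL: the default std::default_delete is a stateless template that is instantiated in whichever module destroys the pointer, whereas std::shared_ptr captures its deleter at the point of creation.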

      extern "C" FILE_MANAGER_EXPORT void get_all_files(vector<unique_ptr<string>>& files)
      {
          files.clear();
          for (vector<file_struct>::iterator i = file_structs.begin(); i != file_structs.end(); ++i)
          {
              files.push_back(unique_ptr<string>(new string(i->full_path)));
          }
      }
      
    2. Keep the vector on the DLL side and just return a reference to it. You can pass the reference across the DLL boundary:

      vector<string> files;
      
      extern "C" FILE_MANAGER_EXPORT vector<string>& get_all_files()
      {
          files.clear();
          for (vector<file_struct>::iterator i = file_structs.begin(); i != file_structs.end(); ++i)
          {
              files.push_back(i->full_path);
          }
          return files;
      }
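
    For the first suggestion, a deleter that really runs inside the DLL could be written as a plain function stored in the unique_ptr. This is a minimal sketch under assumptions: destroy_string, dll_string_ptr and make_dll_string are hypothetical names, the sample paths are invented, and in a real project destroy_string would be defined in the DLL's .cpp file and exported rather than living in the same file.

```cpp
#include <memory>
#include <string>
#include <vector>

// In a real project this function would be defined in the DLL's .cpp file
// and exported, so the delete always runs inside the DLL.
void destroy_string(std::string* s) { delete s; }

// The deleter is stored in the pointer as a function pointer, so it travels
// with the object instead of being instantiated in the caller's module.
using dll_string_ptr = std::unique_ptr<std::string, void (*)(std::string*)>;

// Creates a string on the DLL side and binds it to the DLL-side deleter.
dll_string_ptr make_dll_string(const std::string& value) {
    return dll_string_ptr(new std::string(value), &destroy_string);
}

// Fills the caller's vector with DLL-owned strings.
void get_all_files(std::vector<dll_string_ptr>& files) {
    files.clear();
    const char* paths[] = {"one.txt", "two.txt"};  // hypothetical sample data
    for (const char* path : paths) {
        files.push_back(make_dll_string(path));
    }
}
```

    Even then, the vector's own buffer is still allocated by whichever module instantiates push_back, and both sides must agree on the layout of std::string and std::vector, so this narrows the problem rather than removing it.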
      

    Semi-related: “Downcasting” unique_ptr<Base> to unique_ptr<Derived> (across DLL boundary):

  • 2020-12-01 08:57

    Your main problem is that passing C++ types across DLL boundaries is difficult. You need the following:

    1. Same compiler
    2. Same standard library
    3. Same settings for exceptions
    4. In Visual C++ you need the same version of the compiler
    5. In Visual C++ you need the same Debug/Release configuration
    6. In Visual C++ you need the same iterator debug level

    And so on

    If that is what you want, I wrote a header-only library called cppcomponents (https://github.com/jbandela/cppcomponents) that provides the easiest way to do it in C++. You need a compiler with strong support for C++11. GCC 4.7.2 or 4.8 will work. Visual C++ 2013 Preview also works.

    I will walk you through using cppcomponents to solve your problem.

    1. git clone https://github.com/jbandela/cppcomponents.git in the directory of your choice. We will refer to the directory where you ran this command as localgit

    2. Create a file called interfaces.h. In this file you will define the interface that can be used across compilers.

    Enter the following

    #include <cppcomponents/cppcomponents.hpp>
    #include <string>
    #include <vector>
    
    using cppcomponents::define_interface;
    using cppcomponents::use;
    using cppcomponents::runtime_class;
    using cppcomponents::use_runtime_class;
    using cppcomponents::implement_runtime_class;
    using cppcomponents::uuid;
    using cppcomponents::object_interfaces;
    
    struct IGetFiles:define_interface<uuid<0x633abf15,0x131e,0x4da8,0x933f,0xc13fbd0416cd>>{
    
        std::vector<std::string> GetFiles();
    
        CPPCOMPONENTS_CONSTRUCT(IGetFiles,GetFiles);
    
    
    };
    
    inline std::string FilesId(){return "Files!Files";}
    typedef runtime_class<FilesId,object_interfaces<IGetFiles>> Files_t;
    typedef use_runtime_class<Files_t> Files;
    

    Next create an implementation. To do this create Files.cpp.

    Add the following code

    #include "interfaces.h"
    
    
    struct ImplementFiles:implement_runtime_class<ImplementFiles,Files_t>{
      std::vector<std::string> GetFiles(){
        std::vector<std::string> ret = {"samplefile1.h", "samplefile2.cpp"};
        return ret;
    
      }
    
      ImplementFiles(){}
    
    
    };
    
    CPPCOMPONENTS_DEFINE_FACTORY();
    

    Finally here is the file to use the above. Create UseFiles.cpp

    Add the following code

    #include "interfaces.h"
    #include <iostream>
    
    int main(){
    
      Files f;
      auto vec_files = f.GetFiles();
      for(auto& name:vec_files){
          std::cout << name << "\n";
        }
    
    }
    

    Now you can compile. Just to show we are compatible across compilers, we will use cl, the Visual C++ compiler, to compile UseFiles.cpp into UseFiles.exe, and MinGW GCC to compile Files.cpp into Files.dll.

    cl /EHsc UseFiles.cpp /I localgit\cppcomponents

    where localgit is the directory in which you ran git clone as described above

    g++ -std=c++11 -shared -o Files.dll Files.cpp -I localgit\cppcomponents

    There is no link step. Just make sure Files.dll and UseFiles.exe are in the same directory.

    Now run the executable with UseFiles

    cppcomponents will also work on Linux. The main change is that when you compile the executable you need to add -ldl to the flags, and when you compile the .so file you need to add -fPIC to the flags.

    If you have further questions, let me know.

  • 2020-12-01 08:59

    My (partial) solution has been to implement all the normally compiler-generated members inside the DLL: explicitly add (implement) the copy constructor, the assignment operator and even the move constructor, depending on your program. This will cause the correct ::operator new to be called (assuming you specify __declspec(dllexport)). Include destructor implementations as well, so the matching deletes are used. Do not put any implementation code in a (DLL) header file. I still get warnings about using non-DLL-interface classes (with STL containers) as a base for DLL-interface classes, but it works. This is with VS2013 RC for native code on, obviously, Windows.

  • 2020-12-01 09:00

    The problem occurs because, with Microsoft's toolchain, a dynamic (shared) library can use a different heap than the main executable, for example when each module links its own copy of the C runtime. Creating a string in the DLL, or updating the vector in a way that causes a reallocation, will then trigger this issue.

    The simplest fix for THIS issue is to change the library to a static lib (not certain how one makes CMake do that), because then all the allocations will occur in the executable and on a single heap. Of course, then you have all of the static library compatibility issues of MS C++, which make your library less attractive.

    The requirements at the top of John Bandela's response are all similar to those for the static library implementation.

    Another solution is to implement the interface in the header (thereby compiled in the application space) and have those methods call pure functions with a C interface provided in the DLL.
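
    A rough sketch of that last approach follows. The fm_file_count and fm_file_path functions are hypothetical names for the C interface the DLL would export; here they are given a stand-in implementation with invented sample paths so the example runs as a single file. The inline get_all_files wrapper is what would live in the header and be compiled into the application, so every std::string and std::vector allocation happens on the application's side.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// ---- C interface the DLL would export (hypothetical names) ----
extern "C" {
std::size_t fm_file_count();
// Copies the path at index into buffer (truncating if needed) and returns
// the full path length without the terminator, so callers can size a buffer.
std::size_t fm_file_path(std::size_t index, char* buffer, std::size_t buffer_size);
}

// ---- Header-only wrapper, compiled as part of the application ----
inline std::vector<std::string> get_all_files() {
    std::vector<std::string> result;
    const std::size_t count = fm_file_count();
    result.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t len = fm_file_path(i, nullptr, 0);  // query the length
        std::vector<char> buffer(len + 1);
        fm_file_path(i, buffer.data(), buffer.size());        // fetch the text
        result.emplace_back(buffer.data(), len);
    }
    return result;
}

// ---- Stand-in for the DLL side so the sketch runs as one file ----
static const char* const kPaths[] = {"docs/readme.txt", "src/main.cpp"};

extern "C" std::size_t fm_file_count() {
    return sizeof(kPaths) / sizeof(kPaths[0]);
}

extern "C" std::size_t fm_file_path(std::size_t index, char* buffer,
                                    std::size_t buffer_size) {
    if (index >= fm_file_count()) return 0;
    const std::size_t len = std::strlen(kPaths[index]);
    if (buffer && buffer_size > 0) {
        const std::size_t n = len < buffer_size - 1 ? len : buffer_size - 1;
        std::memcpy(buffer, kPaths[index], n);
        buffer[n] = '\0';
    }
    return len;
}
```

    Because the caller supplies the buffer, the DLL never allocates memory that the application has to free.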
