I have a nice library for managing files that needs to return specific lists of strings. Since the only code I'm ever going to use it with is going to be C++ (and Java but
You are probably running into binary compatibility issues. On Windows, if you want to use C++ interfaces between DLLs, you have to make sure that a lot of things are in order, for example:
That's not an exhaustive list by any stretch unfortunately :(
The vector there uses the default std::allocator, which uses ::operator new for its allocation.
The problem is, when the vector is used in the DLL's context, it is compiled with that DLL's vector code, which knows about the ::operator new provided by that DLL.
The code in the EXE will try to use the EXE's ::operator new.
I bet the reason this works on Mac/Linux and not on Windows is that Windows requires all symbols to be resolved at link time, when the DLL or EXE is built.
For example, you may have seen Visual Studio give an error saying something like "Unresolved external symbol." It means "You told me this function named foo() exists, but I can't find it anywhere."
Mac/Linux works differently: there, symbols only need to be resolved at load time. This means you can build a .so with a missing ::operator new, and your program can load that .so and provide its own ::operator new to it, allowing the symbol to be resolved. By default, all symbols are exported in GCC, so ::operator new will be exported by the program and potentially picked up by your .so.
There is an interesting thing here, where Mac/Linux allows circular dependencies. The program could rely on a symbol that is provided by the .so, and that same .so might rely on a symbol provided by the program. Circular dependencies are a terrible thing and so I really like that the Windows method forces you to not do this.
But, that said, the real problem is that you are trying to use C++ objects across boundaries. That is definitely a mistake. It will ONLY work if the compiler used in the DLL and the EXE is the same, with the same settings. The 'extern "C"' may attempt to prevent name mangling (not sure what it does for non-C-types like std::vector). But it doesn't change the fact that the other side may have a totally different implementation of std::vector.
Generally speaking, if it is passed across boundaries like that, you want it to be in a plain old C type. If it is things like ints and simple types, things aren't so difficult. In your case, you probably want to pass an array of char*. Which means you still need to be careful about memory management.
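As a rough illustration of what "plain old C type" means here, the standard type traits can flag types that are clearly unsuitable for the boundary. This is only a sketch: FileEntry and is_boundary_safe are made-up names, and passing the check is necessary rather than sufficient (it says nothing about calling conventions, struct packing, or who frees the memory).

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical boundary struct: only C types, no owning members.
struct FileEntry {
    const char* full_path;  // points at memory owned by the library
    int         size_kb;
};

// True when T has a C-compatible layout and can be copied bytewise.
template <typename T>
constexpr bool is_boundary_safe() {
    return std::is_standard_layout<T>::value &&
           std::is_trivially_copyable<T>::value;
}

static_assert(is_boundary_safe<int>(), "ints are fine");
static_assert(is_boundary_safe<FileEntry>(), "plain structs are fine");
static_assert(!is_boundary_safe<std::string>(),
              "std::string owns heap memory");
static_assert(!is_boundary_safe<std::vector<std::string> >(),
              "std::vector owns heap memory");
```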
The DLL/.so should manage its own memory. So the function might be like this:
Foo *bar = nullptr;
int barCount = 0;
getFoos(&bar, &barCount); // pass the address of bar so the library can fill it in
// use your foos
releaseFoos(bar);
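For the list-of-strings case, the library side of such a get/release pair might look roughly like the following. The names get_file_list, release_file_list and internal_paths are hypothetical, and the hard-coded paths only stand in for whatever the library tracks internally. The point is that every allocation and every free happens inside the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the library's internal data (hypothetical contents).
static const std::vector<std::string>& internal_paths() {
    static const std::vector<std::string> paths = {"a.txt", "dir/b.cpp"};
    return paths;
}

extern "C" {

// Frees a list returned by get_file_list. It runs inside the library,
// so it pairs with the malloc calls made there.
void release_file_list(char** list, int count) {
    assert(count >= 0);
    if (!list) return;
    for (int i = 0; i < count; ++i) std::free(list[i]);
    std::free(list);
}

// Fills *out_list with a library-owned array of NUL-terminated strings.
// Returns 0 on success, 1 on failure.
int get_file_list(char*** out_list, int* out_count) {
    if (!out_list || !out_count) return 1;
    const std::vector<std::string>& paths = internal_paths();
    const std::size_t n = paths.size();
    char** list =
        static_cast<char**>(std::malloc((n ? n : 1) * sizeof(char*)));
    if (!list) return 1;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t bytes = paths[i].size() + 1;  // include the NUL
        list[i] = static_cast<char*>(std::malloc(bytes));
        if (!list[i]) {
            // Release the entries built so far, then report failure.
            release_file_list(list, static_cast<int>(i));
            return 1;
        }
        std::memcpy(list[i], paths[i].c_str(), bytes);
    }
    *out_list = list;
    *out_count = static_cast<int>(n);
    return 0;
}

}  // extern "C"
```

The caller copies whatever it needs into its own std::string objects and then hands the list back with release_file_list. A production version would also want to catch C++ exceptions before they escape an extern "C" function.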
The drawback is that you will have extra code to convert things to C-sharable types at the boundaries. And sometimes this leaks into your implementation in order to speed up the implementation.
But the benefit is now people can use any language and any compiler version and any settings to write a DLL for you. And you are more careful about proper memory management and dependencies.
I know it is extra work. But that is the proper way to do things across boundaries.
Everybody seems to be hung up on the infamous DLL-compiler-incompatibility issue here, but I think you are right about this being related to the heap allocations. I suspect what is happening is that the vector (allocated in main exe's heap space) contains strings allocated in the DLL's heap space. When the vector goes out of scope and is deallocated, it's also attempting to deallocate the strings - and all this is happening on the .exe side, which causes the crash.
I have two instinctive suggestions:
Wrap each string in a std::unique_ptr. It includes a 'deleter' which handles the deallocation of its contents when the unique_ptr goes out of scope. When the unique_ptr is created on the DLL side, so is its deleter. So when the vector goes out of scope and the destructors of its contents are called, the strings will be deallocated by their DLL-bound deleters and no heap conflict occurs.
extern "C" FILE_MANAGER_EXPORT void get_all_files(vector<unique_ptr<string>>& files)
{
files.clear();
for (vector<file_struct>::iterator i = file_structs.begin(); i != file_structs.end(); ++i)
{
files.push_back(unique_ptr<string>(new string(i->full_path)));
}
}
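One caveat worth keeping in mind: the default deleter, std::default_delete, is a template, so it is typically compiled into whichever module runs the destructor. A way to tie the deleter to the DLL explicitly is to store a function pointer as the deleter. The sketch below assumes hypothetical create_path and destroy_path functions that would both be exported from the DLL; the counter exists only so the behaviour can be checked. It still shares std::string itself across the boundary, so the same-compiler caveats from the other answers apply.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Live-object counter, present only so the sketch can be checked.
static int g_live_strings = 0;

// Library side (hypothetical names). In a real DLL both functions would
// be exported, so that new and delete always run in the same module.
std::string* create_path(const char* text) {
    assert(text != nullptr);
    std::string* p = new std::string(text);
    ++g_live_strings;
    return p;
}

void destroy_path(std::string* p) {
    if (!p) return;
    delete p;
    --g_live_strings;
}

// Caller side: the deleter is a function pointer stored in the unique_ptr,
// so destruction calls back into the module that allocated the string.
using PathPtr = std::unique_ptr<std::string, void (*)(std::string*)>;

inline PathPtr make_path(const char* text) {
    return PathPtr(create_path(text), &destroy_path);
}
```

A vector<PathPtr> can then be filled on either side, and each element is released through destroy_path when the vector is destroyed.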
Keep the vector on the DLL side and just return a reference to it. You can pass the reference across the DLL boundary:
vector<string> files;
extern "C" FILE_MANAGER_EXPORT vector<string>& get_all_files()
{
files.clear();
for (vector<file_struct>::iterator i = file_structs.begin(); i != file_structs.end(); ++i)
{
files.push_back(i->full_path);
}
return files;
}
Semi-related: “Downcasting” unique_ptr<Base> to unique_ptr<Derived> (across DLL boundary):
Your main problem is that passing C++ types across DLL boundaries is difficult. You need the following
And so on
If that is what you want, I wrote a header-only library called cppcomponents https://github.com/jbandela/cppcomponents that provides the easiest way to do it in C++. You need a compiler with strong support for C++11. Gcc 4.7.2 or 4.8 will work. Visual C++ 2013 preview also works.
I will walk you through using cppcomponents to solve your problem.
git clone https://github.com/jbandela/cppcomponents.git
in the directory of your choice. We will refer to the directory where you ran this command as localgit
Create a file called interfaces.hpp. In this file you will define the interface that can be used across compilers.
Enter the following
#include <cppcomponents/cppcomponents.hpp>
using cppcomponents::define_interface;
using cppcomponents::use;
using cppcomponents::runtime_class;
using cppcomponents::use_runtime_class;
using cppcomponents::implement_runtime_class;
using cppcomponents::uuid;
using cppcomponents::object_interfaces;
struct IGetFiles:define_interface<uuid<0x633abf15,0x131e,0x4da8,0x933f,0xc13fbd0416cd>>{
std::vector<std::string> GetFiles();
CPPCOMPONENTS_CONSTRUCT(IGetFiles,GetFiles);
};
inline std::string FilesId(){return "Files!Files";}
typedef runtime_class<FilesId,object_interfaces<IGetFiles>> Files_t;
typedef use_runtime_class<Files_t> Files;
Next create an implementation. To do this create Files.cpp.
Add the following code
#include "interfaces.hpp"
struct ImplementFiles:implement_runtime_class<ImplementFiles,Files_t>{
std::vector<std::string> GetFiles(){
std::vector<std::string> ret = {"samplefile1.h", "samplefile2.cpp"};
return ret;
}
ImplementFiles(){}
};
CPPCOMPONENTS_DEFINE_FACTORY();
Finally here is the file to use the above. Create UseFiles.cpp
Add the following code
#include "interfaces.hpp"
#include <iostream>
int main(){
Files f;
auto vec_files = f.GetFiles();
for(auto& name:vec_files){
std::cout << name << "\n";
}
}
Now you can compile. Just to show we are compatible across compilers, we will use cl, the Visual C++ compiler, to compile UseFiles.cpp into UseFiles.exe. We will use MinGW GCC to compile Files.cpp into Files.dll.
cl /EHsc UseFiles.cpp /I localgit\cppcomponents
where localgit is the directory in which you ran git clone as described above.
g++ -std=c++11 -shared -o Files.dll Files.cpp -I localgit\cppcomponents
There is no link step. Just make sure Files.dll and UseFiles.exe are in the same directory.
Now run the executable with UseFiles
cppcomponents will also work on Linux. The main change is that when you compile the exe you need to add -ldl to the flags, and when you compile the .so file you need to add -fPIC to the flags.
If you have further questions, let me know.
My (partial) solution has been to implement all default constructors inside the DLL, that is, to explicitly add (implement) the copy constructor, the assignment operator and even the move constructor, depending on your program. This will cause the correct ::new to be called (assuming you specify __declspec(dllexport)). Include destructor implementations as well, for matching deletes. Do not include any implementation code in a (DLL) header file. I still get warnings about using non-DLL-interfaced classes (with STL containers) as a base for DLL-interfaced classes, but it works. This is using VS2013 RC for native code on, obviously, Windows.
The problem occurs because a dynamic (shared) library built with the Microsoft toolchain can use a different heap than the main executable, for example when each module links its own copy of the C runtime. Creating a string in the DLL, or updating the vector in a way that causes a reallocation, will trigger this issue.
The simplest fix for THIS issue is to change the library to a static lib (not certain how one makes CMake do that), because then all the allocations will occur in the executable and on a single heap. Of course, you then have all of the static library compatibility issues of MS C++, which make your library less attractive.
The requirements at the top of John Bandela's response are all similar to those for the static library implementation.
Another solution is to implement the interface in the header (thereby compiled in the application space) and have those methods call pure functions with a C interface provided in the DLL.
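A minimal sketch of that last approach, under stated assumptions: fm_file_count and fm_file_path are hypothetical C functions that the DLL would export, and get_all_files is the inline wrapper that would live in the header. The stand-in implementation at the bottom exists only so the example links on its own. Because the caller supplies the buffer, every std::string and std::vector allocation happens in the application.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// ---- C interface exported by the DLL (hypothetical names) ----
extern "C" {
int fm_file_count();
// Copies path number `index` into `buffer` if it fits. Returns the path
// length (excluding the NUL), or -1 if `index` is out of range.
int fm_file_path(int index, char* buffer, int buffer_size);
}

// ---- header-only wrapper, compiled into the application ----
inline std::vector<std::string> get_all_files() {
    std::vector<std::string> files;
    const int count = fm_file_count();
    for (int i = 0; i < count; ++i) {
        const int length = fm_file_path(i, nullptr, 0);  // query the length
        if (length < 0) continue;
        std::vector<char> buffer(static_cast<std::size_t>(length) + 1);
        const int written =
            fm_file_path(i, buffer.data(), static_cast<int>(buffer.size()));
        assert(written == length);
        (void)written;  // silence the unused warning when NDEBUG is set
        files.emplace_back(buffer.data(), static_cast<std::size_t>(length));
    }
    return files;
}

// ---- stand-in for the DLL side, so the sketch links on its own ----
static const char* const k_paths[] = {"a.txt", "dir/b.cpp"};

extern "C" int fm_file_count() { return 2; }

extern "C" int fm_file_path(int index, char* buffer, int buffer_size) {
    if (index < 0 || index >= fm_file_count()) return -1;
    const int length = static_cast<int>(std::strlen(k_paths[index]));
    if (buffer && buffer_size > length) {
        std::memcpy(buffer, k_paths[index],
                    static_cast<std::size_t>(length) + 1);
    }
    return length;
}
```

The application keeps its familiar vector<string> interface, while only ints and char buffers cross the DLL boundary.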