Question
I need to dynamically open a shared library lib.so if a specific condition is met at runtime. The library contains ~700 functions and I need to load all their symbols.
A simple solution is to define the function pointers to all symbols contained in lib.so, load the library using dlopen and finally get the addresses of all symbols using dlsym. However, given the number of functions, the code implementing this solution is very cumbersome.
I was wondering if a more elegant and concise solution exists, maybe with an appropriate use of macros for defining the function pointers. Thanks!
Answer 1:
You could automatically generate trampoline functions for all symbols in the dlopen-ed library. Trampolines would be seen as normal functions in the application but would internally redirect to the real code in the library. Here's a simple 5-minute PoC:
$ cat lib.h
// Dynamic library header
#ifndef LIB_H
#define LIB_H
extern void foo(int);
extern void bar(int);
extern void baz(int);
#endif
$ cat lib.c
// Dynamic library implementation
#include <stdio.h>
void foo(int x) {
  printf("Called library foo: %d\n", x);
}
void bar(int x) {
  printf("Called library bar: %d\n", x);
}
void baz(int x) {
  printf("Called library baz: %d\n", x);
}
$ cat main.c
// Main application
#include <dlfcn.h>
#include <stdio.h>
#include <lib.h>
// Should be autogenerated
void *fptrs[100];
void init_trampoline_table(void *h) {
  fptrs[0] = dlsym(h, "foo");
  fptrs[1] = dlsym(h, "bar");
  fptrs[2] = dlsym(h, "baz");
}
int main() {
  void *h = dlopen("./lib.so", RTLD_LAZY);
  // Without this check the trampolines would jump through NULL pointers.
  if (!h) {
    fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return 1;
  }
  init_trampoline_table(h);
  printf("Calling wrappers\n");
  foo(123);
  bar(456);
  baz(789);
  printf("Returned from wrappers\n");
  return 0;
}
$ cat trampolines.S
// Trampoline code (x86-64, 8-byte table slots).
// Should be autogenerated. Each wrapper gets its own index in table.
// RIP-relative addressing keeps this linkable when gcc builds a PIE by default.
// TODO: abort if table wasn't initialized.
  .text
  .globl foo
foo:
  jmp *fptrs(%rip)
  .globl bar
bar:
  jmp *fptrs+8(%rip)
  .globl baz
baz:
  jmp *fptrs+16(%rip)
$ gcc -fPIC -shared -O2 lib.c -o lib.so
$ gcc -I. -O2 main.c trampolines.S -ldl
$ ./a.out
Calling wrappers
Called library foo: 123
Called library bar: 456
Called library baz: 789
Returned from wrappers
Note that application code in main.c is using only local functions (which wrap library functions) and does not have to mess with function pointers at all (apart from initializing the redirection table at startup, which should be autogenerated code anyway).
EDIT: I've created a standalone tool, Implib.so, to automate creation of stub libraries like in the example above. This turned out to be more or less equivalent to the well-known Windows DLL import libraries.
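The "should be autogenerated" parts of the PoC could come from a small generator. The following is only a rough sketch, not how Implib.so works: the helper names emit_trampoline and emit_table_init are made up for illustration, and it assumes x86-64 with 8-byte pointers and RIP-relative addressing (which also links in position-independent executables).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the assembly for one trampoline into buf.
 * Slot `index` of the fptrs table sits at byte offset index * 8 (x86-64). */
int emit_trampoline(char *buf, size_t size, const char *name, size_t index) {
    return snprintf(buf, size,
                    "  .globl %s\n%s:\n  jmp *fptrs+%zu(%%rip)\n",
                    name, name, index * 8);
}

/* Hypothetical helper: write the matching line of init_trampoline_table(). */
int emit_table_init(char *buf, size_t size, const char *name, size_t index) {
    return snprintf(buf, size, "  fptrs[%zu] = dlsym(h, \"%s\");\n",
                    index, name);
}
```

Looping over a list of symbol names (for instance one extracted from the library with nm) and writing both outputs to files would give trampolines.S and the body of init_trampoline_table for all ~700 functions.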
Answer 2:
I need to dynamically open a shared library lib.so if a specific condition is met at runtime. The library contains ~700 functions and I need to load all their symbols.
role of dlopen and dlsym
When you dlopen a library, all the functions defined by that library become available in your virtual address space (because the whole code segment of that library is added to your virtual address space by dlopen calling mmap(2) several times). So dlsym doesn't add (or load) any additional code; it is already there. If your program is running in the process of pid 1234, try cat /proc/1234/maps after the successful dlopen.
What dlsym provides is the ability to get the address of something in that shared library from its name, using some dynamic symbol table in that ELF plugin. If you don't need that, you don't need to call dlsym.
Perhaps you could simply have, in your shared library, a large array of all the relevant functions (available as a global variable in your shared library). Then you'll just need to call dlsym once, for the name of that global variable.
BTW, a constructor function of your plugin (constructor is a function attribute) could instead "register" some functions of that plugin (into some global data structure of your main program; this is how Ocaml dynamic linking works); so it even makes sense to never call dlsym and still be able to use the functions of your plugin.
For a plugin, its constructor functions are called at dlopen time (before dlopen returns!) and its destructor functions are called at dlclose time (before dlclose returns).
repeating calls to dlsym
It is common practice to use dlsym many times. Your main program would declare several variables (or other data, e.g. fields in some struct, array components, etc.) and fill these with dlsym. Calling dlsym a few hundred times is really quick. For example you could declare some global variables
void*p_func_a;
void*p_func_b;
(you'll often declare these as pointers to functions of appropriate, and perhaps different, types; perhaps use typedef to declare signatures)
and you'll load your plugin with
void *plh = dlopen("/usr/lib/myapp/myplugin.so", RTLD_NOW);
if (!plh) {
  fprintf(stderr, "dlopen failure %s\n", dlerror());
  exit(EXIT_FAILURE);
}
then you'll fetch function pointers with
p_func_a = dlsym(plh, "func_a");
if (!p_func_a) {
  fprintf(stderr, "dlsym func_a failure %s\n", dlerror());
  exit(EXIT_FAILURE);
}
p_func_b = dlsym(plh, "func_b");
if (!p_func_b) {
  fprintf(stderr, "dlsym func_b failure %s\n", dlerror());
  exit(EXIT_FAILURE);
}
(of course you could use preprocessor macros to shorten such repetitive code; X-macro tricks are handy.)
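The X-macro trick could look roughly like the sketch below. The list PLUGIN_FUNCS, the signatures given for func_a and func_b, and the helper load_plugin_funcs are assumptions made up for illustration; a real plugin would list its own names and types once, and both the pointer declarations and the dlsym calls are generated from that single list.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical plugin API: X(name, return type, parameter list).
 * The signatures are invented for this sketch. */
#define PLUGIN_FUNCS(X)             \
    X(func_a, int,  (int))          \
    X(func_b, void, (const char *))

/* First expansion: declare one typed function pointer per entry,
 * e.g. static int (*p_func_a)(int); */
#define DECLARE_PTR(name, ret, params) static ret (*p_##name) params;
PLUGIN_FUNCS(DECLARE_PTR)
#undef DECLARE_PTR

/* Second expansion: resolve every entry with dlsym.
 * Returns the number of symbols that could not be found.
 * Converting dlsym's void* to a function pointer relies on POSIX, not ISO C. */
static int load_plugin_funcs(void *handle) {
    int missing = 0;
#define LOAD_PTR(name, ret, params)                          \
    do {                                                     \
        p_##name = (ret (*) params) dlsym(handle, #name);    \
        if (!p_##name) {                                     \
            fprintf(stderr, "dlsym %s failure\n", #name);    \
            missing++;                                       \
        }                                                    \
    } while (0);
    PLUGIN_FUNCS(LOAD_PTR)
#undef LOAD_PTR
    return missing;
}
```

After plh = dlopen(...) succeeds, a single load_plugin_funcs(plh) call fills every pointer, and adding a function to the plugin interface means adding one line to the list.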
Don't be shy about calling dlsym hundreds of times. It is however important to define and document appropriate conventions regarding your plugin (e.g. explain that every plugin should define func_a and func_b, and when they are called by your main program, using p_func_a etc. there). If your conventions require hundreds of different names, it is a bad smell.
agglomerating plugin functions into a data structure
So assume your library defines func_a, func_b, func_c1, ... func_c99, etc. You might have a global array (POSIX allows casting function pointers to void* but the C11 standard does not allow that):
const void* globalarray[] = {
(void*)func_a,
(void*)func_b,
(void*)func_c1,
/// etc
(void*)func_c99,
/// etc
NULL /* final sentinel value */
};
and then you'll need to dlsym only one symbol: globalarray. I don't know if you need or want that. Of course you could use fancier data structures (e.g. mimicking vtables or operation tables).
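On the main-program side, consuming such a table might look like the sketch below. The helper names are invented for illustration, and call_entry assumes (purely as an example) that the entry has the signature void(int); the main program and the plugin have to agree on the real signatures by convention.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Fetch the plugin's table with a single dlsym call.
 * dlsym yields the address of the array, i.e. of its first element.
 * Returns NULL if the plugin does not export globalarray. */
const void *const *load_table(void *handle) {
    return (const void *const *) dlsym(handle, "globalarray");
}

/* Count entries up to (not including) the final NULL sentinel. */
size_t table_length(const void *const *table) {
    size_t n = 0;
    while (table[n] != NULL)
        n++;
    return n;
}

/* Call entry i, ASSUMING it really has the signature void(int).
 * The cast back to a function pointer relies on POSIX, as noted above. */
void call_entry(const void *const *table, size_t i, int arg) {
    void (*fn)(int) = (void (*)(int)) table[i];
    fn(arg);
}
```

The positions in the array become part of the plugin's interface, so reordering entries in the plugin breaks callers just as renaming a symbol would.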
using a constructor function in your plugin
With the constructor approach, and supposing your main program provides some register_plugin_function which does appropriate things (e.g. put the pointer in some global hash table, etc.), we would have in the plugin code a function declared as
static void my_plugin_starter(void) __attribute__((constructor));
static void my_plugin_starter(void) {
  register_plugin_function("func", 0, (void*)func_a);
  register_plugin_function("func", 1, (void*)func_b);
  /// etc...
  register_plugin_function("func", -1, (void*)func_c1);
  /// etc...
}
and with such a constructor, func_a etc. could be static or have restricted visibility. We then don't need any call to dlsym from the main program (which should provide the register_plugin_function function) loading the plugin.
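What register_plugin_function does is up to the main program. A minimal sketch, assuming the (name, index, pointer) parameters used in the calls above and a fixed-size array instead of a hash table, might be as follows; find_plugin_function, the int return value and the capacity are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PLUGIN_FUNCS 1024  /* arbitrary capacity chosen for this sketch */

struct plugin_entry {
    const char *name;  /* not copied: points into the plugin, invalid after dlclose */
    int index;
    void *fn;
};

static struct plugin_entry registry[MAX_PLUGIN_FUNCS];
static size_t registry_len;

/* Called from plugin constructors, i.e. while dlopen is still running.
 * Returns 0 on success, -1 if the table is full. */
int register_plugin_function(const char *name, int index, void *fn) {
    if (registry_len == MAX_PLUGIN_FUNCS)
        return -1;
    registry[registry_len++] = (struct plugin_entry){ name, index, fn };
    return 0;
}

/* Linear lookup by (name, index); returns NULL when nothing matches. */
void *find_plugin_function(const char *name, int index) {
    for (size_t i = 0; i < registry_len; i++)
        if (registry[i].index == index && strcmp(registry[i].name, name) == 0)
            return registry[i].fn;
    return NULL;
}
```

For the plugin to resolve register_plugin_function at dlopen time, the main executable has to export it in its dynamic symbol table (with GNU toolchains, typically by linking with -rdynamic).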
references
Read more carefully dynamic loading and plug-ins and linker wikipages. Read Levine's Linkers and Loaders book. Read elf(5), proc(5), ld-linux(8), dlopen(3), dlsym(3), dladdr(3). Play with objdump(1), nm(1), readelf(1).
Of course read Drepper's How To Write Shared Libraries paper.
BTW, you can call dlopen and then dlsym a great many times. My manydl.c program generates "random" C code, compiles it as a plugin, then dlopen-s and dlsym-s it, and repeats. It demonstrates that (with patience) you can have millions of plugins dlopen-ed in the same process, and you can call dlsym a lot of times.
Source: https://stackoverflow.com/questions/45917816/is-there-an-elegant-way-to-avoid-dlsym-when-using-dlopen-in-c