I found the bottleneck in my Python code and played around with Psyco etc., then decided to write a C/C++ extension for performance.
With the help of swig you almost do
There be dragons here. Don't use SWIG, don't use Boost. For any complicated project, the glue code you have to fill in yourself to make them work quickly becomes unmanageable. If your library exposes a plain C API (no classes), you can just use ctypes. It will be easy and painless, and you won't have to spend hours trawling through the documentation of these labyrinthine wrapper projects trying to find the one tiny note about the feature you need.
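To give an idea of how little glue ctypes needs, here is a minimal sketch. It assumes a POSIX-like system and uses the C standard library's `strlen` as a stand-in for a function from your own library; the wrapper name `c_strlen` is made up for illustration. For your own code you would pass the path of your shared library to `ctypes.CDLL` instead.

```python
import ctypes
import ctypes.util

# Load the C standard library as a stand-in for "your library".
# Assumption: POSIX-like system. If find_library returns None,
# CDLL(None) falls back to the symbols already loaded in the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
# Without this, ctypes assumes int arguments and an int return value.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Hypothetical wrapper: length of a byte string via C's strlen."""
    return libc.strlen(text)


print(c_strlen(b"hello"))
```

Declaring `argtypes` and `restype` is the only per-function work; there is no build step and no generated wrapper code to maintain.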