Intel Integrated Performance Primitives (IPP) is very fast and very mature. Most of its functionality is low-level: linear filters, arithmetic operations, FFT, wavelets, geometric transforms (...). It also contains a few high-level algorithms, e.g. for inpainting or segmentation. It is well documented, too. I would definitely recommend it for commercial development (I'm not sure whether there are open-source licenses).