Check this page for a open source library distributed with Anaconda
https://www.anaconda.com/blog/developer-blog/open-sourcing-anaconda-accelerate/
" Today, we are releasing a two new Numba sub-projects called pyculib and pyculib_sorting, which contain the NVIDIA GPU library Python wrappers and sorting functions from Accelerate. These wrappers work with NumPy arrays and Numba GPU device arrays to provide access to accelerated functions from:
cuBLAS: Linear algebra
cuFFT: Fast Fourier Transform
cuSparse: Sparse matrix operations
cuRand: Random number generation (host functions only)
Sorting: Fast sorting algorithms ported from CUB and ModernGPU
Going forward, the Numba project will take stewardship of pyculib and pyculib_sorting, releasing updates as needed when new Numba releases come out. These projects are BSD-licensed, just like Numba "