numba

Can I perform dynamic cumsum of rows in pandas?

Submitted by 和自甴很熟 on 2019-11-28 00:27:02
If I have the following dataframe, derived like so:

    df = pd.DataFrame(np.random.randint(0, 10, size=(10, 1)))

       0
    0  0
    1  2
    2  8
    3  1
    4  0
    5  0
    6  7
    7  0
    8  2
    9  2

is there an efficient way to cumsum rows with a limit and, each time this limit is reached, to start a new cumsum? After each limit is reached (however many rows that takes), a row is created with the total cumsum. Below I have created an example of a function that does this, but it's very slow, especially when the dataframe becomes very large. I don't like that my function is looping, and I am looking for a way to make it faster (I guess a way without a loop).
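
Not the asker's function, but a hedged sketch of the direction such answers usually take: keep the loop (each step depends on the previous running total, so it cannot easily be vectorised) and compile it with Numba's @njit. The limit value of 5 and the output column name are assumptions for illustration.

    import numpy as np
    import pandas as pd
    from numba import njit

    @njit
    def cumsum_with_limit(values, limit):
        # Running total that resets each time it reaches `limit`; records the
        # total at every reset row, zero elsewhere.
        out = np.zeros(values.shape[0], dtype=np.int64)
        hit = np.zeros(values.shape[0], dtype=np.bool_)
        total = 0
        for i in range(values.shape[0]):
            total += values[i]
            if total >= limit:
                out[i] = total
                hit[i] = True
                total = 0
        return out, hit

    df = pd.DataFrame(np.random.randint(0, 10, size=(10, 1)))
    totals, hit = cumsum_with_limit(df[0].to_numpy(), 5)  # limit=5 is an arbitrary example
    df['reset_total'] = np.where(hit, totals, 0)          # column name is illustrative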

Comparing Python, Numpy, Numba and C++ for matrix multiplication

Submitted by 风流意气都作罢 on 2019-11-27 23:17:42
Question: In a program I am working on, I need to multiply two matrices repeatedly. Because of the size of one of the matrices, this operation takes some time, and I wanted to see which method would be the most efficient. The matrices have dimensions (m x n)*(n x p), where m = n = 3 and 10^5 < p < 10^6. With the exception of NumPy, which I assume works with an optimized algorithm, every test consists of a simple implementation of the matrix multiplication. Below are my various implementations: Python
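
As context for the comparison, here is a hedged sketch (not the asker's code, which is cut off above) of what such a simple implementation typically looks like when compiled with Numba, checked against NumPy's optimized routine; the matrix sizes follow the question.

    import numpy as np
    from numba import njit

    @njit
    def matmul_naive(A, B):
        # Textbook triple-loop (m x n)*(n x p) product.
        m, n = A.shape
        p = B.shape[1]
        C = np.zeros((m, p))
        for i in range(m):
            for k in range(n):
                for j in range(p):
                    C[i, j] += A[i, k] * B[k, j]
        return C

    A = np.random.rand(3, 3)
    B = np.random.rand(3, 10**5)
    assert np.allclose(matmul_naive(A, B), A @ B)  # A @ B is NumPy's optimized routine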

numba guvectorize target='parallel' slower than target='cpu'

Submitted by 杀马特。学长 韩版系。学妹 on 2019-11-27 16:56:44
Question: I've been attempting to optimize a piece of Python code that involves large multi-dimensional array calculations, and I am getting counterintuitive results with Numba. I am running on an MBP (mid 2015, 2.5 GHz quad-core i7), OS X 10.10.5, Python 2.7.11. Consider the following:

    import numpy as np
    from numba import jit, vectorize, guvectorize
    import numexpr as ne
    import timeit

    def add_two_2ds_naive(A, B, res):
        for i in range(A.shape[0]):
            for j in range(B.shape[1]):
                res[i, j] = A[i, j] + B[i, j]

    @jit
    def add
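
To make the comparison concrete, here is a hedged sketch of the two guvectorize variants for the element-wise addition above; the signatures, names and array sizes are assumptions. One known pitfall: when the core dimensions cover the whole arrays, the parallel target has nothing left to split across threads, which is one way it can end up slower than target='cpu'.

    import numpy as np
    from numba import guvectorize

    def _add_core(A, B, res):
        # Same kernel body as add_two_2ds_naive above.
        for i in range(A.shape[0]):
            for j in range(A.shape[1]):
                res[i, j] = A[i, j] + B[i, j]

    sig = ['void(float64[:,:], float64[:,:], float64[:,:])']
    layout = '(m,n),(m,n)->(m,n)'
    add_cpu = guvectorize(sig, layout, target='cpu')(_add_core)
    add_par = guvectorize(sig, layout, target='parallel')(_add_core)

    A = np.random.rand(1000, 1000)
    B = np.random.rand(1000, 1000)
    out = np.empty_like(A)
    add_cpu(A, B, out)  # time these two calls separately to reproduce the comparison
    add_par(A, B, out)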

why can't I get the right sum of 1D array with numba (cuda python)?

Submitted by 扶醉桌前 on 2019-11-27 16:28:59
I am trying to use CUDA Python with Numba. The code calculates the sum of a 1D array as follows, but I don't know how to get a single value result rather than three values. Python 3.5 with Numba + CUDA 8.0:

    import os, sys, time
    import pandas as pd
    import numpy as np
    from numba import cuda, float32

    os.environ['NUMBAPRO_NVVM'] = r'D:\NVIDIA GPU Computing Toolkit\CUDA\v8.0\nvvm\bin\nvvm64_31_0.dll'
    os.environ['NUMBAPRO_LIBDEVICE'] = r'D:\NVIDIA GPU Computing Toolkit\CUDA\v8.0\nvvm\libdevice'

    bpg = (1, 1)
    tpb = (1, 3)

    @cuda.jit
    def calcu_sum(D, T):
        ty = cuda.threadIdx.y
        bh = cuda.blockDim.y
        index_i = ty
        L = len(D)
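
Not the asker's kernel, but a hedged sketch of one common way to get a single value out of a reduction: each thread accumulates a strided partial sum and combines it into one output slot with an atomic add. It assumes a CUDA-capable GPU is available.

    import numpy as np
    from numba import cuda

    @cuda.jit
    def sum_1d(D, T):
        # Each thread sums a strided slice of D, then atomically adds its
        # partial result into T[0], so only one value is produced.
        i = cuda.grid(1)
        stride = cuda.gridsize(1)
        partial = 0.0
        for k in range(i, D.shape[0], stride):
            partial += D[k]
        cuda.atomic.add(T, 0, partial)

    D = np.arange(9, dtype=np.float64)
    T = np.zeros(1, dtype=np.float64)
    sum_1d[1, 32](D, T)   # one block of 32 threads is plenty for this tiny input
    print(T[0])           # 36.0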

How to parallelize this Python for loop when using Numba

Submitted by 自闭症网瘾萝莉.ら on 2019-11-27 16:00:28
Question: I'm using the Anaconda distribution of Python, together with Numba, and I've written the following Python function that multiplies a sparse matrix A (stored in CSR format) by a dense vector x:

    @jit
    def csrMult(x, Adata, Aindices, Aindptr, Ashape):
        numRowsA = Ashape[0]
        Ax = numpy.zeros(numRowsA)
        for i in range(numRowsA):
            Ax_i = 0.0
            for dataIdx in range(Aindptr[i], Aindptr[i+1]):
                j = Aindices[dataIdx]
                Ax_i += Adata[dataIdx] * x[j]
            Ax[i] = Ax_i
        return Ax

Here A is a large scipy sparse
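
Not the accepted answer verbatim, but a hedged sketch of the direction such questions usually go: since each CSR row is computed independently, the outer loop can be distributed across threads with numba.prange under @njit(parallel=True).

    import numpy as np
    from numba import njit, prange

    @njit(parallel=True)
    def csrMult_parallel(x, Adata, Aindices, Aindptr, Ashape):
        # Rows of a CSR matrix are independent, so prange can split the
        # outer loop across threads.
        numRowsA = Ashape[0]
        Ax = np.zeros(numRowsA)
        for i in prange(numRowsA):
            Ax_i = 0.0
            for dataIdx in range(Aindptr[i], Aindptr[i + 1]):
                Ax_i += Adata[dataIdx] * x[Aindices[dataIdx]]
            Ax[i] = Ax_i
        return Ax

    # Usage with a scipy.sparse CSR matrix A and dense vector x:
    #   y = csrMult_parallel(x, A.data, A.indices, A.indptr, A.shape)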

Getting python Numba working on Ubuntu 14.10 or Fedora 21 with python 2.7

Submitted by ̄綄美尐妖づ on 2019-11-27 14:26:35
Question: Recently, I have had a frustrating time getting Python Numba working on Ubuntu or Fedora Linux. The main problem has been the compilation of llvmlite. What do I need to install for it to compile properly?

Answer 1: The versions I got working in the end were numba-0.17.0 (also 0.18.2) and llvmlite-0.2.2 (also 0.4.0). Here are the relevant dependencies and configuration options on Ubuntu and Fedora. For Ubuntu 14.04 (Trusty):

    sudo apt-get install zlib1g zlib1g-dev libedit libedit-dev llvm-3.8
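
Assuming the packages above build and numba/llvmlite install, a minimal Python smoke test confirms the JIT is actually wired up to LLVM (the function itself is arbitrary):

    from numba import jit

    @jit(nopython=True)
    def smoke_test(n):
        # Trivial loop; if this compiles in nopython mode, llvmlite works.
        total = 0
        for i in range(n):
            total += i
        return total

    assert smoke_test(10) == 45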

Achieving Numba's performance with Cython

Submitted by 无人久伴 on 2019-11-27 08:10:11
Question: Usually I'm able to match Numba's performance when using Cython. However, in this example I have failed to do so: Numba is about 4 times faster than my Cython version. Here is the Cython version:

    %%cython -c=-march=native -c=-O3
    cimport numpy as np
    import numpy as np
    cimport cython

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def cy_where(double[::1] df):
        cdef int i
        cdef int n = len(df)
        cdef np.ndarray[dtype=double] output = np.empty(n, dtype=np.float64)
        for i in range(n):
            if df[i]>0
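
For reference, a hedged sketch of the Numba counterpart being benchmarked against; the branch bodies are assumptions, since the excerpt above is cut off at the if statement.

    import numpy as np
    from numba import njit

    @njit
    def nb_where(df):
        # Same loop shape as cy_where above; the two branch values are
        # placeholders for whatever the original question computed.
        n = len(df)
        output = np.empty(n, dtype=np.float64)
        for i in range(n):
            if df[i] > 0:
                output[i] = 2.0 * df[i]
            else:
                output[i] = -df[i]
        return output

    data = np.random.randn(10000)
    nb_where(data)  # first call compiles; time subsequent calls for the comparison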

Matrix inversion without Numpy

Submitted by 杀马特。学长 韩版系。学妹 on 2019-11-27 01:49:35
Question: I want to invert a matrix without using numpy.linalg.inv. The reason is that I am using Numba to speed up the code, but numpy.linalg.inv is not supported, so I am wondering whether I can invert a matrix with 'classic' Python code. With numpy.linalg.inv, an example would look like this:

    import numpy as np
    M = np.array([[1,0,0],[0,1,0],[0,0,1]])
    Minv = np.linalg.inv(M)

Answer 1: Here is a more elegant and scalable solution, imo. It'll work for any nxn matrix, and you may find use for the other
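
Not the answer's code (it is cut off above), but a hedged sketch of a 'classic' Gauss-Jordan inversion written with plain loops so that @njit can compile it:

    import numpy as np
    from numba import njit

    @njit
    def invert(M):
        # Gauss-Jordan elimination with partial pivoting on an augmented
        # [M | I] matrix; no numpy.linalg call, so it compiles in nopython mode.
        n = M.shape[0]
        aug = np.zeros((n, 2 * n))
        for i in range(n):
            for j in range(n):
                aug[i, j] = M[i, j]
            aug[i, n + i] = 1.0
        for col in range(n):
            # Pick the row with the largest entry in this column as the pivot.
            pivot = col
            for r in range(col + 1, n):
                if abs(aug[r, col]) > abs(aug[pivot, col]):
                    pivot = r
            if aug[pivot, col] == 0.0:
                raise ValueError("matrix is singular")
            if pivot != col:
                for j in range(2 * n):
                    aug[col, j], aug[pivot, j] = aug[pivot, j], aug[col, j]
            # Normalise the pivot row, then eliminate the column elsewhere.
            p = aug[col, col]
            for j in range(2 * n):
                aug[col, j] /= p
            for r in range(n):
                if r != col:
                    factor = aug[r, col]
                    for j in range(2 * n):
                        aug[r, j] -= factor * aug[col, j]
        return aug[:, n:]

    M = np.array([[1.0, 2.0], [3.0, 4.0]])
    assert np.allclose(invert(M) @ M, np.eye(2))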

Python numpy: cannot convert datetime64[ns] to datetime64[D] (to use with Numba)

Submitted by 夙愿已清 on 2019-11-27 01:39:25
Question: I want to pass a datetime array to a Numba function (which cannot be vectorised and would otherwise be very slow). I understand Numba supports numpy.datetime64; however, it seems to support datetime64[D] (day precision) but not datetime64[ns] (nanosecond precision). I learnt this the hard way: is it documented? I tried to convert from datetime64[ns] to datetime64[D], but can't seem to find a way! Any ideas? I have summarised my problem with the minimal code below. If you run testdf
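
A hedged sketch of the conversion that usually resolves this: cast the underlying NumPy array, rather than the pandas Series, to day precision before handing it to the Numba function. The example frame below is hypothetical, since the question's testdf is cut off above.

    import numpy as np
    import pandas as pd

    # Hypothetical stand-in for the question's testdf.
    testdf = pd.DataFrame({'ts': pd.date_range('2015-01-01', periods=5, freq='D')})

    ns_array = testdf['ts'].values                 # numpy array, dtype datetime64[ns]
    day_array = ns_array.astype('datetime64[D]')   # dtype datetime64[D], usable from Numba
    print(day_array.dtype)                         # datetime64[D]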