numba

Huge errors trying numba

I'm running into a large batch of errors when using numba. Ironically, the correct result is printed after the errors. I'm using the newest Anaconda Python and installed numba with conda install numba, once on 64-bit Ubuntu 13 with 64-bit Anaconda, and once on 64-bit Windows with a 32-bit Anaconda. The script I'm trying to execute is:

    # -*- coding: utf-8 -*-
    import math
    from numba import autojit

    pi = math.pi

    @autojit
    def sinc(x):
        if x == 0.0:
            return 1.0
        else:
            return math.sin(x*pi)/(pi*x)

    if __name__ == '__main__':
        a = 4.5
        print sinc(a)

and the errors I get start with:

    DEBUG -- translate:361 …
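
autojit was deprecated and later removed from Numba; as a point of comparison, a minimal sketch of the same function under the current @njit decorator (Python 3 and a recent Numba release assumed):

    import math
    from numba import njit  # autojit no longer exists in recent Numba

    PI = math.pi

    @njit
    def sinc(x):
        # normalized sinc: handle the removable singularity at x == 0 explicitly
        if x == 0.0:
            return 1.0
        return math.sin(x * PI) / (PI * x)

    print(sinc(4.5))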

Filtering a NumPy Array: what is the best approach?

Suppose I have a NumPy array arr that I want to filter element-wise, e.g. I want to get only the values below a certain threshold k. There are a couple of approaches, e.g.:

- using generators: np.fromiter((x for x in arr if x < k), dtype=arr.dtype)
- using boolean mask slicing: arr[arr < k]
- using np.where(): arr[np.where(arr < k)]
- using np.nonzero(): arr[np.nonzero(arr < k)]
- using a Cython-based custom implementation(s)
- using a Numba-based custom implementation(s)

Which is the fastest? What about memory efficiency? (EDITED: added np.nonzero() based on @ShadowRanger's comment)

Definitions

Using …
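
For the Numba-based option, a minimal sketch of one possible single-pass implementation (the function name and the preallocate-then-trim strategy are illustrative, not taken from the question):

    import numpy as np
    import numba as nb

    @nb.njit
    def numba_filter(arr, k):
        out = np.empty_like(arr)   # worst case: every element passes
        n = 0
        for x in arr:
            if x < k:
                out[n] = x
                n += 1
        return out[:n].copy()      # trim to the filled prefix

    arr = np.random.rand(1000)
    assert np.array_equal(numba_filter(arr, 0.5), arr[arr < 0.5])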

Numba autojit error on comparing numpy arrays

When I compare two NumPy arrays inside my function, I get an error saying that only length-1 arrays can be converted to Python scalars:

    from numpy.random import rand
    from numba import autojit

    @autojit
    def myFun():
        a = rand(10,1)
        b = rand(10,1)
        idx = a > b
        return idx

    myFun()

The error:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-7-f7b68c0872a3> in <module>()
    ----> 1 myFun()

    /Users/Guest/Library/Enthought…
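
Under the current nopython compiler, elementwise comparison of arrays is supported; a minimal sketch, generating the arrays outside the jitted function and passing them in:

    import numpy as np
    from numba import njit

    @njit
    def my_fun(a, b):
        # elementwise comparison yields a boolean array in nopython mode
        return a > b

    a = np.random.rand(10, 1)
    b = np.random.rand(10, 1)
    idx = my_fun(a, b)
    print(idx.shape, idx.dtype)  # (10, 1) bool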

cProfile adds significant overhead when calling numba jit functions

Compare a pure-Python no-op function with a no-op function decorated with @numba.jit, that is:

    import numba

    @numba.njit
    def boring_numba():
        pass

    def call_numba(x):
        for t in range(x):
            boring_numba()

    def boring_normal():
        pass

    def call_normal(x):
        for t in range(x):
            boring_normal()

If we time this with %timeit, we get the following:

    %timeit call_numba(int(1e7))
    792 ms ± 5.51 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    %timeit call_normal(int(1e7))
    737 ms ± 2.7 ms per loop (mean ± std. …
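
A self-contained sketch of one way to reproduce the comparison under cProfile outside IPython (file names and iteration counts are arbitrary):

    import cProfile
    import pstats
    import numba

    @numba.njit
    def boring_numba():
        pass

    def boring_normal():
        pass

    def loop(fn, n):
        for _ in range(n):
            fn()

    boring_numba()  # warm-up call so compilation happens outside the profiled region

    cProfile.run("loop(boring_numba, 10**6)", "numba.prof")
    cProfile.run("loop(boring_normal, 10**6)", "normal.prof")

    for name in ("numba.prof", "normal.prof"):
        pstats.Stats(name).sort_stats("cumulative").print_stats(5)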

Need help vectorizing code or optimizing

I am trying to do a double integral by first interpolating the data to make a surface. I am using numba to try to speed this process up, but it's just taking too long. Here is my code, with the images needed to run it located here and here.

From an answer: Noting that your code has a quadruple-nested set of for loops, I focused on optimizing the inner pair. Here's the old code:

    for i in xrange(K.shape[0]):
        for j in xrange(K.shape[1]):
            print(i,j)
            '''create an r vector '''
            r=(i*distX,j*distY,z)
            for x in xrange(img.shape[0]):
                for y in xrange(img.shape[1]):
                    '''create a ksi vector, then calculate its …
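
As a rough illustration of the vectorization pattern only (the actual ksi computation is cut off in the excerpt, so a placeholder kernel stands in for it; img, distX, distY and r follow the names used above):

    import numpy as np

    def inner_sum(img, r, distX, distY):
        x = np.arange(img.shape[0])[:, None] * distX   # (nx, 1) x-coordinates
        y = np.arange(img.shape[1])[None, :] * distY   # (1, ny) y-coordinates
        kernel = np.cos(x * r[0] + y * r[1])           # placeholder for the ksi term
        return np.sum(kernel * img)                    # one array op replaces both inner loops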

Coroutines in numba

I'm working on something that requires fast coroutines, and I believe numba could speed up my code. Here's a silly example: a function that squares its input and adds to it the number of times it has been called.

    def make_square_plus_count():
        i = 0
        def square_plus_count(x):
            nonlocal i
            i += 1
            return x**2 + i
        return square_plus_count

You can't JIT this even with nopython=False, presumably because of the nonlocal keyword. But you don't need nonlocal if you use a class instead:

    def make_square_plus_count():
        @numba.jitclass({'i': numba.uint64})
        class State:
            def __init__(self):
                self.i = 0
        state = State()
        @numba…
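
The snippet is cut off; a sketch of one way the jitclass approach might be completed, passing the state object into an njit function rather than closing over it (int64 is used here instead of the question's uint64 to keep the mixed-integer arithmetic simple; in recent releases jitclass lives in numba.experimental):

    import numba
    from numba.experimental import jitclass  # plain numba.jitclass in older releases

    @jitclass([('i', numba.int64)])
    class State:
        def __init__(self):
            self.i = 0

    @numba.njit
    def _square_plus_count(state, x):
        state.i += 1            # mutate the jitclass field instead of a nonlocal
        return x**2 + state.i

    def make_square_plus_count():
        state = State()
        return lambda x: _square_plus_count(state, x)

    f = make_square_plus_count()
    print(f(2), f(2))  # 5 then 6: the counter advances on each call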

How can you implement a C callable from Numba for efficient integration with nquad?

I need to do a 6D numerical integration in Python. Because the scipy.integrate.nquad function is slow, I am currently trying to speed things up by defining the integrand as a scipy.LowLevelCallable with Numba. I was able to do this in 1D with scipy.integrate.quad by replicating the example given here:

    import numpy as np
    from numba import cfunc
    from scipy import integrate

    def integrand(t):
        return np.exp(-t) / t**2

    nb_integrand = cfunc("float64(float64)")(integrand)

    # regular integration
    %timeit integrate.quad(integrand, 1, np.inf)
    10000 loops, best of 3: 128 µs per loop

    # integration …
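
For the multi-dimensional case, nquad accepts a LowLevelCallable with the C signature double f(int n, double *xx); a sketch using Numba's carray to view the argument pointer, with an illustrative Gaussian integrand standing in for the real one:

    import numpy as np
    from numba import cfunc, types, carray
    from scipy import integrate, LowLevelCallable

    # nquad's low-level signature: double f(int n, double *xx)
    c_sig = types.double(types.intc, types.CPointer(types.double))

    @cfunc(c_sig)
    def integrand6d(n, xx):
        x = carray(xx, (n,))           # view the C pointer as a length-n array
        return np.exp(-np.sum(x**2))   # illustrative 6D Gaussian integrand

    llc = LowLevelCallable(integrand6d.ctypes)
    result, err = integrate.nquad(llc, [(0, 1)] * 6)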

How to determine if numba's prange actually works correctly?

In another Q+A ( Can I perform dynamic cumsum of rows in pandas? ) I made a comment regarding the correctness of using prange in this code (from this answer):

    from numba import njit, prange

    @njit
    def dynamic_cumsum(seq, index, max_value):
        cumsum = []
        running = 0
        for i in prange(len(seq)):
            if running > max_value:
                cumsum.append([index[i], running])
                running = 0
            running += seq[i]
        cumsum.append([index[-1], running])
        return cumsum

The comment was: "I wouldn't recommend parallelizing a loop that isn't pure. In this case the running variable makes it impure." There are 4 possible outcomes: (1) numba …
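
For contrast, a minimal sketch of a loop-carried pattern that prange does handle correctly: a pure reduction, which Numba recognizes and parallelizes safely, unlike the reset-style running variable above:

    import numpy as np
    from numba import njit, prange

    @njit(parallel=True)
    def total(seq):
        s = 0.0
        for i in prange(len(seq)):
            s += seq[i]   # recognized as a reduction, so it is safe under prange
        return s

    x = np.random.rand(1_000_000)
    print(np.isclose(total(x), x.sum()))  # True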

Optimize Double loop in python

I am trying to optimize the following loop:

    def numpy(nx, nz, c, rho):
        for ix in range(2, nx-3):
            for iz in range(2, nz-3):
                a[ix, iz] = sum(c*rho[ix-1:ix+3, iz])
                b[ix, iz] = sum(c*rho[ix-2:ix+2, iz])
        return a, b

I tried different solutions and found that using numba to calculate the sum of the products gives the best performance:

    import numpy as np
    import numba as nb
    import time

    @nb.autojit
    def sum_opt(arr1, arr2):
        s = arr1[0]*arr2[0]
        for i in range(1, len(arr1)):
            s += arr1[i]*arr2[i]
        return s

    def numba1(nx, nz, c, rho):
        for ix in range(2, nx-3):
            for iz in range(2, nz-3):
                a[ix, iz] = sum_opt(c, rho…
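
A sketch of an alternative: jit the entire double loop in nopython mode rather than only the inner dot product (a and b are allocated inside instead of coming from globals, and c is assumed to have length 4, matching the 4-element slices above):

    import numpy as np
    import numba as nb

    @nb.njit
    def numba_full(nx, nz, c, rho):
        a = np.zeros((nx, nz))
        b = np.zeros((nx, nz))
        for ix in range(2, nx - 3):
            for iz in range(2, nz - 3):
                sa = 0.0
                sb = 0.0
                for k in range(4):
                    sa += c[k] * rho[ix - 1 + k, iz]
                    sb += c[k] * rho[ix - 2 + k, iz]
                a[ix, iz] = sa
                b[ix, iz] = sb
        return a, b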

Negative Speed Gain Using Numba Vectorize target='cuda'

I am trying to test the effectiveness of the Python Numba module's @vectorize decorator for speeding up a code snippet relevant to my actual code. I'm using a code snippet provided in CUDAcast #10, available here and shown below:

    import numpy as np
    from timeit import default_timer as timer
    from numba import vectorize

    @vectorize(["float32(float32, float32)"], target='cpu')
    def VectorAdd(a,b):
        return a + b

    def main():
        N = 32000000
        A = np.ones(N, dtype=np.float32)
        B = np.ones(N, dtype=np.float32)
        C = np.zeros(N, dtype=np.float32)

        start = timer()
        C = VectorAdd(A, B)
        vectoradd_time = …
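
With target='cuda', host-device transfers usually dominate a kernel as trivial as a + b. A sketch of timing the kernel alone by staging the inputs as device arrays first (assumes a CUDA-capable GPU; to_device, synchronize and copy_to_host are standard Numba CUDA calls):

    import numpy as np
    from timeit import default_timer as timer
    from numba import vectorize, cuda

    @vectorize(["float32(float32, float32)"], target='cuda')
    def VectorAdd(a, b):
        return a + b

    N = 32000000
    A = np.ones(N, dtype=np.float32)
    B = np.ones(N, dtype=np.float32)

    d_A = cuda.to_device(A)        # pay the host->device copies up front
    d_B = cuda.to_device(B)

    start = timer()
    d_C = VectorAdd(d_A, d_B)      # device inputs: kernel launch only, no copies
    cuda.synchronize()             # wait for the kernel before reading the clock
    print("gpu compute:", timer() - start)

    C = d_C.copy_to_host()         # the device->host copy, paid once and separately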