Python: can numba work with arrays of strings in nopython mode?

Submitted by 白昼怎懂夜的黑 on 2019-12-30 19:30:26

Question


I am using pandas 0.16.2, numpy 1.9.2 and numba 0.20.

Is there any way to get numba to support arrays of strings in nopython mode? Alternatively, could I somehow convert strings to numbers which numba would recognise?

I have to run certain loops on an array of strings (a column from a pandas dataframe); if I could use numba the code would be substantially faster.

I have come up with this minimal example to show what I mean:

import numpy as np
import numba

x = np.array(['some', 'text', 'this', 'is'])

@numba.jit(nopython=True)
def numba_str(txt):
    x = 0
    for i in range(txt.size):
        if txt[i] == 'text':
            x += 1
    return x

print(numba_str(x))

The error I get is:

Failed at nopython (nopython frontend)
Undeclared ==([char x 4], str)

Thanks!


Answer 1:


Strings are not yet supported by Numba (as of version 0.20). More precisely, "character sequences are supported, but no operations are available on them".

A possible workaround is to interpret the characters as numbers. For ASCII characters this is straightforward; see the Python ord and chr functions. However, even for your minimal example, you end up with a function that is a lot less readable:

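import numpy as np
import numba

# Fixed-width byte strings ('S4'): one byte per character, so the
# uint8 view below lines up with the ASCII codes.
x = np.array(['some', 'text', 'this', 'is'], dtype='S4')

@numba.jit(nopython=True)
def numba_str(txt):
    x = 0
    for i in range(txt.shape[0]):
        if (txt[i, 0] == 116 and  # 't'
            txt[i, 1] == 101 and  # 'e'
            txt[i, 2] == 120 and  # 'x'
            txt[i, 3] == 116):    # 't'
            x += 1
    return x

# View each 4-byte string as a row of 4 uint8 values.
print(numba_str(x.view(np.uint8).reshape(-1, x.itemsize)))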
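
The hard-coded character codes above can be avoided by encoding the search string the same way as the data and comparing byte by byte. The following is only a sketch of that idea, not something from the Numba documentation: the names encode and count_matches are hypothetical, the strings are assumed to be ASCII and no longer than the chosen width (longer ones would be silently truncated), and the try/except fallback is my own addition so the logic can still be run without Numba installed.

```python
import numpy as np

try:
    import numba
    jit = numba.jit(nopython=True)
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs.
    def jit(func):
        return func


def encode(strings, width):
    """Hypothetical helper: ASCII strings -> 2-D uint8 array, zero-padded."""
    arr = np.array(strings, dtype='S%d' % width)
    return arr.view(np.uint8).reshape(-1, width)


@jit
def count_matches(codes, target):
    """Count rows of `codes` that equal `target` byte for byte."""
    count = 0
    for i in range(codes.shape[0]):
        match = True
        for j in range(codes.shape[1]):
            if codes[i, j] != target[j]:
                match = False
                break
        if match:
            count += 1
    return count


codes = encode(['some', 'text', 'this', 'is'], 4)
target = encode(['text'], 4)[0]   # same width and padding as the data
print(count_matches(codes, target))
```

Because both the data and the target are padded to the same width, shorter strings such as 'is' compare correctly without special handling.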


Source: https://stackoverflow.com/questions/32056337/python-can-numba-work-with-arrays-of-strings-in-nopython-mode
