Element-wise string concatenation in numpy

萌比男神i 2020-11-30 04:04

Is this a bug?

import numpy as np
a1 = np.array(['a', 'b'])
a2 = np.array(['E', 'F'])

In [20]: np.add(a1, a2)
Out[20]: NotImplemented

I am t

5 answers
  • 2020-11-30 04:17

    This can be done using numpy.core.defchararray.add. Here is an example:

    >>> import numpy as np
    >>> a1 = np.array(['a', 'b'])
    >>> a2 = np.array(['E', 'F'])
    >>> np.core.defchararray.add(a1, a2)
    array(['aE', 'bF'], dtype='<U2')
    

    There are other useful string operations available for NumPy data types.
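As a minimal sketch of those other operations: the same element-wise functions are reachable under the shorter `np.char` namespace, which is assumed here in place of the `np.core.defchararray` path used above. The variable names `joined`, `suffixed` and `shouted` are only illustrative.

```python
import numpy as np

a1 = np.array(['a', 'b'])
a2 = np.array(['E', 'F'])

# Element-wise concatenation through the np.char namespace
joined = np.char.add(a1, a2)

# A scalar string broadcasts against every element of the array
suffixed = np.char.add(a1, '_x')

# Other np.char helpers work element-wise in the same way
shouted = np.char.upper(joined)

print(joined.tolist())    # ['aE', 'bF']
print(suffixed.tolist())  # ['a_x', 'b_x']
print(shouted.tolist())   # ['AE', 'BF']
```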

  • 2020-11-30 04:18

    One more basic, elegant and fast solution:

    In [11]: np.array([x1 + x2 for x1,x2 in zip(a1,a2)])
    Out[11]: array(['aE', 'bF'], dtype='<U2')
    

    It is very fast for smaller arrays.

    In [12]: %timeit np.array([x1 + x2 for x1,x2 in zip(a1,a2)])
    3.67 µs ± 136 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    In [13]: %timeit np.core.defchararray.add(a1, a2)
    6.27 µs ± 28.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    In [14]: %timeit np.char.array(a1) + np.char.array(a2)
    22.1 µs ± 319 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    

    For larger arrays, the time difference is small.

    In [15]: b1 = np.full(10000,'a')    
    In [16]: b2 = np.full(10000,'b')    
    
    In [189]: %timeit np.array([x1 + x2 for x1,x2 in zip(b1,b2)])
    6.74 ms ± 66.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    
    In [188]: %timeit np.core.defchararray.add(b1, b2)
    7.03 ms ± 419 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    
    In [187]: %timeit np.char.array(b1) + np.char.array(b2)
    6.97 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
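The timings above come from IPython's `%timeit`. A rough way to repeat the comparison as a plain script might look like the sketch below. It uses the standard `timeit` module, covers only two of the three variants, and uses `np.char.add` in place of the `np.core.defchararray.add` spelling; the helper names are hypothetical. Absolute numbers depend on the machine and the NumPy version, so none are quoted here.

```python
import timeit
import numpy as np

def concat_listcomp(x, y):
    # Python-level loop over pairs, then rebuild an array
    return np.array([s + t for s, t in zip(x, y)])

def concat_char_add(x, y):
    # NumPy's element-wise string concatenation
    return np.char.add(x, y)

b1 = np.full(10000, 'a')
b2 = np.full(10000, 'b')

repeats = 20
for name, fn in [("list comprehension", concat_listcomp),
                 ("np.char.add", concat_char_add)]:
    total = timeit.timeit(lambda: fn(b1, b2), number=repeats)
    print(f"{name}: {total / repeats * 1e3:.3f} ms per call")
```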
    
  • 2020-11-30 04:20

    This can (and should) be done in pure Python, as numpy also uses the Python string manipulation functions internally:

    >>> a1 = ['a','b']
    >>> a2 = ['E','F']
    >>> list(map(''.join, zip(a1, a2)))
    ['aE', 'bF']
    
  • 2020-11-30 04:22

    Another solution is to convert the string arrays into arrays of Python objects so that str.__add__ is called:

    >>> import numpy as np
    >>> a = np.array(['a', 'b', 'c', 'd'], dtype=object)
    >>> a + a
    array(['aa', 'bb', 'cc', 'dd'], dtype=object)
    

    This is not that slow (less than twice as slow as adding integer arrays).
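Applied to the two arrays from the question, the object-dtype approach could be sketched as follows. The cast back to a fixed-width string array at the end is an optional extra step not mentioned in the answer, and the names `obj_sum` and `as_str` are illustrative.

```python
import numpy as np

a1 = np.array(['a', 'b'])
a2 = np.array(['E', 'F'])

# With dtype=object each element is a plain Python str,
# so + falls back to Python's own string concatenation
obj_sum = a1.astype(object) + a2.astype(object)

# Optionally convert back to a fixed-width unicode array
as_str = obj_sum.astype(str)

print(obj_sum.dtype)    # object
print(as_str.tolist())  # ['aE', 'bF']
```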

  • 2020-11-30 04:32

    You can use the chararray subclass to perform array operations with strings:

    a1 = np.char.array(['a', 'b'])
    a2 = np.char.array(['E', 'F'])
    
    a1 + a2
    #chararray(['aE', 'bF'], dtype='|S2')
    

    Another nice example:

    b = np.array([2, 4])
    a1*b
    #chararray(['aa', 'bbbb'], dtype='|S4')
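The repetition example can also be written without the chararray subclass. The sketch below assumes the free function `np.char.multiply` on ordinary string arrays, which should give the same element values; the name `repeated` is illustrative.

```python
import numpy as np

a1 = np.array(['a', 'b'])
b = np.array([2, 4])

# Repeat each string by the matching integer count
repeated = np.char.multiply(a1, b)

print(repeated.tolist())  # ['aa', 'bbbb']
```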
    