Vectorizing operation on numpy array

问题

I have a numpy array containing many three-dimensional numpy arrays, where each of these sub-elements is a grayscale image. I want to use numpy's vectorize to apply an affine transformation to each image in the array.

Here is a minimal example that reproduces the issue:

import cv2
import numpy as np
from functools import partial

# create four blank images
data = np.zeros((4, 1, 96, 96), dtype=np.uint8)

M = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float32) # dummy affine transformation matrix
size = (96, 96) # output image size

Now I want to pass each of the images in data to cv2.warpAffine(src, M, dsize). Before I vectorize it, I first create a partial function that binds M and dsize:

warpAffine = lambda M, size, img : cv2.warpAffine(img, M, size) # re-order function parameters
partialWarpAffine = partial(warpAffine, M, size)

vectorizedWarpAffine = np.vectorize(partialWarpAffine)
print data[:, 0].shape # prints (4, 96, 96)
vectorizedWarpAffine(data[:, 0])

But this outputs:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1573, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1633, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1597, in _get_ufunc_and_otypes
    outputs = func(*inputs)
  File "<stdin>", line 1, in <lambda>
TypeError: src is not a numpy array, neither a scalar

What am I doing wrong - why can't I vectorize an operation on numpy arrays?

回答1:

The problem is that just by using partial it doesn't make the existence of the other arguments go away for the sake of vectorize. The function underlying the partial object will be vectorizedWarpAffine.pyfunc, which will keep track of whatever pre-bound arguments you'd like it to use when calling vectorizedWarpAffine.pyfunc.func (which is still a multi-argumented function).

You can see it like this (after you import inspect):

In [19]: inspect.getargspec(vectorizedWarpAffine.pyfunc.func)
Out[19]: ArgSpec(args=['M', 'size', 'img'], varargs=None, keywords=None, defaults=None)

To get around this, you can use the excluded option to np.vectorize which says which arguments (positonal or keyword) to ignore when wrapping the vectorization behavior:

vectorizedWarpAffine = np.vectorize(partialWarpAffine, 
                                    excluded=set((0, 1)))

When I make this change, the code appears to actually execute the vectorized function now, but it hits an actual error in the imagewarp.cpp code, presumably due to some bad data assumption on this test data:

In [21]: vectorizedWarpAffine(data[:, 0])
OpenCV Error: Assertion failed (cn <= 4 && ssize.area() > 0) in remapBilinear, file -------src-dir-------/opencv-2.4.6.1/modules/imgproc/src/imgwarp.cpp, line 2296
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-21-3fb586393b75> in <module>()
----> 1 vectorizedWarpAffine(data[:, 0])

/home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in __call__(self, *args, **kwargs)
   1570             vargs.extend([kwargs[_n] for _n in names])
   1571 
-> 1572         return self._vectorize_call(func=func, args=vargs)
   1573 
   1574     def _get_ufunc_and_otypes(self, func, args):

/home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in _vectorize_call(self, func, args)
   1628         """Vectorized call to `func` over positional `args`."""
   1629         if not args:
-> 1630             _res = func()
   1631         else:
   1632             ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)

/home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in func(*vargs)
   1565                     the_args[_i] = vargs[_n]
   1566                 kwargs.update(zip(names, vargs[len(inds):]))
-> 1567                 return self.pyfunc(*the_args, **kwargs)
   1568 
   1569             vargs = [args[_i] for _i in inds]

/home/ely/programming/np_vect.py in <lambda>(M, size, img)
     10 size = (96, 96) # output image size
     11 
---> 12 warpAffine = lambda M, size, img : cv2.warpAffine(img, M, size) # re-order function parameters
     13 partialWarpAffine = partial(warpAffine, M, size)
     14 

error: -------src-dir-------/opencv-2.4.6.1/modules/imgproc/src/imgwarp.cpp:2296: error: (-215) cn <= 4 && ssize.area() > 0 in function remapBilinear

As a side note: I am seeing a shape of (4, 96, 96) for your data, not (4, 10, 10).

Also note that using np.vectorize is not a technique for improving the performance of a function. All it does is gently wrap your function call inside a superficial for-loop (albeit at the NumPy level). It is a technique for writing functions that automatically adhere to NumPy broadcasting rules and for making your API superficially similar to NumPy's API, whereby function calls are expected to work correctly on top of ndarray arguments.

See this post for more details.

Added: The main reason you are using partial in this case is to get a new function that's ostensibly "single-argumented" but that doesn't work out as planned based on the way partial works. So why not just get rid of partial all together?

You can leave your lambda function exactly as it is, even with the two non-array positional arguments, but still ensure that the third argument is treated as something to vectorize over. To do this, you just use excluded as above, but you also need to tell vectorize what to expect as the output.

The reason for this is that vectorize will try to determine what the output shape is supposed to be by running your function on the first element of the data you supply. In this case (and I am not fully sure and it would be worth more debugging) this seems to create the "src is not numpy array" error you were seeing.

So to prevent vectorize from even trying it, you can provide a list of the output types yourself, like this:

vectorizedWarpAffine = np.vectorize(warpAffine, 
                                    excluded=(0, 1), 
                                    otypes=[np.ndarray])

and it works:

In [29]: vectorizedWarpAffine(M, size, data[:, 0])
Out[29]: 
array([[[ array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       ...

I think this is a lot nicer because now when you call vectorizedWarpAffine you still explicitly utilize the other positional arguments, instead of the layer of misdirection where they are pre-bound with partial, and yet the third argument is still treated vectorially.

来源：https://stackoverflow.com/questions/28521429/vectorizing-operation-on-numpy-array

标签

python

OpenCV

numpy

vectorization