Improve performance of converting numpy array to MATLAB double

后端未结

关注

 3  960

Calling MATLAB from Python is bound to give some performance reduction that I could avoid by rewriting (a lot of) code in Python. However, this isn\'t a realistic option for

相关标签:

3条回答

野性不改

2020-12-06 12:04

Passing numpy arrays efficiently

Take a look at the file mlarray_sequence.py in the folder PYTHONPATH\Lib\site-packages\matlab\_internal. There you will find the construction of the MATLAB array object. The performance problem comes from copying data with loops within the generic_flattening function.

To avoid this behavior we will edit the file a bit. This fix should work on complex and non-complex datatypes.

Make a backup of the original file in case something goes wrong.
Add import numpy as np to the other imports at the beginning of the file

In line 38 you should find:

init_dims = _get_size(initializer)  # replace this with 
     try:
         init_dims=initializer.shape
     except:
         init_dims = _get_size(initializer)

In line 48 you should find:

if is_complex:
    complex_array = flat(self, initializer,
                         init_dims, typecode)
    self._real = complex_array['real']
    self._imag = complex_array['imag']
else:
    self._data = flat(self, initializer, init_dims, typecode)

#Replace this with:

if is_complex:
    try:
        self._real = array.array(typecode,np.ravel(initializer, order='F').real)
        self._imag = array.array(typecode,np.ravel(initializer, order='F').imag)
    except:
        complex_array = flat(self, initializer,init_dims, typecode)
        self._real = complex_array['real']
        self._imag = complex_array['imag']
else:
    try:
        self._data = array.array(typecode,np.ravel(initializer, order='F'))
    except:
        self._data = flat(self, initializer, init_dims, typecode)

Now you can pass a numpy array directly to the MATLAB array creation method.

data1 = np.random.uniform(low = 0.0, high = 30000.0, size = (1000000,))
#faster
data1m = matlab.double(data1)
#or slower method
data1m = matlab.double(data1.tolist())

data2 = np.random.uniform(low = 0.0, high = 30000.0, size = (1000000,)).astype(np.complex128)
#faster
data1m = matlab.double(data2,is_complex=True)
#or slower method
data1m = matlab.double(data2.tolist(),is_complex=True)

The performance in MATLAB array creation increases by a factor of 15 and the interface is easier to use now.

0 讨论(0)

予麋鹿

2020-12-06 12:14
My situation was a bit different (python script called from matlab) but for me converting the ndarray into an array.array massively speed up the process. Basically it is very similar to Alexandre Chabot solution but without the need to alter any files:
```
#untested i.e. only deducted from my "matlab calls python" situation
import numpy
import array

data1 = numpy.random.uniform(low = 0.0, high = 30000.0, size = (1000000,))
ar = array.array('d',data1.flatten('F').tolist())
p = matlab.double(ar)
C = matlab.reshape(p,data1.shape) #this part I am definitely not sure about if it will work like that
```
At least if done from Matlab the combination of "array.array" and "double" is relative fast. Tested with Matlab 2016b + python 3.5.4 64bit.
0 讨论(0)
发布评论:

提交评论
- 加载中...
盖世英雄少女心

2020-12-06 12:20
While awaiting better suggestions, I'll post the best trick I've come up with so far. It comes down to saving the file with `scipy.io.savemat´ and then loading this file in MATLAB.

This is not the prettiest hack and it requires some care to ensure different processes relying on the same script don't end up writing and loading each other's .mat files, but the performance gain is worth it for me.

As a test case I wrote two simple, almost identical MATLAB functions that require 2 numpy arrays (I tested with length 1000000) and one int as input.
```
function d = test(x, y, fs_signal)
d = sum((x + y))./double(fs_signal);

function d = test2(path)
load(path)
d = sum((x + y))./double(fs_signal);
```
The function test requires conversion, while test2 requires saving.

Testing test: Converting the two numpy arrays takes cirka 40 s on my system. The total time to prepare for and run test comes down to 170 s

Testing test2: Saving the arrays and int takes cirka 0.35 s on my system. Suprisingly, loading the .mat file in MATLAB is extremely efficient (or more suprisingly, it is extremely ineffcient at dealing with its doubles)... The total time to prepare for and run test2 comes down to 0.38 s

That's a performance gain of almost 450x...
0 讨论(0)
发布评论:

提交评论
- 加载中...