ndim in numpy array loaded with scipy.io.loadmat?

Submitted by 旧巷老猫 on 2019-12-22 17:56:14

Question


Using SciPy and MATLAB, I'm having trouble reconstructing an array to match what is given from a MATLAB cell array loaded using scipy.io.loadmat().

For example, say I create a cell containing a pair of double arrays in MATLAB and then load it using scipy.io (I'm using SPM to do imaging analyses in conjunction with pynifti and the like):

MATLAB

>> onsets{1} = [0 30 60 90]
>> onsets{2} = [15 45 75 105]

Python

>>> import scipy.io as scio
>>> mat = scio.loadmat('onsets.mat')
>>> mat['onsets'][0]
array([[[ 0 30 60 90]], [[ 15  45  75 105]]], dtype=object)

>>> mat['onsets'][0].shape

(2,)

My question is this: Why does this numpy array have the shape (2,) instead of (2,1,4)? In real life I'm trying to use Python to parse a logfile and build these onsets cell arrays, so I'd like to be able to build them from scratch.

When I try to build the same array from the printed output, I get a different shape back:

>>> from numpy import array
>>> new_onsets = array([[[ 0, 30, 60, 90]], [[ 15,  45,  75, 105]]], dtype=object)
>>> new_onsets
array([[[0, 30, 60, 90]],

       [[15, 45, 75, 105]]], dtype=object)

>>> new_onsets.shape
(2, 1, 4)

Unfortunately, the shape (vectors of doubles in a cell array) is coded in a spec upstream, so I need to be able to get this saved exactly in this format. Of course, it's not a big deal since I could just write the parser in MATLAB, but it would be nice to figure out what's going on and add a little to my [minuscule] knowledge of numpy.


Answer 1:


This is one of those things I personally find kind of annoying in Python. It comes down to how loadmat "squeezes" singleton dimensions.

With squeeze_me=True (note that current SciPy releases default to squeeze_me=False, so it has to be passed explicitly), you get this:

>>> import scipy.io as sio
>>> x = sio.loadmat('mymat.mat', squeeze_me=True)
>>> y = x['onsets']
>>> y.shape
(2,)

If you use loadmat with squeeze_me set to False then you don't get one dimension squeezed out:

>>> a = sio.loadmat('mymat.mat', squeeze_me=False)
>>> b = a['onsets']
>>> b.shape
(1, 2)
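
The effect of squeeze_me can be reproduced without MATLAB. The sketch below is a minimal round trip under a few assumptions: the cell is built as a NumPy object array, the file name onsets_demo.mat and the temporary directory are made up for illustration, and savemat is left at its defaults, which in current SciPy releases write a 1-D array as a row.

```python
import os
import tempfile

import numpy as np
import scipy.io as sio

# Build a 2-element object array; each element is a 1x4 row of doubles,
# mimicking the MATLAB cell from the question.
onsets = np.empty((2,), dtype=object)
onsets[0] = np.array([[0.0, 30.0, 60.0, 90.0]])
onsets[1] = np.array([[15.0, 45.0, 75.0, 105.0]])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "onsets_demo.mat")  # hypothetical file name
    sio.savemat(path, {"onsets": onsets})
    unsqueezed = sio.loadmat(path, squeeze_me=False)["onsets"]
    squeezed = sio.loadmat(path, squeeze_me=True)["onsets"]

print(unsqueezed.shape)        # outer cell shape, singleton dimension kept
print(squeezed.shape)          # outer cell shape, singleton dimension dropped
print(unsqueezed[0, 0].shape)  # shape of the first cell element
```

Neither call adds a third dimension to the outer array, because the outer array only holds references to the inner arrays and knows nothing about their shapes.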

That said, I can't for the life of me figure out how to get another dimension to show up (that is, b.shape = (1,2,4)) for a cell array like 'onsets'. I've only been able to get it for non-cell, plain-old vanilla MATLAB arrays such as:

onset_array = [onsets{1}; onsets{2}];



Answer 2:


Travis from the scipy mailing list responded that the right way to build this is to create the structure first, then populate the arrays:

http://article.gmane.org/gmane.comp.python.scientific.user/31760

> You could build what you saw before with: 
> 
> new_onsets = empty((2,), dtype=object) 
> new_onsets[0] = array([[0, 30, 60, 90]]) 
> new_onsets[1] = array([[15, 45, 75, 105]])
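
As a runnable sketch of the difference (the quoted snippet assumes empty and array have already been imported from numpy; the variable name nested is only for illustration), compare the two ways of building the array:

```python
import numpy as np

# Passing nested lists lets NumPy treat every nesting level as a dimension.
nested = np.array([[[0, 30, 60, 90]], [[15, 45, 75, 105]]], dtype=object)

# Allocating the object array first fixes the outer shape; each slot then
# holds a whole 1x4 array as a single element.
new_onsets = np.empty((2,), dtype=object)
new_onsets[0] = np.array([[0, 30, 60, 90]])
new_onsets[1] = np.array([[15, 45, 75, 105]])

print(nested.shape)
print(new_onsets.shape)
print(new_onsets[0].shape)
```

The second form matches what loadmat returned in the question: an outer shape of (2,), with each element carrying its own (1, 4) shape.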



Answer 3:


I think the problem here is that cell arrays aren't really arrays, which is why scio.loadmat loads onsets.mat to an object array.

Here, your cell array could be reduced to a normal array of shape (2,1,4), but what if, instead, your data looked like:

>> onsets{1} = {0 30 60 'bob'}
>> onsets{2} = {15 45 75 'fred'}

I'm not sure what the best solution is, but if you know your data is an array, you should probably convert to a normal array before saving in Matlab, or after loading with Scipy.
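
If the cell is known to hold equally sized numeric rows, one possible way to convert after loading is to stack the elements. This is only a sketch: the variable cell below is a hand-built stand-in for the object array loadmat returns, and both calls fail if the elements differ in shape.

```python
import numpy as np

# Stand-in for what loadmat returns for the cell in the question.
cell = np.empty((2,), dtype=object)
cell[0] = np.array([[0, 30, 60, 90]])
cell[1] = np.array([[15, 45, 75, 105]])

stacked = np.stack(list(cell))  # keeps each 1x4 element as its own layer
rows = np.vstack(list(cell))    # joins the rows into a plain 2-D table

print(stacked.shape)
print(rows.shape)
```

np.stack gives the (2, 1, 4) shape the question expected, while np.vstack gives the (2, 4) layout of the plain MATLAB matrix.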

Edit: The example cell array above could, in theory, be cast into a NumPy structured array, but that's not generally true of cell arrays, because the columns don't have to share a data type. The logical way to represent collections of arbitrary data types is a Python list or, as here, a NumPy object array whose elements are themselves arrays, which is what loadmat returns.

Edit 2: Fix cell array syntax, as suggested by Erik Kastman.



Source: https://stackoverflow.com/questions/10542263/ndim-in-numpy-array-loaded-with-scipy-io-loadmat
