Question
I'm trying to write a subclass of masked_array. What I've got so far is this:
class gridded_array(ma.core.masked_array):
    def __init__(self, data, dimensions, mask=False, dtype=None,
                 copy=False, subok=True, ndmin=0, fill_value=None,
                 keep_mask=True, hard_mask=None, shrink=True):
        ma.core.masked_array.__init__(data, mask, dtype, copy, subok,
                                      ndmin, fill_value, keep_mask, hard_mask,
                                      shrink)
        self.dimensions = dimensions
However, when I now create a gridded_array, I don't get what I expect:
dims = OrderedDict()
dims['x'] = np.arange(4)
gridded_array(np.random.randn(4), dims)
masked_array(data = [-- -- -- --],
mask = [ True True True True],
fill_value = 1e+20)
I would expect an unmasked array. I suspect that the dimensions argument I'm passing gets passed on to the masked_array.__init__ call, but since I'm quite new to OOP, I don't know how to resolve this.
Any help is greatly appreciated.
PS: I'm on Python 2.7
Answer 1:
A word of warning: if you're brand new to OOP, subclassing ndarrays and MaskedArrays is not the easiest way to get started, by far...
Before anything else, you should go and check this tutorial. That should introduce you to the mechanisms involved in subclassing ndarrays.
MaskedArrays, like ndarrays, use the __new__ method to create class instances, not __init__. By the time you get to the __init__ of your subclass, you already have a fully instantiated object, with the actual initialization delegated to the __array_finalize__ method. In simpler terms: your __init__ doesn't work as you would expect with standard Python objects. (Actually, I wonder whether it's called at all... After __array_finalize__, if I recall correctly...)
Now that you've been warned, you may want to consider whether you really need to go through the hassle of subclassing an ndarray:
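To see the order in which these hooks fire, here is a minimal sketch. The Traced class and the calls list are hypothetical names made up for illustration, and the example uses a plain ndarray subclass rather than a MaskedArray to keep things short:

```python
import numpy as np

calls = []  # records the order in which the hooks fire


class Traced(np.ndarray):
    """Hypothetical ndarray subclass that logs its construction hooks."""

    def __new__(cls, data):
        calls.append("__new__")
        # View casting creates the instance and triggers __array_finalize__.
        return np.asarray(data).view(cls)

    def __array_finalize__(self, obj):
        calls.append("__array_finalize__")

    def __init__(self, data):
        calls.append("__init__")


arr = Traced([1, 2, 3])
print(calls)  # hooks fired by explicit construction

sub = arr[:2]
print(calls[3:])  # hooks fired by slicing
```

In this sketch the explicit construction goes through __new__, then __array_finalize__, then __init__, while the slice only goes through __array_finalize__. That is why any setup that must also apply to slices and views cannot live in __init__ alone.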
- What are your objectives with your gridded_array?
- Should you support all methods of ndarrays, or only some? All dtypes?
- What should happen when you take a single element or a slice of your object?
- Will you be using gridded_arrays extensively as inputs of NumPy functions?
If you are in doubt, it might be easier to design gridded_array as a generic class that holds an ndarray (or a MaskedArray) as an attribute (say, gridded_array._array), and to add only the methods you need to operate on self._array.
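A rough sketch of that composition approach could look like the following. The class name GriddedArray and the methods shown are assumptions chosen for illustration; only the _array attribute name comes from the suggestion above:

```python
import numpy as np


class GriddedArray(object):
    """Hypothetical wrapper: holds a masked array plus its dimension labels."""

    def __init__(self, data, dimensions, mask=False):
        # A normal __init__ works here because this is a plain Python class.
        self._array = np.ma.masked_array(data, mask=mask)
        self.dimensions = dimensions

    def __getitem__(self, index):
        # Delegate indexing to the wrapped array; the result is a masked
        # array (or a scalar), so the labels are not carried over here.
        return self._array[index]

    def mean(self):
        # Expose only the operations that are actually needed.
        return self._array.mean()


grid = GriddedArray([1.0, 2.0, 3.0, 4.0], {"x": np.arange(4)},
                    mask=[False, False, False, True])
print(grid.mean())  # mean of the three unmasked values
```

The trade-off is that NumPy functions will not accept a GriddedArray directly; you would pass grid._array (or add a method that returns it) whenever you need one.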
Suggestions
- If you only need to "tag" each item of your gridded_array, you may be interested in pandas.
- If you only have to deal with floats, MaskedArray might be a bit overkill: just use nans to represent invalid data; many NumPy functions have nan-aware equivalents. At worst, you can always mask your gridded_array when needed: taking a view of a subclass of ndarray with .view(np.ma.MaskedArray) should return a masked version of your input...
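As a small illustration of the NaN route, assuming float data (the variable names are made up for this sketch):

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0])

# NaN-aware reductions skip the invalid entry.
print(np.nanmean(values))

# Mask the NaNs only when masked-array semantics are really needed.
masked = np.ma.masked_invalid(values)
print(masked.mean())

# A plain view gives a MaskedArray over the same data, but it does not
# mask anything by itself; the NaN is still unmasked here.
viewed = values.view(np.ma.MaskedArray)
print(np.ma.getmaskarray(viewed))
```

Note that the view alone leaves the NaN unmasked, so np.ma.masked_invalid (or an explicit mask) is still needed if the invalid entries should be hidden.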
Answer 2:
The issue is that masked_array uses __new__ instead of __init__, so your dimensions argument is being misinterpreted.
To override __new__, use:
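import numpy.ma as ma

class gridded_array(ma.core.masked_array):
    def __new__(cls, data, dimensions, *args, **kwargs):
        # dimensions is consumed here, so it never reaches masked_array as the mask argument.
        self = super(gridded_array, cls).__new__(cls, data, *args, **kwargs)
        self.dimensions = dimensions
        return self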
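For completeness, here is a sketch that repeats the call from the question against a class of this shape and also forwards a keyword argument. It only checks construction; whether dimensions is carried over to slices or views is a separate matter that this sketch does not address:

```python
from collections import OrderedDict

import numpy as np
import numpy.ma as ma


class gridded_array(ma.core.masked_array):
    def __new__(cls, data, dimensions, *args, **kwargs):
        self = super(gridded_array, cls).__new__(cls, data, *args, **kwargs)
        self.dimensions = dimensions
        return self


dims = OrderedDict()
dims['x'] = np.arange(4)

# Same call as in the question: nothing should be masked now.
grid = gridded_array(np.random.randn(4), dims)
print(ma.getmaskarray(grid))

# Remaining masked_array arguments are forwarded through **kwargs.
partly_masked = gridded_array(np.arange(4.0), dims,
                              mask=[True, False, False, False])
print(partly_masked.mask)
```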
Source: https://stackoverflow.com/questions/12597827/how-to-subclass-numpy-ma-core-masked-array