I have 8823 data points with x,y coordinates. I'm trying to follow the answer on how to get a scatter dataset to be represented as a heatmap, but when I go
When you call np.meshgrid for a scatter figure, you need to normalize your data if it is too large to process. Try this module:
# Feature Scaling
from sklearn.preprocessing import StandardScaler
st = StandardScaler()
X = st.fit_transform(X)
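To make this concrete, here is a minimal sketch of applying the scaler, assuming your coordinates live in two hypothetical 1-D arrays `x` and `y` (the random data below just stands in for your 8823 points). `fit_transform` expects an `(n_samples, n_features)` array, so the two coordinates are stacked into columns first:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for your 8823 scatter points
rng = np.random.default_rng(0)
x = rng.normal(50, 10, 8823)
y = rng.normal(200, 40, 8823)

# Stack into an (n_samples, n_features) array, as fit_transform expects
X = np.column_stack([x, y])
X_scaled = StandardScaler().fit_transform(X)

# Each column now has mean ~0 and standard deviation ~1
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```

After scaling, both coordinates are on comparable, small ranges, which makes downstream gridding steps better behaved.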
Your call to meshgrid
requires a lot of memory -- it produces two 8823*8823 floating-point arrays. Each of them is about 0.6 GB.
But your screen can't show (and your eye can't really process) that much information anyway, so you should probably bin or smooth your data down to something more reasonable, like 1024*1024, before this step.
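One common way to do that binning is np.histogram2d, which counts the scatter points on a fixed grid instead of building a full per-point meshgrid. A sketch, again using hypothetical random data in place of your `x` and `y`:

```python
import numpy as np

# Hypothetical stand-in for your 8823 scatter points
rng = np.random.default_rng(1)
x = rng.normal(size=8823)
y = rng.normal(size=8823)

# Bin the points onto a 1024x1024 grid; each cell counts the points inside it
heat, xedges, yedges = np.histogram2d(x, y, bins=1024)

print(heat.shape)  # a (1024, 1024) count grid
print(heat.sum())  # every point lands in exactly one bin
```

The resulting `heat` array is only ~8 MB and can be passed straight to an image plot (e.g. matplotlib's imshow, transposed so x runs horizontally), instead of materializing two 0.6 GB meshgrid arrays.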
In numpy 1.7.0 and newer, meshgrid
has the sparse
keyword argument. A sparse meshgrid is set up so that it broadcasts to a full meshgrid when used. This can save large amounts of memory, e.g. when using the meshgrid to index arrays.
In [2]: np.meshgrid(np.arange(10), np.arange(10), sparse=True)
Out[2]:
[array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]), array([[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])]
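To illustrate the broadcasting point: the sparse grids above are a (1, 10) row and a (10, 1) column, yet any elementwise operation on them produces the same full 10x10 result as the dense meshgrid, at a fraction of the memory:

```python
import numpy as np

# Sparse grids: one row vector and one column vector
xs, ys = np.meshgrid(np.arange(10), np.arange(10), sparse=True)
# Full grids: two dense 10x10 arrays
Xf, Yf = np.meshgrid(np.arange(10), np.arange(10))

print(xs.shape, ys.shape)  # (1, 10) (10, 1)

# Broadcasting makes the sparse pair behave exactly like the full pair
assert np.array_equal(xs + ys, Xf + Yf)
```

For an n*n grid the sparse version stores 2n values instead of 2n², so with 8823 points it is two ~70 KB vectors instead of two ~0.6 GB arrays.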
Another option is to use smaller integers that can still represent the range:
np.meshgrid(np.arange(10).astype(np.int8), np.arange(10).astype(np.int8),
sparse=True, copy=False)
though as of numpy 1.9, using these smaller integers for indexing will be slower, since they are internally converted back to larger integers in small (np.setbufsize-sized) chunks.
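A short sketch of the trade-off: the int8 sparse grids use an eighth of the memory of the default integer dtype, and they still work for fancy indexing, just with the conversion overhead noted above:

```python
import numpy as np

# Sparse meshgrid with a small integer dtype
x8, y8 = np.meshgrid(np.arange(10, dtype=np.int8),
                     np.arange(10, dtype=np.int8),
                     sparse=True, copy=False)
print(x8.dtype, x8.nbytes)  # int8 grid: 10 bytes per axis vector

# Indexing a 10x10 array with the small-int grids reconstructs it in full
a = np.arange(100).reshape(10, 10)
assert np.array_equal(a[y8, x8], a)
```

The memory saving matters mostly for large grids; for a 10x10 example it is negligible, but the same pattern applies at 8823*8823.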