Very large matrices using Python and NumPy

后端 未结 11 2077
难免孤独
难免孤独 2020-11-22 13:51

NumPy is an extremely useful library, and from using it I\'ve found that it\'s capable of handling matrices which are quite large (10000 x 10000) easily, but begins to strug

11条回答
  •  陌清茗
    陌清茗 (楼主)
    2020-11-22 14:35

    numpy.arrays are meant to live in memory. If you want to work with matrices larger than your RAM, you have to work around that. There are at least two approaches you can follow:

    1. Try a more efficient matrix representation that exploits any special structure that your matrices have. For example, as others have already pointed out, there are efficient data structures for sparse matrices (matrices with lots of zeros), like scipy.sparse.csc_matrix.
    2. Modify your algorithm to work on submatrices. You can read from disk only the matrix blocks that are currently being used in computations. Algorithms designed to run on clusters usually work blockwise, since the data is scatted across different computers, and passed by only when needed. For example, the Fox algorithm for matrix multiplication (PDF file).

提交回复
热议问题