Efficient slicing of matrices using matrix multiplication, with Python, NumPy, SciPy

Submitted by 家住魔仙堡 on 2019-11-30 17:23:42

Question


I want to reshape a 2D scipy.sparse.csr.csr_matrix (call it A) into a 2D numpy.ndarray (call it B).

A could be

>shape(A)
(90, 10)

then B should be

>shape(B)
(9,10)

where each block of 10 rows of A is collapsed into a single new row, namely the column-wise maximum over that window. The column operator does not work on the sparse matrix, since it is an unhashable type. How can I get B using matrix multiplications?
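
To make the target concrete, a dense reference computation of B might look like the sketch below. The random contents of A and the names rng and dense are assumptions for illustration. This only pins down the expected result; it is not the matrix multiplication approach asked about.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-in for A: random small integers, shape (90, 10)
rng = np.random.default_rng(0)
dense = rng.integers(0, 5, size=(90, 10))
A = csr_matrix(dense)

# Densify, split the 90 rows into 9 windows of 10 rows each,
# then take the maximum within each window for every column.
B = A.toarray().reshape(9, 10, 10).max(axis=1)

print(B.shape)  # (9, 10)
```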


Answer 1:


Using matrix multiplication, you can slice efficiently by creating a "slicer" matrix with ones in the right places. The sliced matrix will have the same type as the "slicer", so you can efficiently control the output type.
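
As a minimal sketch of the idea, the hypothetical helper row_slicer below builds a compact selection matrix with a single one per row. The helper name, its parameters and the 6x6 example matrix are assumptions for illustration, not part of the original answer.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 6x6 example matrix
A = csr_matrix(np.arange(1, 37).reshape(6, 6))

def row_slicer(start, stop, n_rows, sparse=False):
    """Hypothetical helper: selection matrix that picks rows start..stop-1."""
    k = stop - start
    rows = np.arange(k)             # output row i ...
    cols = np.arange(start, stop)   # ... copies input row start + i
    data = np.ones(k, dtype=int)
    S = csr_matrix((data, (rows, cols)), shape=(k, n_rows))
    return S if sparse else S.toarray()

# A dense slicer gives a dense ndarray, a sparse slicer gives a sparse result
dense_out = row_slicer(2, 5, 6) @ A
sparse_out = row_slicer(2, 5, 6, sparse=True) @ A
```

Both products hold the same values as rows 2 to 4 of A; only the container type differs, following the type of the slicer.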

The comparisons below show that the most efficient option for your case is to request the .A matrix and slice it. This turned out to be much faster than the .toarray() method. Multiplication is the second fastest option, provided the "slicer" is created as an ndarray, multiplied by the csr matrix, and the result is then sliced.

Note: using a coo sparse matrix for A gave slightly slower timings with the same proportions, and sol3 is not applicable in that case. I realized later that the coo matrix is converted to csr automatically during the multiplication.
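
The COO format has historically not supported row slicing, which is presumably why sol3 does not apply there. A hedged sketch of the usual workaround is to convert to CSR explicitly before slicing; the 4x3 matrix and the names A_coo and A_csr are assumptions for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical 4x3 example stored in COO format
dense = np.arange(12).reshape(4, 3)
A_coo = coo_matrix(dense)

# Convert to CSR first, then slice rows and densify
A_csr = A_coo.tocsr()
B = A_csr[1:3].toarray()
```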

import numpy as np
from scipy.sparse import csr_matrix

test = csr_matrix([
[11,12,13,14,15,16,17,18,19],
[21,22,23,24,25,26,27,28,29],
[31,32,33,34,35,36,37,38,39],
[41,42,43,44,45,46,47,48,49],
[51,52,53,54,55,56,57,58,59],
[61,62,63,64,65,66,67,68,69],
[71,72,73,74,75,76,77,78,79],
[81,82,83,84,85,86,87,88,89],
[91,92,93,94,95,96,97,98,99]])

def sol1():
    # densify first (.A is shorthand for .toarray()), then slice
    B = test.A[2:5]
    return B

def sol2():
    # dense "slicer" with ones at (2,2), (3,3), (4,4)
    slicer = np.array([[0,0,0,0,0,0,0,0,0],
                       [0,0,0,0,0,0,0,0,0],
                       [0,0,1,0,0,0,0,0,0],
                       [0,0,0,1,0,0,0,0,0],
                       [0,0,0,0,1,0,0,0,0]])
    # with a csr_matrix operand, * is matrix multiplication;
    # the result is a dense ndarray, and [2:] drops the two all-zero rows
    B = (slicer*test)[2:]
    return B

def sol3():
    # slice the sparse matrix first, then densify
    B = (test[2:5]).A
    return B

def sol4():
    # sparse "slicer": densify the product, then slice
    slicer = csr_matrix( ((1,1,1),((2,3,4),(2,3,4))), shape=(5,9) )
    B = ((slicer*test).A)[2:] # just changing when we do the slicing
    return B

def sol5():
    # sparse "slicer": slice the sparse product, then densify
    slicer = csr_matrix( ((1,1,1),((2,3,4),(2,3,4))), shape=(5,9) )
    B = ((slicer*test)[2:]).A
    return B


%timeit sol1()
#10000 loops, best of 3: 60.4 µs per loop

%timeit sol2()
#10000 loops, best of 3: 91.4 µs per loop

%timeit sol3()
#10000 loops, best of 3: 111 µs per loop

%timeit sol4()
#1000 loops, best of 3: 310 µs per loop

%timeit sol5()
#1000 loops, best of 3: 363 µs per loop
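
The timings above come from the IPython timeit magic. Outside IPython, a comparable measurement could be sketched with the standard timeit module as below. The 9x9 stand-in matrix and the function names are assumptions for illustration, and absolute numbers will differ by machine and SciPy version.

```python
import timeit

import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-in for the 9x9 test matrix
test = csr_matrix(np.arange(81).reshape(9, 9))

def dense_then_slice():
    return test.toarray()[2:5]

def slice_then_dense():
    return test[2:5].toarray()

for fn in (dense_then_slice, slice_then_dense):
    # best of 3 runs of 1000 calls each, reported per call in microseconds
    best = min(timeit.repeat(fn, number=1000, repeat=3)) / 1000
    print(f"{fn.__name__}: {best * 1e6:.1f} us per call")
```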

EDIT: the answer has been updated to replace .toarray() with .A, which gave much faster results, and the best solutions are now listed first.



Source: https://stackoverflow.com/questions/14477448/efficient-slicing-of-matrices-using-matrix-multiplication-with-python-numpy-s
