Question
Working on a problem in matrix structural analysis. I am writing a program in Python (using Anaconda 3) to analyze a truss. Each individual truss member generates one 4x4 matrix, for a total of n 4x4 matrices. Then, these 4x4 matrices are compiled into an NxN matrix, arranged like so, for matrices A, B, C:
As you can see, each successive submatrix is placed one row over and one row down from the preceding one. Further, because the size of the truss and the number of truss joints (nodes) are specified by the user, the size of the NxN matrix has to be determined dynamically (the submatrices are always 4x4).
I've got an NxN zero matrix; I am trying to figure out how to compile the submatrices correctly.
I found a few similar questions but none of them scaled the larger matrix dynamically.
I appreciate any assistance you folks can provide.
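For concreteness, the layout described above can be sketched with a plain dense NumPy loop (the block values are toy stand-ins, not real stiffness matrices; with a one-row/one-column offset per block, n blocks give N = n + 3):

```python
import numpy as np

# Toy stand-ins for the n member matrices (not real truss stiffness data)
n = 3
submatrices = [np.full((4, 4), k + 1) for k in range(n)]

# Each block is offset one row and one column from the previous one,
# so the global matrix is (n + 3) x (n + 3)
N = n + 3
big = np.zeros((N, N), dtype=int)
for k, sub in enumerate(submatrices):
    big[k:k + 4, k:k + 4] += sub   # overlapping entries accumulate

print(big)
```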
Answer 1:
Is n potentially large, so the result is a large sparse matrix with nonzero values concentrated along the diagonal? Sparse matrices are designed with exactly this kind of matrix in mind (from FD and FE PDE problems). I did this a lot in MATLAB, and some with the scipy sparse module.
That module has a block definition mode that might work, but what I'm more familiar with is the coo-to-csr route.
In the coo format, nonzero elements are defined by 3 vectors: i, j, and data. You can collect the values for A, B, etc. in these arrays (applying the appropriate offset to the indices for B, etc.) without worrying about overlaps. Then, when that format is converted to csr (for matrix calculations), the overlapping values are summed - which is exactly what you want.
I think the sparse documentation has some simple examples of this. Conceptually, the simplest approach is to iterate over the n submatrices and collect their values in those 3 arrays. But I also worked out a more complex scheme whereby it can be done as one big array operation, or by iterating over a smaller dimension - each submatrix has only 16 values, and in a realistic case 16 will be much smaller than n.
I'd have to play around with code to give a more concrete example.
==========================
Here's a simple example with 3 blocks - functional, but not the most efficient
Define 3 blocks:
In [620]: A=np.ones((4,4),int)
In [621]: B=np.ones((4,4),int)*2
In [622]: C=np.ones((4,4),int)*3
Lists to collect values in; these could be arrays, but it's easy, and relatively efficient, to append or extend lists:
In [623]: i, j, dat = [], [], []
In [629]: def foo(A, n):
              # turn A into a sparse matrix, and add its points to the
              # i, j, dat lists with an offset of n
              ac = sparse.coo_matrix(A)
              i.extend(ac.row + n)
              j.extend(ac.col + n)
              dat.extend(ac.data)
In [630]: foo(A,0)
In [631]: i
Out[631]: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
In [632]: j
Out[632]: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
In [633]: foo(B,1)
In [634]: foo(C,2) # do this in a loop in the real world
In [636]: M = sparse.csr_matrix((dat,(i,j)))
In [637]: M
Out[637]:
<6x6 sparse matrix of type '<class 'numpy.int32'>'
with 30 stored elements in Compressed Sparse Row format>
In [638]: M.A
Out[638]:
array([[1, 1, 1, 1, 0, 0],
[1, 3, 3, 3, 2, 0],
[1, 3, 6, 6, 5, 3],
[1, 3, 6, 6, 5, 3],
[0, 2, 5, 5, 5, 3],
[0, 0, 3, 3, 3, 3]], dtype=int32)
If I've done this right, overlapping values of A,B,C are summed.
More generally:
In [21]: def foo1(mats):
             i, j, dat = [], [], []
             for n, mat in enumerate(mats):
                 A = sparse.coo_matrix(mat)
                 i.extend(A.row + n)
                 j.extend(A.col + n)
                 dat.extend(A.data)
             M = sparse.csr_matrix((dat, (i, j)))
             return M
In [22]: foo1((A,B,C,B,A)).A
Out[22]:
array([[1, 1, 1, 1, 0, 0, 0, 0],
[1, 3, 3, 3, 2, 0, 0, 0],
[1, 3, 6, 6, 5, 3, 0, 0],
[1, 3, 6, 8, 7, 5, 2, 0],
[0, 2, 5, 7, 8, 6, 3, 1],
[0, 0, 3, 5, 6, 6, 3, 1],
[0, 0, 0, 2, 3, 3, 3, 1],
[0, 0, 0, 0, 1, 1, 1, 1]], dtype=int32)
Coming up with a way of doing this more efficiently may depend on how the individual submatrices are generated. If they are created iteratively, you might as well collect the i,j,data values iteratively as well.
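If the member matrices do arrive one at a time, the collection can be streamed from a generator so that only the triplets are kept in memory. A sketch under that assumption (member_matrices and its values are hypothetical stand-ins for the real per-member stiffness computation):

```python
import numpy as np
from scipy import sparse

def member_matrices(n):
    # Hypothetical stand-in for computing each member's 4x4 matrix
    for k in range(n):
        yield np.full((4, 4), k + 1.0)

n = 5
i, j, data = [], [], []
r, c = np.mgrid[0:4, 0:4]          # local 4x4 row/col coordinates
for k, km in enumerate(member_matrices(n)):
    i.extend((r + k).ravel())       # shift by this member's offset
    j.extend((c + k).ravel())
    data.extend(km.ravel())

# Duplicates at overlapping positions are summed on conversion
K = sparse.csr_matrix((data, (i, j)), shape=(n + 3, n + 3))
```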
==========================
Since the submatrices are dense, we can get the appropriate i, j, data values directly, without going through a coo intermediary - and without looping, if A, B, C are collected into one larger array.
If I modify foo1 to return a coo matrix, I see the i, j, data lists (as arrays) as given, without summation of duplicates. In the example with 5 matrices, I get 80-element arrays, which can be reshaped as:
In [110]: f.col.reshape(-1,16)
Out[110]:
array([[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3],
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
[2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5],
[3, 4, 5, 6, 3, 4, 5, 6, 3, 4, 5, 6, 3, 4, 5, 6],
[4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7]], dtype=int32)
In [111]: f.row.reshape(-1,16)
Out[111]:
array([[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
[1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4],
[2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5],
[3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6],
[4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7]], dtype=int32)
In [112]: f.data.reshape(-1,16)
Out[112]:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
I should be able to generate those without a loop, especially the row and col arrays.
In [143]: mats=[A,B,C,B,A]
The coordinates of the elements of one 4x4 array:
In [144]: I,J=[i.ravel() for i in np.mgrid[range(A.shape[0]),range(A.shape[1])]]
Replicate them, with offsets, via broadcasting:
In [145]: x=np.arange(len(mats))[:,None]
In [146]: I=I+x
In [147]: J=J+x
Collect the data into one large array:
In [148]: D=np.concatenate(mats,axis=0)
In [149]: f=sparse.csr_matrix((D.ravel(),(I.ravel(),J.ravel())))
Or, as a compact function:
def foo3(mats):
    A = mats[0]
    n, m = A.shape
    I, J = np.mgrid[range(n), range(m)]
    x = np.arange(len(mats))[:, None]
    I = I.ravel() + x
    J = J.ravel() + x
    D = np.concatenate(mats, axis=0)
    f = sparse.csr_matrix((D.ravel(), (I.ravel(), J.ravel())))
    return f
In this modest example the 2nd version is about 2x faster; the first scales linearly with the length of the list, while the 2nd is almost independent of it.
In [158]: timeit foo1(mats)
1000 loops, best of 3: 1.3 ms per loop
In [160]: timeit foo3(mats)
1000 loops, best of 3: 653 µs per loop
Answer 2:
The simple for-loop way would be to add each 4x4 matrix to the appropriate slice of the big zero matrix:
for i, small_m in enumerate(small_matrices):
    big_m[i:i+4, i:i+4] += small_m
You could also do this with no Python loops by creating a strided view of the zero matrix and using np.add.at for unbuffered addition. This should be particularly efficient if your 4x4 matrices are packed into a k-by-4-by-4 array:
import numpy as np
from numpy.lib.stride_tricks import as_strided
# Create a view of big_m with the shape of small_matrices.
# strided_view[i] is a view of big_m[i:i+4, i:i+4]
strides = (sum(big_m.strides),) + big_m.strides
strided_view = as_strided(big_m, shape=small_matrices.shape, strides=strides)
np.add.at(strided_view, np.arange(small_matrices.shape[0]), small_matrices)
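A self-contained version of that snippet, with toy stand-ins for big_m and small_matrices (not the asker's truss data) and a check against the explicit loop:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Toy k-by-4-by-4 stack of member matrices and the (k + 3) x (k + 3) target
k = 3
small_matrices = np.arange(1, k + 1)[:, None, None] * np.ones((4, 4), int)
big_m = np.zeros((k + 3, k + 3), dtype=small_matrices.dtype)

# strided_view[i] aliases big_m[i:i+4, i:i+4]
strides = (sum(big_m.strides),) + big_m.strides
strided_view = as_strided(big_m, shape=small_matrices.shape, strides=strides)

# Unbuffered addition, so the overlapping (aliased) views accumulate correctly
np.add.at(strided_view, np.arange(k), small_matrices)

# Same result as the explicit slice-and-add loop
ref = np.zeros_like(big_m)
for i, sm in enumerate(small_matrices):
    ref[i:i + 4, i:i + 4] += sm
assert np.array_equal(big_m, ref)
```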
Source: https://stackoverflow.com/questions/37039923/compiling-n-submatrices-into-an-nxn-matrix-in-numpy