What is the best way to compute the trace of a matrix product in numpy?

甜味超标 2020-12-29 07:40

If I have numpy arrays A and B, then I can compute the trace of their matrix product with:

tr = numpy.trace(A.dot(B))


        
Is there a better way to do this that avoids explicitly forming the intermediate product A.dot(B)?
3 Answers
  • 2020-12-29 08:02

    You can improve on @Bill's solution by reducing intermediate storage to the diagonal elements only:

    import numpy as np
    from numpy.core.umath_tests import inner1d

    m, n = 1000, 500

    a = np.random.rand(m, n)
    b = np.random.rand(n, m)

    # They should all give the same result
    print(np.trace(a.dot(b)))
    print(np.sum(a * b.T))
    print(np.sum(inner1d(a, b.T)))
    
    %timeit np.trace(a.dot(b))
    10 loops, best of 3: 34.7 ms per loop
    
    %timeit np.sum(a*b.T)
    100 loops, best of 3: 4.85 ms per loop
    
    %timeit np.sum(inner1d(a, b.T))
    1000 loops, best of 3: 1.83 ms per loop
    

    Another option is to use np.einsum and have no explicit intermediate storage at all:

    # Will print the same as the others:
    print(np.einsum('ij,ji->', a, b))
    

    On my system it runs slightly slower than using inner1d, but that may not hold on all systems; see this related question:

    %timeit np.einsum('ij,ji->', a, b)
    100 loops, best of 3: 1.91 ms per loop
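
    If you want to reuse this, the einsum form can be wrapped in a small helper. The name trace_of_product below is a hypothetical one of my own, and the np.allclose assertion is just a sanity check against np.trace, not part of the original answer:

    import numpy as np

    def trace_of_product(a, b):
        # trace(a @ b) = sum_ij a[i, j] * b[j, i], computed without
        # materializing the full product matrix.
        return np.einsum('ij,ji->', a, b)

    a = np.random.rand(1000, 500)
    b = np.random.rand(500, 1000)
    assert np.allclose(trace_of_product(a, b), np.trace(a.dot(b)))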
    
  • 2020-12-29 08:08

    Note that a slight variant is to take the dot product of the vectorized matrices; in NumPy, column-major vectorization is done with .flatten('F'). On my computer it is slightly slower than summing the Hadamard product, so it is a worse solution than wflynny's, but I find it interesting because it can be more intuitive in some situations. For example, for the matrix normal distribution the vectorized form is easier to understand. A small worked example and a speed comparison follow.
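
    As an illustration (the 2x2 matrices below are my own, not from the original answer), the vectorized form gives the same number as the trace, since both compute sum_ij A[i, j] * B[j, i]:

    import numpy as np

    # Tiny sanity check of the vec identity tr(A.B) = vec(A) . vec(B^T),
    # where vec stacks columns, i.e. .flatten('F').
    A = np.array([[1., 2.], [3., 4.]])
    B = np.array([[5., 6.], [7., 8.]])

    print(A.flatten('F').dot(B.T.flatten('F')))  # 69.0
    print(np.trace(A.dot(B)))                    # 69.0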

    Speed comparison, on my system:

    import numpy as np
    import time
    
    N = 1000
    
    np.random.seed(123)
    A = np.random.randn(N, N)
    B = np.random.randn(N, N)
    
    start = time.time()
    for i in range(10):
        C = np.trace(A.dot(B))
    print(time.time() - start, C)
    
    start = time.time()
    for i in range(10):
        C = A.flatten('F').dot(B.T.flatten('F'))
    print(time.time() - start, C)
    
    start = time.time()
    for i in range(10):
        C = (A.T * B).sum()
    print(time.time() - start, C)
    
    start = time.time()
    for i in range(10):
        C = (A * B.T).sum()
    print(time.time() - start, C)
    

    Result:

    6.246593236923218 -629.370798672
    0.06539678573608398 -629.370798672
    0.057890892028808594 -629.370798672
    0.05709719657897949 -629.370798672
    
  • 2020-12-29 08:25

    As noted on Wikipedia, you can calculate the trace using the Hadamard product (element-wise multiplication), since tr(A.B) = sum_ij A_ij * B_ji:

    # Tr(A.B)
    tr = (A*B.T).sum()
    

    This takes less computation than numpy.trace(A.dot(B)): for n x n matrices, forming the full product costs O(n^3) operations, while the element-wise product and sum need only O(n^2).
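
    As a quick sanity check (mine, not part of the original answer), the two expressions agree numerically:

    import numpy as np

    A = np.random.rand(200, 200)
    B = np.random.rand(200, 200)

    # tr(A.B) = sum_ij A[i, j] * B[j, i], so only the element-wise products of
    # A and B.T are needed, never the full matrix product.
    assert np.allclose((A * B.T).sum(), np.trace(A.dot(B)))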

    Edit:

    Ran some timers. This way is much faster than using numpy.trace.

    In [36]: from timeit import timeit

    In [37]: timeit("np.trace(A.dot(B))", setup="""import numpy as np;
    A, B = np.random.rand(1000,1000), np.random.rand(1000,1000)""", number=100)
    Out[37]: 8.6434469223022461

    In [38]: timeit("(A*B.T).sum()", setup="""import numpy as np;
    A, B = np.random.rand(1000,1000), np.random.rand(1000,1000)""", number=100)
    Out[38]: 0.5516049861907959
    