Group and average NumPy matrix

一整个雨季 2021-02-20 07:46

Say I have an arbitrary numpy matrix that looks like this:

arr = [[  6.0   12.0   1.0]
       [  7.0   9.0   1.0]
       [  8.0   7.0   1.0]
       [  4.0   3.0          


        
4 Answers
  • 2021-02-20 08:07

    A compact option is numpy_indexed (disclaimer: I am its author), which implements a fully vectorized group-by:

    import numpy_indexed as npi
    npi.group_by(arr[:, 2]).mean(arr)
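
    If installing an extra package is not an option, a similar vectorized group-and-average can be sketched with plain NumPy. This is only an illustration of the idea, not how numpy_indexed works internally; the names keys, inverse, sums and means are made up for the example, and the sample data is taken from the other answers in this thread.

```python
import numpy as np

arr = np.array([[6.0, 12.0, 1.0],
                [7.0,  9.0, 1.0],
                [8.0,  7.0, 1.0],
                [4.0,  3.0, 2.0],
                [6.0,  1.0, 2.0],
                [2.0,  5.0, 2.0],
                [9.0,  4.0, 3.0],
                [2.0,  1.0, 4.0],
                [8.0,  4.0, 4.0],
                [3.0,  5.0, 4.0]])

# Map every row to the index of its group (third column is the group ID).
keys, inverse = np.unique(arr[:, 2], return_inverse=True)

# Number of rows in each group.
counts = np.bincount(inverse)

# Scatter-add each row into the accumulator row of its group.
sums = np.zeros((keys.size, arr.shape[1]))
np.add.at(sums, inverse, arr)

# Column-wise mean per group; the ID column averages to the ID itself.
means = sums / counts[:, None]
print(means)
```

    Because the grouping goes through np.unique, the rows do not have to be sorted by group ID.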
    
  • 2021-02-20 08:15
    import numpy as np

    arr = np.array(
    [[  6.0,   12.0,   1.0],
     [  7.0,   9.0,   1.0],
     [  8.0,   7.0,   1.0],
     [  4.0,   3.0,   2.0],
     [  6.0,   1.0,   2.0],
     [  2.0,   5.0,   2.0],
     [  9.0,   4.0,   3.0],
     [  2.0,   1.0,   4.0],
     [  8.0,   4.0,   4.0],
     [  3.0,   5.0,   4.0]])
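    # Split wherever the group column changes, then average each chunk column-wise.
    # Rows belonging to the same group must be adjacent for this to work.
    np.array([a.mean(0) for a in np.split(arr, np.flatnonzero(np.diff(arr[:, 2])) + 1)])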
    
  • 2021-02-20 08:16

    You can do:

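    import numpy as np

    results = []
    # For each distinct group ID, average columns 0 and 1 over the matching rows.
    for x in sorted(np.unique(arr[...,2])):
        results.append([np.average(arr[np.where(arr[...,2]==x)][...,0]), 
                        np.average(arr[np.where(arr[...,2]==x)][...,1]),
                        x])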
    

    Testing:

    >>> arr
    array([[  6.,  12.,   1.],
           [  7.,   9.,   1.],
           [  8.,   7.,   1.],
           [  4.,   3.,   2.],
           [  6.,   1.,   2.],
           [  2.,   5.,   2.],
           [  9.,   4.,   3.],
           [  2.,   1.,   4.],
           [  8.,   4.,   4.],
           [  3.,   5.,   4.]])
    >>> results=[]
    >>> for x in sorted(np.unique(arr[...,2])):
    ...     results.append([np.average(arr[np.where(arr[...,2]==x)][...,0]), 
    ...                     np.average(arr[np.where(arr[...,2]==x)][...,1]),
    ...                     x])
    ... 
    >>> results
    [[7.0, 9.3333333333333339, 1.0], [4.0, 3.0, 2.0], [9.0, 4.0, 3.0], [4.333333333333333, 3.3333333333333335, 4.0]]
    

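    The array arr does not need to be sorted. Note that np.unique already returns its values in sorted order, so the sorted() call is redundant, and that indexing with np.where returns a copy of the selected rows rather than a view, so each group is copied once per averaged column.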
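
    The same loop can be sketched more compactly with a boolean mask and a single mean over axis 0. This is a minimal variant under the same assumptions (group ID in the last column); the shuffled input and the names mask, subset and results are only there for illustration. The np.shares_memory check shows that the selected rows are a copy of the data.

```python
import numpy as np

# Same data as above, deliberately shuffled to show that order does not matter.
arr = np.array([[2.0,  1.0, 4.0],
                [6.0, 12.0, 1.0],
                [9.0,  4.0, 3.0],
                [4.0,  3.0, 2.0],
                [7.0,  9.0, 1.0],
                [8.0,  4.0, 4.0],
                [6.0,  1.0, 2.0],
                [8.0,  7.0, 1.0],
                [3.0,  5.0, 4.0],
                [2.0,  5.0, 2.0]])

results = []
for x in np.unique(arr[:, 2]):       # np.unique returns sorted values
    mask = arr[:, 2] == x            # boolean mask selecting one group
    subset = arr[mask]               # boolean indexing makes a copy
    results.append(subset.mean(axis=0))

results = np.array(results)
print(results)
print(np.shares_memory(arr, subset))
```

    This still does one pass over the array per group, so it is best suited to a small number of groups.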

  • 2021-02-20 08:23

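    A solution using itertools.groupby on a plain list of lists. It assumes that rows with the same group ID are adjacent, as they are in the example: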

    from itertools import groupby
    from operator import itemgetter
    
    arr = [[6.0, 12.0, 1.0],
           [7.0, 9.0, 1.0],
           [8.0, 7.0, 1.0],
           [4.0, 3.0, 2.0],
           [6.0, 1.0, 2.0],
           [2.0, 5.0, 2.0],
           [9.0, 4.0, 3.0],
           [2.0, 1.0, 4.0],
           [8.0, 4.0, 4.0],
           [3.0, 5.0, 4.0]]
    
    result = []
    
    for groupByID, rows in groupby(arr, key=itemgetter(2)):
        position1, position2, counter = 0, 0, 0
        for row in rows:
            position1+=row[0]
            position2+=row[1]
            counter+=1
        result.append([position1/counter, position2/counter, groupByID])
    
    print(result)
    

    would output:

    [[7.0, 9.333333333333334, 1.0], [4.0, 3.0, 2.0], [9.0, 4.0, 3.0], [4.333333333333333, 3.3333333333333335, 4.0]]
    