Numpy Mean Structured Array

Suppose that I have a structured array of students (strings) and test scores (ints), where each entry is the score that a specific student received on a specific test. Each

4 answers

春和景丽

2020-12-21 05:18

A slightly faster and simpler solution based on itertools, without using view(), is:

[(k, e['score'][list(g)].mean()) for k, g in groupby(np.argsort(e), e['student'].__getitem__)]

This is the same idea as ecatmur's, but it works on indices, using argsort() instead of sort().
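To make the one-liner above easier to try, here is a minimal self-contained sketch. The question's array is not shown in full, so the sample data and the field names `student` and `score` are assumptions, and the sketch sorts on the `student` field alone rather than on whole rows.

```python
from itertools import groupby

import numpy as np

# Hypothetical sample data; the question's actual array is not shown here.
e = np.array(
    [("Ann", 90), ("Bob", 70), ("Ann", 80), ("Cid", 60), ("Bob", 75), ("Cid", 65)],
    dtype=[("student", "U10"), ("score", "i4")],
)

# Indices that order the rows by student name.
order = np.argsort(e["student"], kind="stable")

# groupby yields runs of consecutive indices that share a student name;
# each run is used to pick that student's scores and average them.
means = [
    (str(name), float(e["score"][list(idx)].mean()))
    for name, idx in groupby(order, key=lambda i: e["student"][i])
]
print(means)
```

Because only indices are sorted, the original array is never copied or viewed as a recarray.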

后悔当初

2020-12-21 05:20
NumPy isn't designed to be able to group rows together and apply aggregate functions to those groups. You could:
- use itertools.groupby and reconstruct the array;
- use Pandas, which is based on NumPy and is great at grouping; or
- add another dimension to the array for the test id (so this case would be a 2x3 array, because it looks like there were two tests).
Here's the itertools solution, but as you can see it's quite complicated and inefficient. I'd recommend one of the other two methods.
```
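from itertools import groupby
from operator import itemgetter
# Sort by student, then average each group's scores (dtype=grades.dtype casts the mean to the score type).
np.array([(k, np.array(list(g), dtype=grades.dtype).view(np.recarray)['score'].mean())
          for k, g in groupby(np.sort(grades, order='student').view(np.recarray),
                              itemgetter('student'))], dtype=grades.dtype)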
```
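The third option in the list above (an extra dimension for the test id) could look roughly like the sketch below. It assumes every student took every test, and the sample data, `n_tests`, and the field names are hypothetical since the question's array is not shown in full.

```python
import numpy as np

# Hypothetical sample data: three students, two tests each.
grades = np.array(
    [("Ann", 90), ("Bob", 70), ("Ann", 80), ("Cid", 60), ("Bob", 75), ("Cid", 65)],
    dtype=[("student", "U10"), ("score", "i4")],
)
n_tests = 2  # assumption: each student appears exactly n_tests times

# Sorting by student makes each student's rows contiguous.
sorted_grades = np.sort(grades, order="student")
students = sorted_grades["student"][::n_tests]

# Shape (n_tests, n_students): one row per test, one column per student.
scores_2d = sorted_grades["score"].reshape(-1, n_tests).T

# Averaging over the test axis gives one mean per student.
per_student = scores_2d.mean(axis=0)
print(dict(zip(students.tolist(), per_student.tolist())))
```

This only works when the data is rectangular; if students took different numbers of tests, one of the grouping approaches is the safer choice.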

名媛妹妹

2020-12-21 05:21

collapseByField(grades, 'student') gives what you want, given this definition:

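def collapseByField(e, collapsefield, keepFields=None, agg=None):
    import numpy as np
    import matplotlib.mlab as mlab  # rec_groupby is only available in older matplotlib releases
    assert isinstance(e, np.ndarray)  # structured array
    if agg is None:
        agg = np.mean
    if keepFields is None:
        # keep every field except the one being grouped on
        keepFields = [n for n in e.dtype.names if n != collapsefield]
    # one (input field, aggregate function, output field) triple per kept field
    newf = [(n, agg, n) for n in keepFields]
    return mlab.rec_groupby(e, [collapsefield], newf)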
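If mlab.rec_groupby is not available in your matplotlib version (as far as I know it was removed in later releases), the same grouping can be sketched with NumPy alone. This is a substitute technique using np.unique and np.bincount, not what rec_groupby does internally, and the sample data and field names are assumptions.

```python
import numpy as np

# Hypothetical sample data; the question's actual array is not shown here.
grades = np.array(
    [("Ann", 90), ("Bob", 70), ("Ann", 80), ("Cid", 60), ("Bob", 75), ("Cid", 65)],
    dtype=[("student", "U10"), ("score", "i4")],
)

# names: sorted unique students; inverse: for each row, the index of its student in names.
names, inverse = np.unique(grades["student"], return_inverse=True)

# Sum of scores and number of rows per student, then the mean.
sums = np.bincount(inverse, weights=grades["score"])
counts = np.bincount(inverse)
means = sums / counts

# Pack the result back into a structured array with a float score field.
result = np.array(
    list(zip(names, means)),
    dtype=[("student", names.dtype), ("score", "f8")],
)
print(result)
```

The score field is stored as float here so that means such as 72.5 are not truncated to integers.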

梦如初夏

2020-12-21 05:27

matplotlib.mlab.rec_groupby was exactly what I was looking for.
