Aggregate all dataframe row pair combinations using pandas

前端未结

关注

 2  814

你的背包 2020-12-19 04:16

I use python pandas to perform grouping and aggregation across data frames, but I would like to now perform specific pairwise aggregation of rows (n choose 2, statistical co

2条回答

梦毁少年i (楼主)

2020-12-19 04:56

I can't think of a clever vectorized way to do this, but unless performance is a real bottleneck I tend to use the simplest thing which makes sense. In this case, I might set_index("Gene") and then use loc to pick out the rows:

>>> df = df.set_index("Gene")
>>> cc = list(combinations(mygenes,2))
>>> out = pd.DataFrame([df.loc[c,:].sum() for c in cc], index=cc)
>>> out
              case1  case2  control1  control2
(ABC1, ABC2)      1      2         0         1
(ABC1, ABC3)      1      2         1         1
(ABC1, ABC4)      0      1         1         2
(ABC2, ABC3)      2      2         1         0
(ABC2, ABC4)      1      1         1         1
(ABC3, ABC4)      1      1         2         1

0 讨论(0)

查看其它2个回答