Fastest way to calculate difference in all columns

我在风中等你 2020-12-10 09:38

I have a dataframe of all float columns. For example:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12.0).reshape(3,4), columns=list('ABCD'))


        
4 Answers
  •  粉色の甜心
    2020-12-10 10:00

    import itertools
    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.arange(12.0).reshape(3,4), columns=list('ABCD'))
    df_cols = df.columns.tolist()
    # build an index array of all the column pairs to subtract:
    # idx[0] holds the left column positions, idx[1] the right ones
    idx = np.asarray(list(itertools.combinations(range(len(df_cols)),2))).T
    # build a new DataFrame from the pairwise differences, naming each column after its pair
    df_new = pd.DataFrame(data=df.values[:,idx[0]]-df.values[:,idx[1]],
                          columns=[''.join(e) for e in (itertools.combinations(df_cols,2))])
    
    df_new
    Out[43]: 
        AB   AC   AD   BC   BD   CD
    0 -1.0 -2.0 -3.0 -1.0 -2.0 -1.0
    1 -1.0 -2.0 -3.0 -1.0 -2.0 -1.0
    2 -1.0 -2.0 -3.0 -1.0 -2.0 -1.0
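
    If you need this for more than one frame, the same idea can be wrapped in a function. This is only a sketch, not a benchmarked "fastest" solution: the function name pairwise_column_differences is made up for illustration, and it swaps itertools.combinations for np.triu_indices, which yields the same (left, right) column positions in the same order. It also assumes the column labels are strings (or at least format sensibly when joined).

```python
import numpy as np
import pandas as pd


def pairwise_column_differences(df):
    """Return a DataFrame with one column per unordered pair of input columns.

    Hypothetical helper: the name and signature are illustrative only.
    """
    cols = df.columns.tolist()
    # Upper-triangle indices (k=1 skips the diagonal) enumerate every pair
    # i < j in the same order as itertools.combinations(range(n), 2).
    left, right = np.triu_indices(len(cols), k=1)
    values = df.to_numpy()
    # Integer-array indexing selects all left and right operands at once,
    # so the subtraction runs as a single vectorized NumPy operation.
    diffs = values[:, left] - values[:, right]
    names = [f"{cols[i]}{cols[j]}" for i, j in zip(left, right)]
    return pd.DataFrame(diffs, columns=names, index=df.index)


df = pd.DataFrame(np.arange(12.0).reshape(3, 4), columns=list("ABCD"))
df_new = pairwise_column_differences(df)
```

    Passing index=df.index keeps the original row labels, which the version above drops when the frame does not use the default index.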
    
