Applying a custom groupby aggregate function to find average of Numpy Array

面向向阳花 2020-12-12 03:46

I have a pandas DataFrame in which column B holds NumPy arrays (lists) of a fixed size.

|------|---------------|-------|
|  A   |       B       |   C   |
|------|--------         


        
2 Answers
  •  误落风尘
    2020-12-12 04:43

    Dummy data

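    import random
    import pandas as pd

    size, list_size = 10, 5
    # Each row: C is a group key in 95..100, B is a list of list_size random ints.
    data = [{'C': random.randint(95, 100),
             'B': [random.randint(0, 10) for i in range(list_size)]} for j in range(size)]
    df = pd.DataFrame(data)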
    

    Custom aggregation using NumPy

    import numpy as np

    unique_C = df.C.unique()
    data_calculated = []
    axis = 0  # aggregate element-wise across the rows of each group

    for c in unique_C:
        # Stack the lists of this group into a 2-D array of shape (n_rows, list_size).
        arr = np.reshape(np.hstack(df[df.C == c]['B']), (-1, list_size))
        mean, std = arr.mean(axis=axis), arr.std(axis=axis)  # other aggregations can also be added
        data_calculated.append(dict(C=c, B_mean=mean, B_std=std))
    new_df = pd.DataFrame(data_calculated)
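
Since the question asks about groupby specifically, here is a minimal sketch of the same element-wise mean and standard deviation driven by `df.groupby` rather than a manual loop over unique values. The three-row frame and the name `result` are illustrative assumptions, not data from the question, and the sketch assumes every array in B has the same length.

```python
import numpy as np
import pandas as pd

# Small hand-written frame (illustrative values, not from the question).
df = pd.DataFrame({
    "C": [95, 95, 96],
    "B": [np.array([1, 2, 3]), np.array([3, 4, 5]), np.array([10, 20, 30])],
})

rows = []
for key, group in df.groupby("C"):
    # Stack the equal-length arrays of this group into shape (n_rows, array_len).
    stacked = np.stack(group["B"].tolist())
    rows.append({
        "C": key,
        "B_mean": stacked.mean(axis=0),  # element-wise mean across the group's rows
        "B_std": stacked.std(axis=0),    # element-wise population std (ddof=0)
    })

result = pd.DataFrame(rows)
print(result)
```

Iterating over the groupby object keeps each group's arrays together, so `np.stack` can build the 2-D array directly without the `hstack` and `reshape` step. Note that `ndarray.std` uses `ddof=0` by default, unlike pandas' own `std`, which uses `ddof=1`.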
    
