Subtract two Pandas DataFrames joined on multiple column values

长发绾君心 2021-01-22 15:45

I am trying to subtract the values of a column in DataFrame A from the values of a column in DataFrame B, but only for rows where multiple column values are equal to each other.

2 answers
  • This is why the Index is so useful: subtraction is aligned on the indices (both rows and columns).

    dfA = dfA.set_index(['Department', 'Speciality', 'TargetMonth'])
    dfB = dfB.set_index(['Department', 'Speciality', 'TargetMonth'])
    
    dfA.sub(dfB.rename(columns={'Required': 'Capacity'}), fill_value=0)
    
                                       Capacity
    Department Speciality TargetMonth          
    IT         Servers    2019-1             50
    Sales      Cars       2019-1             50
                          2019-2              0
               Furniture  2019-1             60
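
    For anyone who wants to run this end to end, here is a minimal sketch. The sample data is hypothetical (the question's original tables are not reproduced here) and was chosen only so that one key exists in dfA but not in dfB. The names `keys`, `a`, `b`, and `remaining` are illustrative.

```python
import pandas as pd

# Hypothetical sample data, not the asker's real tables.
dfA = pd.DataFrame({
    "Department": ["IT", "Sales", "Sales", "Sales"],
    "Speciality": ["Servers", "Cars", "Cars", "Furniture"],
    "TargetMonth": ["2019-1", "2019-1", "2019-2", "2019-1"],
    "Capacity": [100, 100, 50, 60],
})
dfB = pd.DataFrame({
    "Department": ["IT", "Sales", "Sales"],
    "Speciality": ["Servers", "Cars", "Cars"],
    "TargetMonth": ["2019-1", "2019-1", "2019-2"],
    "Required": [50, 50, 50],
})

keys = ["Department", "Speciality", "TargetMonth"]

# Move the join keys into the index so rows are matched by key, not by position.
a = dfA.set_index(keys)
# Rename the value column so the column labels align as well.
b = dfB.set_index(keys).rename(columns={"Required": "Capacity"})

# fill_value=0 treats a key missing from one side as 0 instead of producing NaN.
remaining = a.sub(b, fill_value=0)
print(remaining)
```

    In this sketch the (Sales, Furniture, 2019-1) key exists only in dfA, so its Capacity passes through unchanged because the missing Required value is treated as 0.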
    
  • 2021-01-22 16:06

    I would use merge with keys:

    For this solution, I take your DataFrame A as dfA and DataFrame B as dfB.

       df_result = pd.merge(dfA, dfB, how='inner', on=['Department', 'Speciality', 'TargetMonth'])
    

    This joins the DataFrames on the keys ['Department', 'Speciality', 'TargetMonth'] and returns a DataFrame containing only the rows whose keys appear in both DataFrames (how='inner').

    For example, if dfB contains a key that has no match in dfA, such as:

       {'Department': 'IT', 'Speciality': 'Furniture', 'TargetMonth': '2019-1'}
    

    then this row will not appear in df_result. More information can be found here: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
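
    To see which keys an inner merge leaves out, one option is an outer merge with `indicator=True`, which adds a `_merge` column recording where each row came from. This is only a sketch: the data is hypothetical, and the names `inner`, `outer`, and `unmatched` are illustrative.

```python
import pandas as pd

keys = ["Department", "Speciality", "TargetMonth"]

# Hypothetical data: dfB has an (IT, Furniture, 2019-1) row with no match in dfA,
# and dfA has a (Sales, Cars, 2019-1) row with no match in dfB.
dfA = pd.DataFrame({
    "Department": ["IT", "Sales"],
    "Speciality": ["Servers", "Cars"],
    "TargetMonth": ["2019-1", "2019-1"],
    "Capacity": [100, 100],
})
dfB = pd.DataFrame({
    "Department": ["IT", "IT"],
    "Speciality": ["Servers", "Furniture"],
    "TargetMonth": ["2019-1", "2019-1"],
    "Required": [50, 20],
})

# The inner merge keeps only keys present in both frames.
inner = pd.merge(dfA, dfB, how="inner", on=keys)

# The outer merge keeps every key; "_merge" is "both", "left_only", or "right_only".
outer = pd.merge(dfA, dfB, how="outer", on=keys, indicator=True)
unmatched = outer[outer["_merge"] != "both"]
print(unmatched[keys + ["_merge"]])
```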

    Then compute the difference with a vectorized column subtraction:

       df_result['Result'] = df_result['Capacity'] - df_result['Required']
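
    Put together, the merge approach can be sketched as follows. The sample data is hypothetical and stands in for the asker's tables; only the column names come from the answer.

```python
import pandas as pd

# Hypothetical sample data, not the asker's real tables.
dfA = pd.DataFrame({
    "Department": ["IT", "Sales", "Sales", "Sales"],
    "Speciality": ["Servers", "Cars", "Cars", "Furniture"],
    "TargetMonth": ["2019-1", "2019-1", "2019-2", "2019-1"],
    "Capacity": [100, 100, 50, 60],
})
dfB = pd.DataFrame({
    "Department": ["IT", "Sales", "Sales"],
    "Speciality": ["Servers", "Cars", "Cars"],
    "TargetMonth": ["2019-1", "2019-1", "2019-2"],
    "Required": [50, 50, 50],
})

keys = ["Department", "Speciality", "TargetMonth"]

# Keep only the rows whose key combination exists in both frames.
df_result = pd.merge(dfA, dfB, how="inner", on=keys)

# Element-wise subtraction over the matched rows.
df_result["Result"] = df_result["Capacity"] - df_result["Required"]
print(df_result)
```

    With this data the (Sales, Furniture, 2019-1) row is dropped, because it has no counterpart in dfB. That is the main difference from the index-based `sub(..., fill_value=0)` approach, which keeps such rows.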
    