I am trying to subtract the values of a column in DataFrame A from the values of a column in DataFrame B, but only for rows where several key columns are equal to each other.
This is why the Index is so useful: subtraction will be aligned on the indices (both rows and columns).
dfA = dfA.set_index(['Department', 'Speciality', 'TargetMonth'])
dfB = dfB.set_index(['Department', 'Speciality', 'TargetMonth'])
dfA.sub(dfB.rename(columns={'Required': 'Capacity'}), fill_value=0)
                                   Capacity
Department Speciality TargetMonth
IT         Servers    2019-1             50
Sales      Cars       2019-1             50
                      2019-2              0
           Furniture  2019-1             60
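To make the alignment behaviour concrete, here is a minimal self-contained sketch. The sample frames are hypothetical (the original data is not shown in this answer), and the extra IT/Furniture row in dfB is added only to show what fill_value=0 does when a key exists in one frame but not the other.

```python
import pandas as pd

keys = ['Department', 'Speciality', 'TargetMonth']

# Hypothetical sample data, chosen only for illustration.
dfA = pd.DataFrame({
    'Department': ['IT', 'Sales', 'Sales', 'Sales'],
    'Speciality': ['Servers', 'Cars', 'Cars', 'Furniture'],
    'TargetMonth': ['2019-1', '2019-1', '2019-2', '2019-1'],
    'Capacity': [100, 100, 50, 110],
})
dfB = pd.DataFrame({
    'Department': ['IT', 'Sales', 'Sales', 'Sales', 'IT'],
    'Speciality': ['Servers', 'Cars', 'Cars', 'Furniture', 'Furniture'],
    'TargetMonth': ['2019-1', '2019-1', '2019-2', '2019-1', '2019-1'],
    'Required': [50, 50, 50, 50, 20],
})

# Index both frames by the key columns so that rows align on them.
left = dfA.set_index(keys)
# Rename so the column labels align as well as the row labels.
right = dfB.set_index(keys).rename(columns={'Required': 'Capacity'})

# Keys missing from one side are treated as 0 instead of producing NaN.
result = left.sub(right, fill_value=0)
print(result)
```

In this sketch the IT/Furniture key exists only in dfB, so its row in the result is 0 - 20 = -20. Without fill_value=0 that row would be NaN instead.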
I would use merge with keys:
For this solution, I am taking your DataFrame A as dfA and DataFrame B as dfB.
df_result = pd.merge(dfA, dfB, how='inner', on=['Department','Speciality','TargetMonth'])
This will join the dataframes on the keys ['Department','Speciality','TargetMonth'] and will result in a dataframe that contains only the keys that appear in both dataframes (how='inner').
For example, if there is a key in dfB that has no match in dfA, such as:
{'Department': 'IT', 'Speciality': 'Furniture', 'TargetMonth': '2019-1'}
then that row will not appear in df_result. More information can be found here: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
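If you want to see which keys an inner merge would drop, one option is an outer merge with indicator=True, which adds a _merge column marking each row as 'left_only', 'right_only' or 'both'. This is only a sketch on hypothetical sample data, not part of the solution itself.

```python
import pandas as pd

keys = ['Department', 'Speciality', 'TargetMonth']

# Hypothetical sample data; dfB has one key that dfA lacks.
dfA = pd.DataFrame({
    'Department': ['IT', 'Sales'],
    'Speciality': ['Servers', 'Cars'],
    'TargetMonth': ['2019-1', '2019-1'],
    'Capacity': [100, 100],
})
dfB = pd.DataFrame({
    'Department': ['IT', 'Sales', 'IT'],
    'Speciality': ['Servers', 'Cars', 'Furniture'],
    'TargetMonth': ['2019-1', '2019-1', '2019-1'],
    'Required': [50, 50, 20],
})

# An outer merge keeps every key; _merge records where each one came from.
check = pd.merge(dfA, dfB, how='outer', on=keys, indicator=True)

# Rows that exist only in dfB are the ones how='inner' would discard.
right_only = check[check['_merge'] == 'right_only']
print(right_only[keys])
```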
Then compute the difference with a vectorized pandas column operation:
df_result['Result'] = df_result['Capacity'] - df_result['Required']
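Putting the merge and the subtraction together, a minimal end-to-end sketch might look like the following. The sample frames are hypothetical, since the original data is not shown here, and the IT/Furniture row in dfB is included only to show that how='inner' leaves it out.

```python
import pandas as pd

keys = ['Department', 'Speciality', 'TargetMonth']

# Hypothetical sample data, chosen only for illustration.
dfA = pd.DataFrame({
    'Department': ['IT', 'Sales', 'Sales', 'Sales'],
    'Speciality': ['Servers', 'Cars', 'Cars', 'Furniture'],
    'TargetMonth': ['2019-1', '2019-1', '2019-2', '2019-1'],
    'Capacity': [100, 100, 50, 110],
})
dfB = pd.DataFrame({
    'Department': ['IT', 'Sales', 'Sales', 'Sales', 'IT'],
    'Speciality': ['Servers', 'Cars', 'Cars', 'Furniture', 'Furniture'],
    'TargetMonth': ['2019-1', '2019-1', '2019-2', '2019-1', '2019-1'],
    'Required': [50, 50, 50, 50, 20],
})

# Keep only the keys present in both frames.
df_result = pd.merge(dfA, dfB, how='inner', on=keys)

# Element-wise subtraction over whole columns, no Python loop needed.
df_result['Result'] = df_result['Capacity'] - df_result['Required']
print(df_result)
```

Unlike the index-aligned sub with fill_value=0, this version silently drops keys that appear in only one frame, so pick whichever behaviour matches what you need.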