I have two DataFrames and I want to perform the same list of cleaning ops.
I realized I can merge them into one and do everything in one pass, but I am still curious.
You are modifying copies of the dataframes rather than the original dataframes.
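To illustrate the point, here is a minimal sketch of what probably happens in a loop. The data below is a hypothetical stand-in for test_1 and test_2 (in particular, the first row of test_2 is invented), and the loop is an assumption about what the original attempt looked like.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's DataFrames.
test_1 = pd.DataFrame({"A": [1, 8, 5, 6, 0], "B": [15, 49, 34, 44, 63]})
test_2 = pd.DataFrame({"A": [np.nan, 3, 6, 4, 9, 0],
                       "B": [50, 100, 200, 300, 400, 500]})

for df in [test_1, test_2]:
    # This rebinds the loop variable `df` to a new, filtered object.
    # The names test_1 and test_2 still point at the unfiltered frames.
    df = df[df["A"].notnull()]

# The row with the missing value is still present in test_2.
rows_after_loop = len(test_2)
nulls_after_loop = int(test_2["A"].isnull().sum())
```

After the loop, test_2 still has all of its rows, because the filtered result was only ever assigned to the temporary name df.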
One way to deal with this issue is to keep the DataFrames in a dictionary. As a convenience, you can use pd.DataFrame.pipe together with a dictionary comprehension to rebuild the dictionary with the cleaned DataFrames.
def remove_nulls(df):
    # Keep only the rows where column A is not null.
    return df[df['A'].notnull()]
dfs = dict(enumerate([test_1, test_2]))
dfs = {k: v.pipe(remove_nulls) for k, v in dfs.items()}
print(dfs)
# {0:    A   B
#  0  1  15
#  1  8  49
#  2  5  34
#  3  6  44
#  4  0  63,
#  1:      A    B
#  1  3.0  100
#  2  6.0  200
#  3  4.0  300
#  4  9.0  400
#  5  0.0  500}
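Since the question mentions a whole list of cleaning operations, the same pattern can be extended to several steps. The following is only a sketch: reset_row_index, cleaning_ops, and clean are hypothetical names introduced for illustration, and the sample data (including the first row of test_2) is invented.

```python
import numpy as np
import pandas as pd

def remove_nulls(df):
    # Keep only the rows where column A is not null.
    return df[df["A"].notnull()]

def reset_row_index(df):
    # Hypothetical second cleaning step: renumber the rows from 0.
    return df.reset_index(drop=True)

# Hypothetical list of cleaning steps, applied in order.
cleaning_ops = [remove_nulls, reset_row_index]

def clean(df, ops=cleaning_ops):
    # Feed the frame through each step; every step returns a new frame.
    for op in ops:
        df = df.pipe(op)
    return df

# Hypothetical stand-ins for the question's DataFrames.
test_1 = pd.DataFrame({"A": [1, 8, 5, 6, 0], "B": [15, 49, 34, 44, 63]})
test_2 = pd.DataFrame({"A": [np.nan, 3, 6, 4, 9, 0],
                       "B": [50, 100, 200, 300, 400, 500]})

dfs = dict(enumerate([test_1, test_2]))
dfs = {k: clean(v) for k, v in dfs.items()}
```

Keeping the steps in a list means a new operation can be added in one place and is applied to every DataFrame in the dictionary.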
Note: In your result dfs[1]['A'] remains float: this is because np.nan is considered float and we have not triggered a conversion to int.
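If integer values are wanted, the conversion can be triggered explicitly once the nulls are gone. This is a minimal sketch on a hypothetical frame, assuming every remaining value in A is a whole number; the explicit "int64" target is a choice made here, not something from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value, so column A is stored as float64.
df = pd.DataFrame({"A": [np.nan, 3, 6, 4, 9, 0],
                   "B": [50, 100, 200, 300, 400, 500]})

# Filter first, and copy so the assignment below acts on an independent frame.
cleaned = df[df["A"].notnull()].copy()
dtype_before = str(cleaned["A"].dtype)

# With no NaN left, the column can be converted to an integer dtype.
cleaned["A"] = cleaned["A"].astype("int64")
dtype_after = str(cleaned["A"].dtype)
```

Calling astype on a column that still contains NaN would raise an error, which is why the conversion comes after the filter.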