I have two datasets (in CSV format) of different sizes, as follows:
df_old:
index  category  text
0      spam      you win much money
1      spam      y