My main data is in a DataFrame, df, with one million rows and 30 columns. Now I want to add another column.
Consider the following dataframes, df and df2:
import pandas as pd

df = pd.DataFrame(dict(
    AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
    title=list('zyxwvutsrqponml')
))
df2 = pd.DataFrame(dict(
    AUTHOR_NAME=list('AABCCEGG'),
    title=list('zwvtrpml'),
    CATEGORY=list('11223344')
))
Option 1: merge
With no on argument, merge uses the columns the two frames share, here AUTHOR_NAME and title.
df.merge(df2, how='left')
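On a frame with a million rows, it can help to confirm that the merge did not duplicate rows and to see how many rows found a match. The following is a minimal sketch, not part of the original answer, assuming df2 should have at most one row per (AUTHOR_NAME, title) pair. It uses the validate and indicator parameters of merge; the names keys, checked, n_matched and result are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(dict(
    AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
    title=list('zyxwvutsrqponml'),
))
df2 = pd.DataFrame(dict(
    AUTHOR_NAME=list('AABCCEGG'),
    title=list('zwvtrpml'),
    CATEGORY=list('11223344'),
))

keys = ['AUTHOR_NAME', 'title']  # spelled out instead of relying on shared columns

# validate='many_to_one' raises MergeError if a key pair is duplicated in df2,
# which would otherwise silently add rows to the result.
# indicator=True adds a '_merge' column: 'both' for matched rows, 'left_only' otherwise.
checked = df.merge(df2, how='left', on=keys,
                   validate='many_to_one', indicator=True)

n_matched = int((checked['_merge'] == 'both').sum())

# Drop the helper column once the check is done.
result = checked.drop(columns='_merge')
```

If the assumption about df2 does not hold, the merge fails loudly instead of inflating the row count, which is usually easier to debug than a result that is quietly longer than df.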
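Option 2: join
join matches the on columns of df against the index of the other frame, so set the key columns as df2's index first.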
cols = ['AUTHOR_NAME', 'title']
df.join(df2.set_index(cols), on=cols)
Both options yield the same result: df with a new CATEGORY column, which is NaN for rows that have no match in df2.
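To check that the two options agree, the sketch below runs both on the sample frames and compares the results. It assumes a reasonably recent pandas; merged, joined and categories are illustrative names, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(dict(
    AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
    title=list('zyxwvutsrqponml'),
))
df2 = pd.DataFrame(dict(
    AUTHOR_NAME=list('AABCCEGG'),
    title=list('zwvtrpml'),
    CATEGORY=list('11223344'),
))

cols = ['AUTHOR_NAME', 'title']

# Option 1: left merge on the shared columns.
merged = df.merge(df2, how='left')

# Option 2: join against df2 indexed by the key columns.
joined = df.join(df2.set_index(cols), on=cols)

# Raises AssertionError if the two results differ in values, index, or columns.
pd.testing.assert_frame_equal(merged, joined)

# CATEGORY values for the rows of df that found a match in df2, in df's row order.
categories = merged['CATEGORY'].dropna().tolist()
print(merged)
```

Only exact (AUTHOR_NAME, title) pairs match: for example, df has a row (B, w) while df2 has (A, w), so that row keeps a missing CATEGORY.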