Pandas populate new dataframe column based on matching columns in another dataframe

花落未央 2020-12-25 14:15

I have a DataFrame, df, that holds my main data: one million rows and 30 columns. Now I want to add another column

5 Answers
  •  盖世英雄少女心
    2020-12-25 14:43

    Consider the following dataframes df and df2

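    import pandas as pd

    # Main data: (AUTHOR_NAME, title) pairs that have no category yet
    df = pd.DataFrame(dict(
        AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
        title=list('zyxwvutsrqponml'),
    ))

    # Lookup table: CATEGORY for a subset of the (AUTHOR_NAME, title) pairs
    df2 = pd.DataFrame(dict(
        AUTHOR_NAME=list('AABCCEGG'),
        title=list('zwvtrpml'),
        CATEGORY=list('11223344'),
    ))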
    

    Option 1: merge (with no keys given, a left merge on the columns both frames share, here AUTHOR_NAME and title)

    df.merge(df2, how='left')
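
    On a frame with a million rows it can help to name the key columns explicitly and let pandas check the match. The sketch below is one way to do that, not part of the original answer: it assumes each (AUTHOR_NAME, title) pair appears at most once in df2, and the variable name checked is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(dict(
    AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
    title=list('zyxwvutsrqponml'),
))
df2 = pd.DataFrame(dict(
    AUTHOR_NAME=list('AABCCEGG'),
    title=list('zwvtrpml'),
    CATEGORY=list('11223344'),
))

cols = ['AUTHOR_NAME', 'title']

# validate='many_to_one' raises MergeError if a key pair is duplicated in df2,
# which would otherwise silently add rows to the result.
# indicator=True adds a '_merge' column: 'both' for matched rows,
# 'left_only' for rows of df that found no CATEGORY.
checked = df.merge(df2, how='left', on=cols,
                   validate='many_to_one', indicator=True)

# A left merge with unique right-hand keys keeps the row count of df.
print(len(checked) == len(df))
print(checked['_merge'].value_counts())
```

    Once the match looks right, the helper column can be dropped with checked.drop(columns='_merge').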
    

    Option 2: join (index df2 by the key columns, then join df against that index)

    cols = ['AUTHOR_NAME', 'title']
    df.join(df2.set_index(cols), on=cols)
    

    both options yield
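
    The result can be reproduced with the self-contained sketch below, which runs both options on the sample frames and compares them. The names merged, joined and same are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(dict(
    AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
    title=list('zyxwvutsrqponml'),
))
df2 = pd.DataFrame(dict(
    AUTHOR_NAME=list('AABCCEGG'),
    title=list('zwvtrpml'),
    CATEGORY=list('11223344'),
))

cols = ['AUTHOR_NAME', 'title']

# Option 1: left merge on the shared key columns
merged = df.merge(df2, how='left', on=cols)

# Option 2: join against df2 indexed by the same key columns
joined = df.join(df2.set_index(cols), on=cols)

# Both keep all 15 rows of df in their original order.
# CATEGORY is filled for rows 0, 4, 6, 8, 10, 13 and 14; every other row is NaN.
same = merged.equals(joined)
print(merged)
print(same)
```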
