TypeError: string indices must be integers using pandas apply with lambda

旧巷少年郎 2021-01-13 23:55

I have a dataframe in which one column is a URL and the other is a name. I'm simply trying to add a third column that takes the URL and creates an HTML link.

The column

2 answers
  •  [愿得一人]
    2021-01-14 00:36

    pd.Series.apply has access only to a single series, i.e. the series on which you are calling the method. In other words, the function you supply, irrespective of whether it is named or an anonymous lambda, will only have access to df['source'].
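
To see where the error comes from, here is a minimal sketch that reproduces it. The failing lambda is not shown in the question, so the call below is an assumed reconstruction; the column names `source` and `url` follow the rest of this answer.

```python
import pandas as pd

df = pd.DataFrame([['BBC', 'http://www.bbc.o.uk']],
                  columns=['source', 'url'])

# Assumed reconstruction of the failing call: Series.apply hands the
# function one cell value at a time, so x is the string 'BBC' here,
# not a row, and x['url'] tries to index a string with a string.
try:
    df['source'].apply(lambda x: x['url'])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the message varies slightly between Python versions, but it begins with "string indices must be integers".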

    To access multiple series by row, you need pd.DataFrame.apply along axis=1:

    def return_link(x):
        # x is one row of df (a Series), so both columns are available
        return '<a href="{0}">{1}</a>'.format(x['url'], x['source'])
    
    df['sourceURL'] = df.apply(return_link, axis=1)
    

    Note there is an overhead associated with passing an entire series in this way; pd.DataFrame.apply is just a thinly veiled, inefficient loop.

    You may find a list comprehension more efficient:

    df['sourceURL'] = ['<a href="{0}">{1}</a>'.format(i, j)
                       for i, j in zip(df['url'], df['source'])]
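
As a rough check, the sketch below builds the link column three ways and confirms that they agree. The `<a href="...">` template is assumed from the question's goal of creating an HTML link, the helper name `make_link` is hypothetical, and the vectorized string concatenation is an alternative that is not part of the original answer. No timings are shown, since they depend on the data size and the pandas version.

```python
import pandas as pd

# The first row comes from the demo; the second is made up for illustration.
df = pd.DataFrame([['BBC', 'http://www.bbc.o.uk'],
                   ['Example', 'http://example.com']],
                  columns=['source', 'url'])

def make_link(url, text):
    # Hypothetical helper: wrap text in an anchor tag pointing at url.
    return '<a href="{0}">{1}</a>'.format(url, text)

# 1. Row-wise DataFrame.apply: each row is passed to the lambda as a Series.
via_apply = df.apply(lambda row: make_link(row['url'], row['source']), axis=1)

# 2. List comprehension over the two columns zipped together.
via_listcomp = [make_link(u, s) for u, s in zip(df['url'], df['source'])]

# 3. Vectorized string concatenation (not in the original answer).
via_concat = '<a href="' + df['url'] + '">' + df['source'] + '</a>'

assert via_apply.tolist() == via_listcomp == via_concat.tolist()
print(via_listcomp[0])  # <a href="http://www.bbc.o.uk">BBC</a>
```

To compare speed on your own data, wrapping each variant in `timeit.timeit` is one option.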
    

    Here's a working demo:

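    import pandas as pd
    
    df = pd.DataFrame([['BBC', 'http://www.bbc.o.uk']],
                      columns=['source', 'url'])
    
    def return_link(x):
        # x is one row of df (a Series), so both columns are available
        return '<a href="{0}">{1}</a>'.format(x['url'], x['source'])
    
    df['sourceURL'] = df.apply(return_link, axis=1)
    
    print(df)
    
      source                  url                              sourceURL
    0    BBC  http://www.bbc.o.uk  <a href="http://www.bbc.o.uk">BBC</a>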
    
