Access a list within an element of a Pandas DataFrame

Submitted by 和自甴很熟 on 2020-07-18 14:24:18

Question


I have a Pandas DataFrame in which one column holds a list of integers in each row. I'd like to access the individual elements of these lists. I've found a way to do it by calling tolist() and turning the result back into a DataFrame, but I'm wondering whether there is a simpler or better way. In this example, I add column A to the middle element of the list in column B.

import pandas as pd
df = pd.DataFrame({'A': (1, 2, 3), 'B': ([0, 1, 2], [3, 4, 5], [6, 7, 8])})
df['C'] = df['A'] + pd.DataFrame(df['B'].tolist())[1]
df

Is there a better way to do this?
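For reference, here is a minimal sketch of what the tolist() round trip builds along the way. The name `expanded` and the `index=df.index` argument are additions for illustration; passing the index is a precaution so the addition still aligns if the frame does not use the default 0..n-1 index.

```python
import pandas as pd

df = pd.DataFrame({'A': (1, 2, 3), 'B': ([0, 1, 2], [3, 4, 5], [6, 7, 8])})

# Expand the lists in B into one column per list position.
# `expanded` is a name introduced here for illustration only.
expanded = pd.DataFrame(df['B'].tolist(), index=df.index)
print(expanded)

# Column 1 of the expanded frame holds the middle element of every list.
df['C'] = df['A'] + expanded[1]
print(df['C'].tolist())  # [2, 6, 10]
```

The intermediate frame has one row per original row and one column per list position, which is more work than necessary when only one position is needed.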


Answer 1:


A more straightforward option is to apply a function to column B directly:

df['C'] = df['A'] + df['B'].apply(lambda x:x[1])
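As a rough illustration of what this one-liner does, the sketch below pulls out the intermediate Series. The names `middle` and `middle_lc` are introduced here for illustration, and the list-comprehension variant is an alternative not mentioned in the thread, shown only for comparison.

```python
import pandas as pd

df = pd.DataFrame({'A': (1, 2, 3), 'B': ([0, 1, 2], [3, 4, 5], [6, 7, 8])})

# Series.apply calls the function once per cell, so x is each list in turn.
middle = df['B'].apply(lambda x: x[1])
print(middle.tolist())  # [1, 4, 7]

# The same values built with a plain list comprehension (an alternative,
# not from the thread); index=df.index keeps the rows aligned with df.
middle_lc = pd.Series([x[1] for x in df['B']], index=df.index)

df['C'] = df['A'] + middle
print(df['C'].tolist())  # [2, 6, 10]
```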



Answer 2:


One option is to use apply row by row, which should be faster than building a new DataFrame from the lists:

df['C'] = df['A'] + df.apply(lambda row: row['B'][1], axis = 1) 

A quick speed test:

%timeit df['C'] = df['A'] + pd.DataFrame(df['B'].tolist())[1]
# 1000 loops, best of 3: 567 µs per loop
%timeit df['C'] = df['A'] + df.apply(lambda row: row['B'][1], axis = 1) 
# 1000 loops, best of 3: 406 µs per loop
%timeit df['C'] = df['A'] + df['B'].apply(lambda x:x[1])
# 1000 loops, best of 3: 250 µs per loop

So the row-wise apply is slightly faster than the DataFrame round trip, but @breucopter's answer (apply on column B alone) is the fastest of the three.
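The timings above were taken on a three-row frame, so fixed per-call overhead probably accounts for much of each figure. A sketch like the following could be used to repeat the comparison on a larger frame outside IPython. The row count `n`, the `candidates` dictionary, and the repeat settings are arbitrary choices made for illustration, and the numbers will vary by machine and pandas version.

```python
import timeit

import pandas as pd

n = 2_000  # arbitrary row count, chosen only for illustration
df = pd.DataFrame({'A': range(n), 'B': [[i, i + 1, i + 2] for i in range(n)]})

# The three expressions timed above, wrapped so timeit can call them.
candidates = {
    'DataFrame(tolist())[1]': lambda: df['A'] + pd.DataFrame(df['B'].tolist())[1],
    'df.apply(axis=1)': lambda: df['A'] + df.apply(lambda row: row['B'][1], axis=1),
    "df['B'].apply": lambda: df['A'] + df['B'].apply(lambda x: x[1]),
}

results = {}
for name, fn in candidates.items():
    results[name] = fn().tolist()  # keep the values to confirm the methods agree
    best = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f'{name:>24}: {best * 1e3:.3f} ms per call')
```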




Answer 3:


You can also use the .str accessor, which indexes into each list in the column:

df['C'] = df['A'] + df['B'].str[1]

Performance of this method:

%timeit df['C'] = df['A'] + df['B'].str[1]
# 1000 loops, best of 3: 445 µs per loop
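One practical difference worth knowing: .str[1] is shorthand for .str.get(1), which yields a missing value when a list is too short, whereas lambda x: x[1] would raise an IndexError. The sketch below uses a made-up frame named `ragged` with lists of unequal length to show this.

```python
import pandas as pd

# Hypothetical data: the lists in B have different lengths.
ragged = pd.DataFrame({'A': (1, 2, 3), 'B': ([0, 1, 2], [3, 4], [6])})

# .str[1] is equivalent to .str.get(1); the last list has no position 1,
# so that row comes back as a missing value instead of raising.
second = ragged['B'].str[1]
print(second.tolist())

# The missing value propagates through the addition.
ragged['C'] = ragged['A'] + second
print(ragged)
```

Whether silently getting a missing value is preferable to an exception depends on the data; if every list is expected to have the same length, the exception from apply may be the more useful signal.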


Source: https://stackoverflow.com/questions/38088419/access-a-list-within-an-element-of-a-pandas-dataframe
