Question
I am trying to concatenate the values of all my columns into a new column. The concatenated values should be stored as a list, with missing values left out.
My dataframe:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': ['1', '2', np.nan],
                   'B': [np.nan, '5', np.nan],
                   'C': ['7', np.nan, '9']})
Desired output:
df:
A B C concat_col
1 nan 7 [1,7]
2 5 nan [2,5]
nan nan 9 [9]
What I tried:
df['concat_col'] = pd.Series(df.fillna('').values.tolist()).str.join(',')
Output I got:
A B C concat_col
1 nan 7 1,,7
2 5 nan 2,5,
nan nan 9 ,,9
Answer 1:
The following code should work:
df['concat_col']=df.apply(lambda row: row.dropna().tolist(), axis=1)
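For reference, here is a self-contained sketch of this approach applied to the frame from the question. It assumes the missing entries are np.nan and the remaining values are strings, as in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', np.nan],
                   'B': [np.nan, '5', np.nan],
                   'C': ['7', np.nan, '9']})

# For each row, drop the missing values and collect what is left into a list.
df['concat_col'] = df.apply(lambda row: row.dropna().tolist(), axis=1)

print(df['concat_col'].tolist())  # [['1', '7'], ['2', '5'], ['9']]
```

Because the lambda is evaluated before the new column is assigned, only A, B and C contribute to each list.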
Answer 2:
You can use a list comprehension, taking advantage of the fact that np.nan != np.nan, so a NaN value fails the test i == i:
df['D'] = [[i for i in row if i == i] for row in df.values]
print(df)
A B C D
0 1 NaN 7 [1, 7]
1 2 5 NaN [2, 5]
2 NaN NaN 9 [9]
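The self-comparison trick can be checked in isolation. The sketch below also shows one caveat that is not part of the original answer: i == i filters out NaN but keeps None, so if a frame might contain None, pd.notna is a safer test. The sample row is made up for illustration.

```python
import numpy as np
import pandas as pd

# NaN is the only value here that is not equal to itself.
print(np.nan == np.nan)  # False
print(np.nan != np.nan)  # True

# Hypothetical row mixing strings, NaN and None.
row = ['1', np.nan, None, '7']

# i == i removes NaN, but None == None is True, so None survives.
kept_by_self_compare = [i for i in row if i == i]
print(kept_by_self_compare)  # ['1', None, '7']

# pd.notna treats both NaN and None as missing.
kept_by_notna = [i for i in row if pd.notna(i)]
print(kept_by_notna)  # ['1', '7']
```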
Counter-intuitively, this is much faster than the row-wise df.apply approach:
df = pd.concat([df]*10000, ignore_index=True)
%timeit df.apply(lambda row: row.dropna().tolist(), axis=1) # 8.25 s
%timeit [[i for i in row if i == i] for row in df.values] # 55.6 ms
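The %timeit lines above only work inside IPython or Jupyter. A rough equivalent for a plain Python script, using the standard timeit module, might look like the sketch below. The names big, with_apply and with_comprehension are made up for this example, the frame is repeated fewer times to keep the run short, and the timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', np.nan],
                   'B': [np.nan, '5', np.nan],
                   'C': ['7', np.nan, '9']})

# Repeat the three sample rows to get a frame large enough to time.
big = pd.concat([df] * 1000, ignore_index=True)

def with_apply():
    # Row-wise apply: builds a Series per row, then drops the NaN entries.
    return big.apply(lambda row: row.dropna().tolist(), axis=1).tolist()

def with_comprehension():
    # Plain Python loop over the underlying array, filtering NaN via i == i.
    return [[i for i in row if i == i] for row in big.values]

# Both approaches should produce the same lists.
assert with_apply() == with_comprehension()

print('apply:        ', timeit.timeit(with_apply, number=1), 's')
print('comprehension:', timeit.timeit(with_comprehension, number=1), 's')
```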
Source: https://stackoverflow.com/questions/52507272/pandas-concatenate-values-of-all-column-into-a-new-column-list