How to transform a dataframe that contains a list in every row of each column

早过忘川 submitted on 2020-01-03 02:57:09

Question


I have the following dataframe, which is one of the outputs of a for loop.

import pandas as pd

df = pd.DataFrame()

df['Score'] = [['0-0','1-1','2-2'],['0-0','1-1','2-2']]
df['value'] = [[0.08,0.1,0.15],[0.07,0.12,0.06]]
df['Team'] = ['A','B']

I want to expand the lists so that each element of the list in each row becomes its own row in that column. The following is the expected output.

Can anyone help me transform it?

Thanks,

Zep


Answer 1:


You can try stacking the dataframe, applying pd.Series to each list, and then unstacking the original row index:

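import pandas as pd

df = pd.DataFrame()

df['Score'] = [['0-0','1-1','2-2'],['0-0','1-1','2-2']]
df['value'] = [[0.08,0.1,0.15],[0.07,0.12,0.06]]
df['Team'] = ['A','B']

# stack to one cell per row, expand each list into columns, forward-fill
# the scalar Team across them, then unstack the original row index
df.stack().apply(pd.Series).ffill(axis=1).unstack(level=0).T.reset_index(drop=True)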

Out:

    Score   value   Team
0   0-0     0.08    A
1   0-0     0.07    B
2   1-1     0.1     A
3   1-1     0.12    B
4   2-2     0.15    A
5   2-2     0.06    B
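
The one-liner above is dense, so here is a minimal sketch that splits it into named steps. The intermediate variable names (stacked, expanded, filled, wide, result) are only illustrative and are not part of the original answer; recent pandas versions may emit a FutureWarning for stack() but should still run it.

```python
import pandas as pd

df = pd.DataFrame({
    'Score': [['0-0', '1-1', '2-2'], ['0-0', '1-1', '2-2']],
    'value': [[0.08, 0.1, 0.15], [0.07, 0.12, 0.06]],
    'Team': ['A', 'B'],
})

# 1. One cell per row, indexed by (original row, column name).
stacked = df.stack()

# 2. Expand every cell into columns 0, 1, 2. A list fills all three;
#    the scalar Team value fills only column 0 and leaves NaN elsewhere.
expanded = stacked.apply(pd.Series)

# 3. Forward-fill along the columns so Team is repeated for each element.
filled = expanded.ffill(axis=1)

# 4. Move the original row index into the columns, then transpose so that
#    every (element position, original row) pair becomes one output row.
wide = filled.unstack(level=0)
result = wide.T.reset_index(drop=True)

print(result)
```

Because the rows are ordered by element position first, the teams alternate (A, B, A, B, ...) rather than being grouped per team.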



Answer 2:


Use np.concatenate:

import pandas as pd
import numpy as np

x = [['0-0','1-1','2-2'],['0-0','1-1','2-2']]
y = [[0.08,0.1,0.15],[0.07,0.12,0.06]]
z = ['A','B']
df = pd.DataFrame()

df['Score'] = np.concatenate(x)  # flatten the score lists
df['value'] = np.concatenate(y)  # flatten the value lists
# integer division: np.repeat rejects a float repeat count
df['Team'] = np.repeat(z, len(df) // len(z))
print(df)

Output:

  Score  value Team
0   0-0   0.08    A
1   1-1   0.10    A
2   2-2   0.15    A
3   0-0   0.07    B
4   1-1   0.12    B
5   2-2   0.06    B
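
Dividing the total length by the number of teams assumes every list has the same length. A minimal sketch of a variation, assuming you would rather derive the repeat count from each list: np.repeat also accepts one count per element. The counts name is illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

x = [['0-0', '1-1', '2-2'], ['0-0', '1-1', '2-2']]
y = [[0.08, 0.1, 0.15], [0.07, 0.12, 0.06]]
z = ['A', 'B']

# One repeat count per team, taken from the length of that team's list.
counts = [len(scores) for scores in x]

df = pd.DataFrame({
    'Score': np.concatenate(x),
    'value': np.concatenate(y),
    'Team': np.repeat(z, counts),
})
print(df)
```

With the data from the question this gives the same frame as above, and it should keep working if one team had more scores than another.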



Answer 3:


You first need to flatten the lists; you can use itertools.chain:

from itertools import chain

import pandas as pd

# chain(*lists) yields the elements of each inner list in order
score = list(chain(*[['0-0','1-1','2-2'],['0-0','1-1','2-2']]))
value = list(chain(*[[0.08,0.1,0.15],[0.07,0.12,0.06]]))

pd.DataFrame({'score':score, 'value':value})

  score  value
0   0-0   0.08
1   1-1   0.10
2   2-2   0.15
3   0-0   0.07
4   1-1   0.12
5   2-2   0.06



Answer 4:


You could use chain.from_iterable to flatten the input:

from itertools import chain

import pandas as pd

data = [['0-0','1-1','2-2'],['0-0','1-1','2-2']]
values = [[0.08,0.1,0.15],[0.07,0.12,0.06]]

df = pd.DataFrame(data=list(zip(chain.from_iterable(data), chain.from_iterable(values))), columns=['score', 'value'])
print(df)

Output

  score  value
0   0-0   0.08
1   1-1   0.10
2   2-2   0.15
3   0-0   0.07
4   1-1   0.12
5   2-2   0.06

As an alternative you could use np.ravel:

import numpy as np
import pandas as pd

data = [['0-0', '1-1', '2-2'], ['0-0', '1-1', '2-2']]
values = [[0.08, 0.1, 0.15], [0.07, 0.12, 0.06]]

df = pd.DataFrame({'score': np.array(data).ravel(), 'value': np.array(values).ravel()})
print(df)
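
The flattening approaches in these answers rebuild the frame from the raw lists, so the Team column has to be repeated by hand. One possible alternative that is not mentioned in the answers, sketched here on the assumption that pandas 1.3 or later is available (the first version where DataFrame.explode accepts a list of columns), starts from the dataframe in the question:

```python
import pandas as pd

df = pd.DataFrame({
    'Score': [['0-0', '1-1', '2-2'], ['0-0', '1-1', '2-2']],
    'value': [[0.08, 0.1, 0.15], [0.07, 0.12, 0.06]],
    'Team': ['A', 'B'],
})

# Explode both list columns together; the lists in each row must have
# matching lengths. Scalar columns such as Team are repeated automatically.
out = df.explode(['Score', 'value'], ignore_index=True)
print(out)
```

The exploded columns keep object dtype, so a follow-up such as out['value'].astype(float) may be needed before doing arithmetic on them.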


Source: https://stackoverflow.com/questions/54576302/how-to-transform-dataframe-that-contains-list-in-every-row-of-each-column
