Create a pandas DataFrame from nested lists of unequal lengths

Posted by 眉间皱痕 on 2020-01-11 09:16:46

Question


I have several lists of different lengths:

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']

These are then combined into a nested list:

nest = [aa, bb, cc]

I want to create a dataframe as follows:

aa   bb   cc
aa1  bb1  cc1
aa2  bb2  cc2
aa3  bb3  cc3
aa4  bb4  nan
aa5  nan  nan

I've tried:

pd.DataFrame(nest, columns=['aa', 'bb', 'cc'])

But the result is that each list is written as a row (as opposed to a column).


Answer 1:


The zip_longest function from itertools does this:

>>> import itertools, pandas
>>> pandas.DataFrame((_ for _ in itertools.zip_longest(*nest)), columns=['aa', 'bb', 'cc'])
    aa    bb    cc
0  aa1   bb1   cc1
1  aa2   bb2   cc2
2  aa3   bb3   cc3
3  aa4   bb4  None
4  aa5  None  None

If you have an older version of pandas, you may need to wrap zip_longest in a list constructor. On Python 2, the function is called izip_longest instead of zip_longest.
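
To see why this works, it can help to look at what zip_longest(*nest) yields before pandas is involved. The following is a minimal sketch using only the standard library and the lists from the question. The fillvalue argument is optional, and using it here to pad with NaN instead of None is an assumption about what you might prefer, not part of the answer above.

```python
import itertools
import math

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
nest = [aa, bb, cc]

# zip_longest transposes the nested list: each tuple is one output row,
# and lists that have run out are padded with None by default.
rows = list(itertools.zip_longest(*nest))
print(rows[0])   # ('aa1', 'bb1', 'cc1')
print(rows[-1])  # ('aa5', None, None)

# fillvalue changes the padding, e.g. NaN instead of None.
rows_nan = list(itertools.zip_longest(*nest, fillvalue=math.nan))
```

Either rows or rows_nan can then be passed to pandas.DataFrame together with columns=['aa', 'bb', 'cc'].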




Answer 2:


Option 1

pd.DataFrame(nest, ['aa', 'bb', 'cc']).T

    aa    bb    cc
0  aa1   bb1   cc1
1  aa2   bb2   cc2
2  aa3   bb3   cc3
3  aa4   bb4  None
4  aa5  None  None
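
Option 1 is terse, so here is a rough sketch of the two steps it combines. The second positional argument of pd.DataFrame is the index, so the names first label the rows, and .T then turns them into columns. The variable names wide and tall are made up for illustration.

```python
import pandas as pd

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
nest = [aa, bb, cc]

# Step 1: each list becomes a row labelled 'aa', 'bb' or 'cc';
# shorter rows are padded with missing values on the right.
wide = pd.DataFrame(nest, index=['aa', 'bb', 'cc'])
print(wide.shape)  # (3, 5)

# Step 2: transposing swaps rows and columns, so the labels become column names.
tall = wide.T
print(tall.shape)  # (5, 3)
```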

Option 2
Homebrew zip_longest

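# f returns x[n] if that position exists, otherwise None
f = lambda x, n: x[n] if n < len(x) else None
# n is the length of the longest list, i.e. the number of rows
n = max(map(len, nest))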

pd.DataFrame(
    [[f(j, i) for j in nest] for i in range(n)],
    columns=['aa', 'bb', 'cc']
)

    aa    bb    cc
0  aa1   bb1   cc1
1  aa2   bb2   cc2
2  aa3   bb3   cc3
3  aa4   bb4  None
4  aa5  None  None



Answer 3:


Or maybe put each list in a single cell and expand it with apply(pd.Series):

pd.DataFrame(data={'value':nest},index=['aa', 'bb', 'cc']).value.apply(pd.Series).T
Out[1297]: 
    aa   bb   cc
0  aa1  bb1  cc1
1  aa2  bb2  cc2
2  aa3  bb3  cc3
3  aa4  bb4  NaN
4  aa5  NaN  NaN
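
A related sketch, not taken from the answer above: because pandas aligns Series on their index, building one Series per list and passing them as a dict should give the same NaN-padded result without apply or a transpose. The names variable is introduced here only for illustration.

```python
import pandas as pd

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
nest = [aa, bb, cc]
names = ['aa', 'bb', 'cc']  # illustrative: the column labels from the question

# Each list becomes a Series indexed 0..len-1. The DataFrame constructor
# aligns the Series on that index and fills missing positions with NaN.
df = pd.DataFrame({name: pd.Series(values) for name, values in zip(names, nest)})
print(df)
```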


Source: https://stackoverflow.com/questions/46431660/create-a-pandas-dataframe-from-a-nested-lists-of-unequal-lengths
