ValueError: too many values to unpack when using itertuples() on pandas dataframe

大城市里の小女人 提交于 2019-12-10 18:48:54

问题


I am trying to convert a simple pandas dataframe into a nested JSON file based on the answer I found here: pandas groupby to nested json

My grouped dataframe looks like this:

                  firstname lastname  orgname         phone        mobile  email
teamname members                                                           
1        0            John      Doe     Anon  916-555-1234          none   john.doe@wildlife.net 
         1            Jane      Doe     Anon  916-555-4321  916-555-7890   jane.doe@wildlife.net
2        0          Mickey    Moose  Moosers  916-555-0000  916-555-1111   mickey.moose@wildlife.net
         1           Minny    Moose  Moosers  916-555-2222          none   minny.moose@wildlife.net  

My code is:

data = pandas.read_excel(inputExcel, sheetname = 'Sheet1', encoding = 'utf8')
grouped = data.groupby(['teamname', 'members']).first()

results = defaultdict(lambda: defaultdict(dict))

for index, value in grouped.itertuples():
    for i, key in enumerate(index):
        if i ==0:
            nested = results[key]
        elif i == len(index) -1:
            nested[key] = value
        else:
            nested = nested[key]

print json.dumps(results, indent = 4)

I get the following error on the first "for" loop. What causes this error in this circumstance and what would it take to fix it to output the nested json?

    for index, value in grouped.itertuples():
ValueError: too many values to unpack

回答1:


When using itertuples(), the index is included as part of the tuple, so the for index, value in grouped.itertuples(): doesn't really make sense. In fact, itertuples() uses namedtuple with Index being one of the names.

Consider the following setup:

data = {'A': list('aabbc'), 'B': [0, 1, 0, 1, 0], 'C': list('vwxyz'), 'D': range(5,10)}
df = pd.DataFrame(data).set_index(['A', 'B'])

Yielding the following DataFrame:

     C  D
A B      
a 0  v  5
  1  w  6
b 0  x  7
  1  y  8
c 0  z  9

Then printing each tuple in df.itertuples() yields:

Pandas(Index=('a', 0), C='v', D=5)
Pandas(Index=('a', 1), C='w', D=6)
Pandas(Index=('b', 0), C='x', D=7)
Pandas(Index=('b', 1), C='y', D=8)
Pandas(Index=('c', 0), C='z', D=9)

So, what you'll probably want to do is something like the code below, with value being replaced by t[1:]:

for t in grouped.itertuples():
    for i, key in enumerate(t.Index):
        ...

If you want to access components of the namedtuple, you can access things positionally, or by name. So, in the case of your DataFrame, t[1] and t.firstname should be equivalent. Just remember that t[0] is the index, so your first column starts at 1.




回答2:


As I understand itertuples, it will return a tuple with the first value being the index and the remaining values being all of the columns. You only have for index, value in grouped.itertuples() which means it's trying to unpack all of the columns into a single variable, which won't work. The groupby probably comes into play as well but it should still contain all the values within the result which means you still have too many columns being unpacked.



来源:https://stackoverflow.com/questions/37819622/valueerror-too-many-values-to-unpack-when-using-itertuples-on-pandas-datafram

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!