问题
I am trying to convert a simple pandas dataframe into a nested JSON file based on the answer I found here: pandas groupby to nested json
My grouped dataframe looks like this:
firstname lastname orgname phone mobile email
teamname members
1 0 John Doe Anon 916-555-1234 none john.doe@wildlife.net
1 Jane Doe Anon 916-555-4321 916-555-7890 jane.doe@wildlife.net
2 0 Mickey Moose Moosers 916-555-0000 916-555-1111 mickey.moose@wildlife.net
1 Minny Moose Moosers 916-555-2222 none minny.moose@wildlife.net
My code is:
data = pandas.read_excel(inputExcel, sheetname = 'Sheet1', encoding = 'utf8')
grouped = data.groupby(['teamname', 'members']).first()
results = defaultdict(lambda: defaultdict(dict))
for index, value in grouped.itertuples():
for i, key in enumerate(index):
if i ==0:
nested = results[key]
elif i == len(index) -1:
nested[key] = value
else:
nested = nested[key]
print json.dumps(results, indent = 4)
I get the following error on the first "for" loop. What causes this error in this circumstance and what would it take to fix it to output the nested json?
for index, value in grouped.itertuples():
ValueError: too many values to unpack
回答1:
When using itertuples(), the index is included as part of the tuple, so the for index, value in grouped.itertuples():
doesn't really make sense. In fact, itertuples()
uses namedtuple with Index
being one of the names.
Consider the following setup:
data = {'A': list('aabbc'), 'B': [0, 1, 0, 1, 0], 'C': list('vwxyz'), 'D': range(5,10)}
df = pd.DataFrame(data).set_index(['A', 'B'])
Yielding the following DataFrame:
C D
A B
a 0 v 5
1 w 6
b 0 x 7
1 y 8
c 0 z 9
Then printing each tuple in df.itertuples()
yields:
Pandas(Index=('a', 0), C='v', D=5)
Pandas(Index=('a', 1), C='w', D=6)
Pandas(Index=('b', 0), C='x', D=7)
Pandas(Index=('b', 1), C='y', D=8)
Pandas(Index=('c', 0), C='z', D=9)
So, what you'll probably want to do is something like the code below, with value
being replaced by t[1:]
:
for t in grouped.itertuples():
for i, key in enumerate(t.Index):
...
If you want to access components of the namedtuple
, you can access things positionally, or by name. So, in the case of your DataFrame, t[1]
and t.firstname
should be equivalent. Just remember that t[0]
is the index, so your first column starts at 1
.
回答2:
As I understand itertuples, it will return a tuple with the first value being the index and the remaining values being all of the columns. You only have for index, value in grouped.itertuples()
which means it's trying to unpack all of the columns into a single variable, which won't work. The groupby
probably comes into play as well but it should still contain all the values within the result which means you still have too many columns being unpacked.
来源:https://stackoverflow.com/questions/37819622/valueerror-too-many-values-to-unpack-when-using-itertuples-on-pandas-datafram