问题
I've data stored in pandas dataframe and I want to convert tat into a JSON format. Example data can be replicated using following code
data = {'Product':['A', 'B', 'A'],
'Zone':['E/A', 'A/N', 'E/A'],
'start':['08:00:00', '09:00:00', '12:00:00'],
'end':['12:30:00', '17:00:00', '17:40:00'],
'seq':['0, 1, 2 ,3 ,4','0, 1, 2 ,3 ,4', '0, 1, 2 ,3 ,4'],
'store':['Z',"'AS', 'S'", 'Z']
}
df = pd.DataFrame(data)
I've tried converting it into JSON format using following code
df_parsed = json.loads(df.to_json(orient="records"))
Output generated from above
[{'Product': 'A', 'Zone': 'E/A', 'start': '08:00:00', 'end': '17:40:00', 'seq': '0, 1, 2 ,3 ,4', 'store': 'Z'}, {'Product': 'B', 'Zone': 'A/N', 'start': '09:00:00', 'end': '17:00:00', 'seq': '0, 1, 2 ,3 ,4', 'store': 'AS'}, {'Product': 'A', 'Zone': 'E/A', 'start': '08:00:00', 'end': '17:40:00', 'seq': '0, 1, 2 ,3 ,4', 'store': 'Z'}]
Desired Result:
{
'A': {'Zone': 'E/A',
'tp': [{'start': [8, 0], 'end': [12, 0], 'seq': [0, 1, 2 ,3 ,4]},
{'start': [12, 30], 'end': [17, 40], 'seq': [0, 1, 2 ,3 ,4]}],
'store': ['Z']
},
'B': {'Zone': 'A/N',
'tp': [{'start': [9, 0], 'end': [17, 0], 'seq': [0, 1, 2 ,3 ,4]}],
'store': ['AS', 'S']
}
}
If a product belongs to same store the result for column start
, end
and seq
should be clubbed as shown in desired output. Also start time and end time should be represented like [9,0]
if value for time is "09:00:00"
only hour and minute needs to be represented so we can discard value of seconds from time
columns.
回答1:
This will be complicated a bit. So you have to do it step by step:
def funct(row):
row['start'] = row['start'].str.split(':').str[0:2]
row['end'] = row['end'].str.split(':').str[0:2]
row['store'] = row['store'].str.replace("'", "").str.split(', ')
d = (row.groupby('Zone')[row.columns[1:]]
.apply(lambda x: x.to_dict(orient='record'))
.reset_index(name='tp').to_dict(orient='row'))
return d
di = df.groupby(['Product'])[df.columns[1:]].apply(funct).to_dict()
di:
{'A': [{'Zone': 'E/A',
'tp': [{'start': ['08', '00'],
'end': ['12', '30'],
'seq': '0, 1, 2 ,3 ,4',
'store': ['Z']},
{'start': ['12', '00'],
'end': ['17', '40'],
'seq': '0, 1, 2 ,3 ,4',
'store': ['Z']}]}],
'B': [{'Zone': 'A/N',
'tp': [{'start': ['09', '00'],
'end': ['17', '00'],
'seq': '0, 1, 2 ,3 ,4',
'store': ['AS', 'S']}]}]}
Explanation:
- 1st create your own custom function.
- change the
start
,end
column to a list form. - group by
Zone
and apply to_dict to rest of the columns. - reset index and name the column that are having
[{'start': ['08', '00'], 'end': ['12', '30'], 'seq': '0, 1, 2 ,3 ,4',
astp
. - now apply to_dict to the whole result and return it.
Ultimately you need to convert your dataframe into this below format once you are able to do it the rest of the thing will become easy for you.
Zone tp
E/A [{'start': ['08', '00'], 'end': ['12', '30'], ...
A/N [{'start': ['09', '00'], 'end': ['17', '00'], ...
EDIT:
import pandas as pd
import ast
def funct(row):
y = row['start'].str.split(':').str[0:-1]
row['start'] = row['start'].str.split(':').str[0:2].apply(lambda x: list(map(int, x)))
row['end'] = row['end'].str.split(':').str[0:2].apply(lambda x: list(map(int, x)))
row['seq'] = row['seq'].apply(lambda x: list(map(int, ast.literal_eval(x))))
row['store'] = row['store'].str.replace("'", "")
d = (row.groupby('Zone')[row.columns[1:-1]]
.apply(lambda x: x.to_dict(orient='record'))
.reset_index(name='tp'))
######### For store create a different dataframe and then merge it to the other df ########
d1 = (row.groupby('Zone').agg({'store': pd.Series.unique}))
d1['store'] = d1['store'].str.split(",")
d_merged = (pd.merge(d,d1, on='Zone', how='left')).to_dict(orient='record')[0]
return d_merged
di = df.groupby(['Product'])[df.columns[1:]].apply(funct).to_dict()
di:
{'A': {'Zone': 'E/A',
'tp': [{'start': [8, 0], 'end': [12, 30], 'seq': [0, 1, 2, 3, 4]},
{'start': [12, 0], 'end': [17, 40], 'seq': [0, 1, 2, 3, 4]}],
'store': ['Z']},
'B': {'Zone': 'A/N',
'tp': [{'start': [9, 0], 'end': [17, 0], 'seq': [0, 1, 2, 3, 4]}],
'store': ['AS', ' S']}}
来源:https://stackoverflow.com/questions/65357356/converting-pandas-dataframe-to-json