Inverse of Pandas json_normalize

Posted by 牧云@^-^@ on 2019-12-18 09:20:22

Question


I just discovered the json_normalize function, which works great for taking a JSON object and giving me a pandas DataFrame. Now I want the reverse operation: take that same DataFrame and get back a JSON document (or a JSON-like dictionary that I can easily dump to JSON) with the same structure as the original.

Here's an example: https://hackersandslackers.com/json-into-pandas-dataframes/.

They take a JSON object (or JSON-like Python dictionary) and turn it into a DataFrame, but I now want to take that DataFrame and turn it back into a JSON-like dictionary (to later dump to a JSON file).


Answer 1:


I implemented it with a couple of functions:

def set_for_keys(my_dict, key_arr, val):
    """
    Set val at the path in my_dict given by key_arr, a list of strings
    (or other hashable keys), creating intermediate dicts as needed.
    """
    current = my_dict
    for i, key in enumerate(key_arr):
        if key not in current:
            # The last key receives the value; inner keys get a new nested dict.
            current[key] = val if i == len(key_arr) - 1 else {}
        elif not isinstance(current[key], dict):
            # A non-dict value already sits on this path, so we cannot descend.
            raise ValueError(
                "Dictionary key already occupied: "
                "given dictionary is not compatible with key structure requested"
            )

        current = current[key]

    return my_dict

def to_formatted_json(df, sep="."):
    result = []
    for _, row in df.iterrows():
        parsed_row = {}
        for idx, val in row.items():  # Series.iteritems() was removed in pandas 2.0
            keys = idx.split(sep)
            parsed_row = set_for_keys(parsed_row, keys, val)

        result.append(parsed_row)
    return result


# Here df was parsed from a JSON-like dict using json_normalize
to_formatted_json(df, sep=".")
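To see the round trip end to end, here is a minimal sketch. The sample records and the helper name unflatten_row are made up for illustration and are not part of the answer above; the helper is a condensed variant of the same idea (split each column name on the separator and walk down nested dicts), without the collision check.

```python
import pandas as pd


def unflatten_row(flat, sep="."):
    """Hypothetical helper: rebuild one nested dict from {"a.b": value} pairs."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(sep)
        node = nested
        for part in parents:
            # Create the intermediate dict on first visit, then descend into it.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Illustrative data only.
records = [
    {"id": 1, "name": {"first": "Ada", "last": "Lovelace"}},
    {"id": 2, "name": {"first": "Alan", "last": "Turing"}},
]

# Flatten: nested keys become dotted column names such as "name.first".
df = pd.json_normalize(records)

# Unflatten: one nested dict per row.
restored = [unflatten_row(row) for row in df.to_dict(orient="records")]
print(restored == records)
```

This assumes every record has the same keys and that no key contains the separator itself; otherwise the restored structure can differ from the input.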



Answer 2:


df.to_json(path)

or

df.to_dict()
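For reference, a small sketch of what these calls return on a frame with dotted column names. The frame contents and the orient="records" argument are my own choices for illustration; the answer above calls both methods with their defaults. Both methods serialize the columns as they are, so the dotted names stay flat rather than becoming nested objects.

```python
import json

import pandas as pd

# Illustrative frame shaped like json_normalize output.
df = pd.DataFrame({"id": [1], "name.first": ["Ada"]})

# One flat dict per row; keys are the column names unchanged.
flat_records = df.to_dict(orient="records")

# With no path argument, to_json returns the JSON text instead of writing a file.
json_text = df.to_json(orient="records")

print(flat_records)
print(json.loads(json_text))
```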



Answer 3:


Let me throw in my two cents.

After converting back, you may need to drop empty columns from the generated JSON, so I skip values that are NaN. You can't test for that directly with val != np.nan, because NaN is not equal to anything, including itself, so that comparison is always True. Instead, check val == val, which is False only for NaN. My version:

def to_formatted_json(df, sep="."):
    result = []
    for _, row in df.iterrows():
        parsed_row = {}
        for idx, val in row.items():  # Series.iteritems() was removed in pandas 2.0
            if val == val:  # False only for NaN, so missing values are skipped
                keys = idx.split(sep)
                parsed_row = set_for_keys(parsed_row, keys, val)

        result.append(parsed_row)
    return result
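The self-comparison trick can be checked in isolation with plain Python. This is only a sketch; the flat_row dict is an invented example of one row from a frame where one record lacked a key.

```python
import math

nan = float("nan")

# NaN compares unequal to everything, including itself.
print(nan == nan)
print(nan != nan)

# Illustrative flat row: "name.middle" was missing in the source record.
flat_row = {"id": 2, "name.first": "Alan", "name.middle": nan}

# Keep only the entries whose value equals itself, i.e. drop NaN entries.
kept = {key: val for key, val in flat_row.items() if val == val}
print(kept)
```

For numeric values, math.isnan(val) expresses the same test more explicitly, but it raises TypeError on strings, which is one reason the val == val form is convenient for mixed-type rows.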


Source: https://stackoverflow.com/questions/54776916/inverse-of-pandas-json-normalize
