Deep copy of Pandas dataframes and dictionaries

春和景丽 2020-12-18 13:47

I'm creating a small Pandas dataframe:

df = pd.DataFrame(data={'colA': [["a", "b", "c"]]})

I take a deepcopy of that df. I'm not

1 Answer
  • 2020-12-18 14:26

    Disclaimer


    Note that putting mutable objects inside a DataFrame can be an antipattern, so make sure that you really need it and that you understand what you are doing.

    Why your copy isn't independent


    When copy.deepcopy is applied to an object, it looks up a __deepcopy__ method on that object and, if one exists, calls it. This hook exists so that a class can avoid copying more than it needs to. For a DataFrame instance in pandas 0.20.0 and above, __deepcopy__ does not work recursively: the Python objects stored inside the cells are not copied.
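
The dispatch can be illustrated without pandas. Below is a minimal sketch using a hypothetical Box class (not part of any library) whose __deepcopy__ deliberately copies only the outer container, so the inner list stays shared with the original:

```python
import copy


class Box:
    """Hypothetical container whose __deepcopy__ copies only one level."""

    def __init__(self, items):
        self.items = items

    def __deepcopy__(self, memo):
        # Build a new outer list but keep references to the inner objects,
        # i.e. a deliberately non-recursive "deep" copy.
        return Box(list(self.items))


original = Box([["a", "b", "c"]])
clone = copy.deepcopy(original)  # copy.deepcopy calls Box.__deepcopy__

outer_is_new = clone.items is not original.items       # True
inner_is_shared = clone.items[0] is original.items[0]  # True

# Mutating the inner list through the clone also changes the original.
clone.items[0].remove("b")
print(original.items)  # [['a', 'c']]
```

Because copy.deepcopy defers to the class's own hook, the result is only as deep as that hook chooses to make it.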

    Similarly, DataFrame.copy(deep=True) copies the data, but it does not do so recursively: the copy holds references to the same Python objects as the original.
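
One way to check this on your own installation is an identity test on the stored list. This is a small sketch that assumes a pandas version where the behaviour described above holds; the variable names are illustrative only:

```python
import copy

import pandas as pd

df = pd.DataFrame(data={"colA": [["a", "b", "c"]]})

via_deepcopy = copy.deepcopy(df)
via_copy = df.copy(deep=True)

# Is the list stored in each copy the very same object as the one in df?
shared_deepcopy = via_deepcopy.at[0, "colA"] is df.at[0, "colA"]
shared_copy = via_copy.at[0, "colA"] is df.at[0, "colA"]
print(shared_deepcopy, shared_copy)
```

If both values print as True, the two "deep" copies still share the inner list with df, which is why removing an element through a copy also shows up in the original.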

    How to solve the problem


    To take a truly deep copy of a DataFrame that contains lists (or other Python objects), so that the copy is independent of the original, you can use one of the methods below.

    import copy
    df_copy = pd.DataFrame(data=copy.deepcopy(df.values), index=df.index, columns=df.columns)
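
To confirm that a copy built this way is independent, the same idea can be wrapped in a small function and followed by a mutation. This is a sketch only: deep_copy_frame is a hypothetical helper name, not a pandas function.

```python
import copy

import pandas as pd


def deep_copy_frame(frame):
    """Hypothetical helper: rebuild a frame from a deep copy of its values."""
    return pd.DataFrame(
        data=copy.deepcopy(frame.values),  # object arrays are copied element by element
        index=frame.index.copy(),
        columns=frame.columns.copy(),
    )


df = pd.DataFrame(data={"colA": [["a", "b", "c"]]})
df_copy = deep_copy_frame(df)

# Mutate the list inside the copy; the original should be untouched.
df_copy.at[0, "colA"].remove("b")
print(df.at[0, "colA"])       # ['a', 'b', 'c']
print(df_copy.at[0, "colA"])  # ['a', 'c']
```

One caveat: for a frame with mixed column types, df.values returns a single object array, so the rebuilt frame may not keep the original column dtypes.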
    

    For a dictionary, you can use the same trick:

    mydict = pd.DataFrame(data=copy.deepcopy(df_copy.values), columns=df.columns).to_dict()
    mydict['colA'][0].remove("b")  # changes only mydict, not df_copy
    

    There's also a standard, if hacky, way of deep-copying Python objects, which is to round-trip them through pickle:

    import pickle
    df_copy = pickle.loads(pickle.dumps(df))  
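
The pickle round trip can be checked in the same way. The sketch below assumes every object stored in the frame is picklable, which holds for plain lists of strings but not for arbitrary objects:

```python
import pickle

import pandas as pd

df = pd.DataFrame(data={"colA": [["a", "b", "c"]]})
df_copy = pickle.loads(pickle.dumps(df))

# The round trip rebuilds every object, so the inner list is a new one.
inner_is_shared = df_copy.at[0, "colA"] is df.at[0, "colA"]

df_copy.at[0, "colA"].remove("b")
print(inner_is_shared)   # False
print(df.at[0, "colA"])  # ['a', 'b', 'c']
```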
    

    Hope I've answered your question. Feel free to ask for any clarifications, if needed.
