Python pandas df.copy() is not deep

蹲街弑〆低调 submitted on 2021-02-11 15:47:30

Question


I have (in my opinion) a strange problem with python pandas. If I do:

cc1 = cc.copy(deep=True)

for the DataFrame cc and then look up a certain row and column:

print(cc1.loc['myindex']['data'] is cc.loc['myindex']['data'])

I get

True

What's wrong here?


Answer 1:


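Deep copying in pandas is not recursive: copy(deep=True) copies the DataFrame's data and index, but for object columns it copies only the references, so the Python objects stored in the cells are still shared with the original. The pandas devs consider putting mutable objects inside a DataFrame an antipattern.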
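A minimal sketch of this behaviour, assuming the 'data' column from the question holds mutable objects such as lists (the question does not say what it contains, so the frame below is a hypothetical stand-in). Mapping copy.deepcopy over the object column is one possible workaround, not an official pandas recipe:

```python
import copy

import pandas as pd

# Hypothetical stand-in for the question's DataFrame: an object column holding lists.
cc = pd.DataFrame({"data": [[1, 2], [3, 4]]}, index=["myindex", "other"])

cc1 = cc.copy(deep=True)

# The cell in the copy refers to the very same list object as the original.
shared_after_copy = cc1.loc["myindex", "data"] is cc.loc["myindex", "data"]

# Workaround sketch: copy the frame, then deep-copy every cell of the object column.
cc2 = cc.copy(deep=True)
cc2["data"] = cc2["data"].map(copy.deepcopy)
shared_after_deepcopy = cc2.loc["myindex", "data"] is cc.loc["myindex", "data"]

# Mutating the original list is visible through cc1 but not through cc2.
cc.loc["myindex", "data"].append(99)
print(shared_after_copy, shared_after_deepcopy)
print(cc1.loc["myindex", "data"], cc2.loc["myindex", "data"])
```

The per-cell deepcopy has to be repeated for every object column that holds mutable values, which is part of why storing such objects in a DataFrame is discouraged.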

There is nothing wrong with your code. In case you want to see how deep and shallow copy() differ, here is an example.

Deep copy

import pandas as pd

dict_1 = {'Column A': ['House', 'Animal', 'car'],
          'Column B': ["walls,doors,rooms", "Legs,nose,eyes", "tires,engine"]}

df1 = pd.DataFrame(dict_1, columns=['Column A', 'Column B'])

# Deep copy
df2 = df1.copy()  # deep=True by default
df2 == df1  # all True because neither DataFrame has been updated yet
# output:
#   Column A    Column B
# 0 True    True
# 1 True    True
# 2 True    True

id(df1)  # output: 2302063108040
id(df2)  # output: 2302063137224

Now if you update Column B of df1:

dict_new = {'Column A': ['House', 'Animal', 'car'],
            'Column B': ["walls", "Legs,nose,eyes,tail", "tires,engine,bonnet"]}

# only the Column B values differ from the original data
df1.update(dict_new)

df1 == df2  # False for the values that changed
# output:
#   Column A    Column B
# 0 True    False
# 1 True    False
# 2 True    False

And if we look at df2, the deep copy, it remains unchanged:

df2
# output:
#   Column A    Column B
# 0 House   walls,doors,rooms
# 1 Animal  Legs,nose,eyes
# 2 car tires,engine

Shallow copy

df2 = df1.copy(deep=False)  # deep=True is the default, so pass deep=False explicitly
df2 == df1  # all True because neither DataFrame has been updated yet
# output
#   Column A    Column B
# 0 True    True
# 1 True    True
# 2 True    True

dict_new =  {'Column A': ['House','Animal', 'car'],
     'Column B': ["walls", "Legs,nose,eyes,tail", "tires,engine,bonnet" ]}

df1.update(dict_new)

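# df2 shares df1's data, so everything is still True after updating Column B,
# unlike with the deep copy. This assumes Copy-on-Write is off; with it enabled
# (the default from pandas 3.0), the shallow copy no longer sees the update.
df2 == df1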
# output
#   Column A    Column B
# 0 True    True
# 1 True    True
# 2 True    True

df2  # df2 now shows the updated values of df1

# output:
#   Column A    Column B
# 0 House   walls
# 1 Animal  Legs,nose,eyes,tail
# 2 car tires,engine,bonnet
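Whether an update to df1 shows up in a shallow copy depends on the pandas version and on Copy-on-Write, so comparing values with == is not a reliable test. A rough sketch of a more direct check, assuming plain numeric columns and using NumPy's shares_memory to see whether two frames point at the same buffer (the small frame here is made up for illustration):

```python
import numpy as np
import pandas as pd

# Small numeric frame used only for this illustration.
df1 = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

deep = df1.copy(deep=True)
shallow = df1.copy(deep=False)

# A deep copy gets its own buffer; a shallow copy starts out on the original's buffer.
deep_shares = np.shares_memory(df1["a"].to_numpy(), deep["a"].to_numpy())
shallow_shares = np.shares_memory(df1["a"].to_numpy(), shallow["a"].to_numpy())
print(deep_shares, shallow_shares)
```

This only says whether the underlying buffer is shared at the moment of the check. Under Copy-on-Write the shallow copy is detached as soon as either frame is written to.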

Source: python Pandas DataFrame copy(deep=False) vs copy(deep=True) vs '=' https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.copy.html



Source: https://stackoverflow.com/questions/61578453/python-pandas-df-copy-ist-not-deep
