Comparing two pandas dataframes for differences

感情败类 2020-11-30 02:53

I've got a script updating 5-10 columns' worth of data, but sometimes the start CSV will be identical to the end CSV, so instead of writing an identical CSV file I want it to

8 answers
  •  遥遥无期
    2020-11-30 03:24

    A more accurate comparison should check the index names separately, because DataFrame.equals does not compare them. It does check the other properties correctly: index values (single index or MultiIndex), data values, column labels, and dtypes.

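    import pandas as pd

    # Same data in both frames; only the name of the index differs.
    df1 = pd.DataFrame([[1, 'a'], [2, 'b'], [3, 'c']], columns=['num', 'name'])
    df1 = df1.set_index('name')
    df2 = pd.DataFrame([[1, 'a'], [2, 'b'], [3, 'c']], columns=['num', 'another_name'])
    df2 = df2.set_index('another_name')

    print(df1.equals(df2))                     # True: index names are ignored
    print(df1.index.names == df2.index.names)  # False: the names differ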

    Note: using index.names instead of index.name makes it work for multi-indexed dataframes as well.
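
The two checks above could be folded into one helper. This is only a sketch: the name `frames_identical` is made up for illustration, and the extra comparison of `columns.names` is my own assumption rather than part of the answer above (it matters mainly when the columns are a named Index or MultiIndex).

```python
import pandas as pd


def frames_identical(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Hypothetical helper: True only if data, dtypes, labels and axis names match."""
    return (
        left.equals(right)                          # values, dtypes, index and column labels
        and left.index.names == right.index.names   # index names, which equals() ignores
        and left.columns.names == right.columns.names  # assumption: also compare column-axis names
    )


df1 = pd.DataFrame([[1, 'a'], [2, 'b'], [3, 'c']], columns=['num', 'name']).set_index('name')
df2 = pd.DataFrame([[1, 'a'], [2, 'b'], [3, 'c']], columns=['num', 'another_name']).set_index('another_name')

same_as_copy = frames_identical(df1, df1.copy())  # identical frame, identical names
same_as_df2 = frames_identical(df1, df2)          # same data, different index name
```

In the question's scenario, a check like this could decide whether the output CSV needs to be written at all. Whether index names matter there depends on how the CSV is read back in.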
