Pandas concat yields ValueError: Plan shapes are not aligned

忘掉有多难 2020-12-03 02:23

In pandas, I am attempting to concatenate a set of dataframes and I am getting this error:

ValueError: Plan shapes are not aligned

My underst

6 answers
  •  春和景丽
    2020-12-03 03:16

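    I wrote a small function that merges columns sharing the same name by concatenating their values as strings; duplicated column names are a common cause of this error. The function assumes the column names are strings. Whether or not the original dataframe is sorted, the output has its columns sorted by name.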

    import re

    def concat_duplicate_columns(df):
        # count how often each column name occurs
        dupli = {}
        for column in df.columns:
            dupli[column] = dupli.get(column, 0) + 1
        # rename duplicated columns by position with a °°° number suffix,
        # so that every ambiguous column name becomes accessible
        seen = {}
        renamed = []
        for column in df.columns:
            if dupli[column] > 1:
                renamed.append(column + '°°°' + str(seen.get(column, 0)))
                seen[column] = seen.get(column, 0) + 1
            else:
                renamed.append(column)
        df = df.copy()  # leave the caller's dataframe untouched
        df.columns = renamed
        # for each duplicated column name
        for key, val in dupli.items():
            # for each further duplicate of that column name
            for k in range(1, val):
                # concatenate values in duplicated columns
                df[key+'°°°0'] = df[key+'°°°0'].astype(str) + df[key+'°°°'+str(k)].astype(str)
                # drop the duplicated column from which we have acquired data
                df = df.drop(key+'°°°'+str(k), axis=1)
        # strip the suffix again, then sort the column names
        df.columns = [re.sub('°°°0$', '', c) for c in df.columns]
        return df.reindex(sorted(df.columns), axis=1)
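For comparison, here is a shorter sketch of the same idea that avoids the temporary suffixes by selecting same-named columns with a boolean mask. It is not the function above: `merge_duplicate_columns` is a hypothetical name, and it assumes the column names are sortable and that duplicated columns should be joined as strings, left to right.

```python
import pandas as pd

def merge_duplicate_columns(df):
    """Hypothetical helper: join same-named columns as strings, sort columns."""
    merged = {}
    for name in sorted(set(df.columns)):
        # all columns carrying this name, in their original order
        block = df.loc[:, df.columns == name]
        if block.shape[1] == 1:
            # unique name: keep the column unchanged
            merged[name] = block.iloc[:, 0]
        else:
            # duplicated name: concatenate the values as strings
            combined = block.iloc[:, 0].astype(str)
            for j in range(1, block.shape[1]):
                combined = combined + block.iloc[:, j].astype(str)
            merged[name] = combined
    return pd.DataFrame(merged)

# illustrative data, not from the question
example = pd.DataFrame([[1, 'x', 2], [3, 'y', 4]], columns=['a', 'b', 'a'])
result = merge_duplicate_columns(example)
```

Once every dataframe has been passed through such a helper, each one has unique column names before it is handed to `pd.concat`.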
    
