Question
I have two dataframes containing similar types of information. I'm attempting to merge them together and reorganize them. Here is a sample of the dataframes:
df1 =
Member Nbr Name-First Name-Last Date-Join
20 Zoe Soumas 2011-08-01
3128 Julien Bougie 2011-07-22
3535 Michel Bibeau 2015-02-18
4116 Christopher Duthie 2014-12-02
4700 Manoj Chauhan 2014-11-11
4802 Anna Balian 2014-07-26
5004 Abdullah Cekic 2012-03-12
5130 Raymonde Girard 2011-01-04
df2 =
Member Nbr Name-First Name-Last Date-Join
3762 Robert Ortopan 2010-01-31
3762 Robert Ortopan 2010-02-28
3892 Christian Burnet 2010-03-24
3892 Christian Burnet 2010-04-24
5022 Robert Ngabirano 2010-06-25
5022 Robert Ngabirano 2010-07-28
What I would like to have is a dataframe that is sorted by Member Nbr, where if a member appears more than once, those rows are then ordered by join date. So I would have:
df12 =
Member Nbr Name-First Name-Last Date-Join
20 Zoe Soumas 2011-08-01
3128 Julien Bougie 2011-07-22
3535 Michel Bibeau 2015-02-18
3762 Robert Ortopan 2010-01-31
3762 Robert Ortopan 2010-02-28
3892 Christian Burnet 2010-03-24
3892 Christian Burnet 2010-04-24
4116 Christopher Duthie 2014-12-02
4700 Manoj Chauhan 2014-11-11
4802 Anna Balian 2014-07-26
5004 Abdullah Cekic 2012-03-12
5022 Robert Ngabirano 2010-06-25
5022 Robert Ngabirano 2010-07-28
5130 Raymonde Girard 2011-01-04
I've managed to concatenate both dataframes using df12 = pd.concat([df1, df2], ignore_index=True), which places df2 at the bottom of df1. After using
df12.sort_values(by='Member Nbr', axis=0, inplace=True)
the members are arranged in ascending order, but those that appear more than once (with different join dates) are arranged in descending order by date. That is:
Member Nbr Name-First Name-Last Date-Join
20 Zoe Soumas 2011-08-01
3128 Julien Bougie 2011-07-22
3535 Michel Bibeau 2015-02-18
3762 Robert Ortopan 2010-02-28 # Wrongly sorted
3762 Robert Ortopan 2010-01-31
3892 Christian Burnet 2010-04-24 # Wrongly sorted
3892 Christian Burnet 2010-03-24
4116 Christopher Duthie 2014-12-02
4700 Manoj Chauhan 2014-11-11
4802 Anna Balian 2014-07-26
5004 Abdullah Cekic 2012-03-12
5022 Robert Ngabirano 2010-07-28 # Wrongly sorted
5022 Robert Ngabirano 2010-06-25
5130 Raymonde Girard 2011-01-04
Is there a way to have those members with more than one join date also be arranged in ascending order by date?
Answer 1:
The by parameter can be a list of columns, so that the dataframe is sorted by the first column, with ties broken by the second column, then by the third column, and so on.
df12.sort_values(by=['Member Nbr', 'Date-Join'], inplace=True)
produces
Member Nbr Name-First Name-Last Date-Join
0 20 Zoe Soumas 2011-08-01
1 3128 Julien Bougie 2011-07-22
2 3535 Michel Bibeau 2015-02-18
4 3762 Robert Ortopan 2010-01-31
3 3762 Robert Ortopan 2010-02-28
6 3892 Christian Burnet 2010-03-24
5 3892 Christian Burnet 2010-04-24
7 4116 Christopher Duthie 2014-12-02
8 4700 Manoj Chauhan 2014-11-11
9 4802 Anna Balian 2014-07-26
10 5004 Abdullah Cekic 2012-03-12
12 5022 Robert Ngabirano 2010-06-25
11 5022 Robert Ngabirano 2010-07-28
13 5130 Raymonde Girard 2011-01-04
Note that for this to work correctly, the Date-Join column should be of type datetime.
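As a minimal sketch of the whole flow, the snippet below rebuilds a few rows from the question, converts Date-Join with pd.to_datetime, and then sorts on both columns. It assumes the dates start out as plain strings, and the duplicate members are entered newest-first on purpose so the tie-breaking is visible. The final reset_index(drop=True) is an optional extra, not part of the original answer.

```python
import pandas as pd

# A few rows from the question. Assumption: Date-Join arrives as plain strings.
df1 = pd.DataFrame({
    "Member Nbr": [20, 3128, 5130],
    "Name-First": ["Zoe", "Julien", "Raymonde"],
    "Name-Last": ["Soumas", "Bougie", "Girard"],
    "Date-Join": ["2011-08-01", "2011-07-22", "2011-01-04"],
})
# Repeated members, deliberately listed with the later date first.
df2 = pd.DataFrame({
    "Member Nbr": [3762, 3762, 5022, 5022],
    "Name-First": ["Robert", "Robert", "Robert", "Robert"],
    "Name-Last": ["Ortopan", "Ortopan", "Ngabirano", "Ngabirano"],
    "Date-Join": ["2010-02-28", "2010-01-31", "2010-07-28", "2010-06-25"],
})

# Stack df2 under df1 with a fresh 0..n-1 index.
df12 = pd.concat([df1, df2], ignore_index=True)

# Convert the string column to datetime64 so dates compare chronologically.
df12["Date-Join"] = pd.to_datetime(df12["Date-Join"])

# Sort by member number first, then by join date within each member.
df12 = df12.sort_values(by=["Member Nbr", "Date-Join"]).reset_index(drop=True)

print(df12)
```

Reassigning the result instead of passing inplace=True is only a style choice here; either form gives the same ordering.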
Source: https://stackoverflow.com/questions/38299831/pandas-using-sort-values-to-sort-2-dataframes-then-sub-sort-by-date