How to exclude date in Pandas Dataframe if not “end of month”

☆樱花仙子☆ submitted on 2021-01-27 12:19:32

Question


I have the following dataset:

import datetime
import pandas as pd

df = pd.DataFrame({'PORTFOLIO': ['A', 'A', 'A', 'A','A', 'A', 'A', 'A','A', 'A','A', 'A', 'A', 'A'],
               'DATE': ['28-02-2018','31-03-2018','30-04-2018','31-05-2018','30-06-2018','31-07-2018','31-08-2018',
                        '30-09-2018','31-10-2018','30-11-2018','31-12-2018','31-01-2019','28-02-2019','05-03-2019'],
               'IRR': [.7, .8, .9, .4, .2, .3, .4, .9, .7, .8, .9, .4,.7, .8],
               })
df

   PORTFOLIO       DATE  IRR
0          A 2018-02-28  0.7
1          A 2018-03-31  0.8
2          A 2018-04-30  0.9
3          A 2018-05-31  0.4
4          A 2018-06-30  0.2
5          A 2018-07-31  0.3
6          A 2018-08-31  0.4
7          A 2018-09-30  0.9
8          A 2018-10-31  0.7
9          A 2018-11-30  0.8
10         A 2018-12-31  0.9
11         A 2019-01-31  0.4
12         A 2019-02-28  0.7
13         A 2019-05-03  0.8

As you can see, all the dates are "end of month" except for 05-03-2019. What I need is to drop a DATE value if it's not "end of month".

My poor temporary solution is

df2 = df[df.DATE < '2019-03-01']

which is not good as the code should be more general.

How do I do that?


Answer 1:


This can be done in a one-liner: use pandas.Series.dt.is_month_end

df[pd.to_datetime(df["DATE"]).dt.is_month_end]

will give you your result.
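
As a self-contained sketch of the one-liner above, the example below uses a shortened copy of the question's data. One assumption goes beyond the original answer: the strings are parsed with an explicit format="%d-%m-%Y", so that a value like 05-03-2019 is read as 5 March rather than 3 May.

```python
import pandas as pd

# Shortened copy of the question's data (day-month-year strings).
df = pd.DataFrame({
    "PORTFOLIO": ["A", "A", "A", "A"],
    "DATE": ["31-12-2018", "31-01-2019", "28-02-2019", "05-03-2019"],
    "IRR": [0.9, 0.4, 0.7, 0.8],
})

# Assumption: the dates are day-first, so the format is stated explicitly.
dates = pd.to_datetime(df["DATE"], format="%d-%m-%Y")

# Boolean mask: True where the date is the last calendar day of its month.
month_end_rows = df[dates.dt.is_month_end]
print(month_end_rows)
```

The original df is left unchanged here; the parsed dates only serve as a mask, so the DATE column keeps its string form in the result.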




Answer 2:


You can use pandas.tseries.offsets.MonthEnd to compare each date with the end of its month, and then use boolean indexing on the dataframe to keep only the rows that satisfy the condition:

from pandas.tseries.offsets import MonthEnd
df.DATE = pd.to_datetime(df.DATE)

df[df.DATE == df.DATE + MonthEnd(0)]

   PORTFOLIO       DATE  IRR
0          A 2018-02-28  0.7
1          A 2018-03-31  0.8
2          A 2018-04-30  0.9
3          A 2018-05-31  0.4
4          A 2018-06-30  0.2
5          A 2018-07-31  0.3
6          A 2018-08-31  0.4
7          A 2018-09-30  0.9
8          A 2018-10-31  0.7
9          A 2018-11-30  0.8
10         A 2018-12-31  0.9
11         A 2019-01-31  0.4
12         A 2019-02-28  0.7
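
The comparison above works because adding MonthEnd(0) leaves a date that is already a month end unchanged and rolls any other date forward to the end of its month. A minimal sketch of that behaviour on two single timestamps (the variable names are illustrative only):

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

on_month_end = pd.Timestamp("2019-02-28")  # already the last day of February 2019
mid_month = pd.Timestamp("2019-03-05")     # not a month end

# MonthEnd(0) keeps a month-end date as it is...
rolled_end = on_month_end + MonthEnd(0)
# ...and rolls any other date forward to the end of the same month.
rolled_mid = mid_month + MonthEnd(0)

print(rolled_end == on_month_end)  # equal, so the row would be kept
print(rolled_mid == mid_month)     # not equal, so the row would be dropped
```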



Answer 3:


I am adding this to expand on @Christian Sloper's answer. I find an answer easier to reference when it is self-contained, and I think this will help others.

I created a new column called MonthEnd and used a filter to get only those that are not month end.

import datetime
import pandas as pd

df = pd.DataFrame({'PORTFOLIO': ['A', 'A', 'A', 'A','A', 'A', 'A', 'A','A', 'A','A', 'A', 'A', 'A'],
               'DATE': ['28-02-2018','31-03-2018','30-04-2018','31-05-2018','30-06-2018','31-07-2018','31-08-2018',
                        '30-09-2018','31-10-2018','30-11-2018','31-12-2018','31-01-2019','28-02-2019','05-03-2019'],
               'IRR': [.7, .8, .9, .4, .2, .3, .4, .9, .7, .8, .9, .4,.7, .8],
               })
#new column called MonthEnd 
df['MonthEnd'] =  pd.to_datetime(df['DATE']).dt.is_month_end
#filter to get only those that are not month end
df[~df["MonthEnd"]]

dataframe:

DATE    IRR PORTFOLIO   MonthEnd
0   28-02-2018  0.7 A   True
1   31-03-2018  0.8 A   True
2   30-04-2018  0.9 A   True
3   31-05-2018  0.4 A   True
4   30-06-2018  0.2 A   True
5   31-07-2018  0.3 A   True
6   31-08-2018  0.4 A   True
7   30-09-2018  0.9 A   True
8   31-10-2018  0.7 A   True
9   30-11-2018  0.8 A   True
10  31-12-2018  0.9 A   True
11  31-01-2019  0.4 A   True
12  28-02-2019  0.7 A   True
13  05-03-2019  0.8 A   False

After Filter:

DATE    IRR PORTFOLIO   MonthEnd
13  05-03-2019  0.8 A   False
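
Note that the filter above shows the row that is not a month end, while the question asks to keep the month-end rows. A small sketch of both selections from one mask, without adding a helper column (assuming day-first date strings, hence the explicit format; the names kept and dropped are illustrative):

```python
import pandas as pd

# Shortened copy of the question's data.
df = pd.DataFrame({
    "PORTFOLIO": ["A", "A", "A"],
    "DATE": ["31-01-2019", "28-02-2019", "05-03-2019"],
    "IRR": [0.4, 0.7, 0.8],
})

# Assumption: dates are day-month-year strings.
is_month_end = pd.to_datetime(df["DATE"], format="%d-%m-%Y").dt.is_month_end

kept = df[is_month_end]      # month-end rows, which the question wants to keep
dropped = df[~is_month_end]  # the remaining rows, as in the filter above

print(kept)
print(dropped)
```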


Source: https://stackoverflow.com/questions/55040850/how-to-exclude-date-in-pandas-dataframe-if-not-end-of-month
