I have the following dataframe:
user_id  purchase_date
1        2015-01-23 14:05:21
2        2015-02-05 05:07:30
3        2015-02-18 17:08:51
Most proposed solutions don't work for the first day of the month.
The following solution works for any day of the month:
df['month'] = df['purchase_date'] + pd.offsets.MonthEnd(0) - pd.offsets.MonthBegin(normalize=True)
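To see why this handles the first day of the month, here is a minimal sketch on the sample data. The fourth row (a purchase exactly on 2015-03-01) is a made-up addition for illustration and is not part of the original dataframe.

```python
import pandas as pd

# Sample data from the question; the last row is a hypothetical
# first-of-month purchase added to exercise the edge case.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "purchase_date": pd.to_datetime([
        "2015-01-23 14:05:21",
        "2015-02-05 05:07:30",
        "2015-02-18 17:08:51",
        "2015-03-01 00:00:00",
    ]),
})

# MonthEnd(0) rolls forward to the last day of the month (or stays there),
# so the date is never on the first day unless the month has one day.
# Subtracting MonthBegin then lands on the first day of the same month,
# and normalize=True resets the time to midnight.
df["month"] = (
    df["purchase_date"]
    + pd.offsets.MonthEnd(0)
    - pd.offsets.MonthBegin(normalize=True)
)

print(df)
```

The detour through the month end is what keeps a date that is already on the first from being pushed back into the previous month.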
[EDIT]
Another, more readable, solution is:
from pandas.tseries.offsets import MonthBegin
df['month'] = df['purchase_date'].dt.normalize().map(MonthBegin().rollback)
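A small self-contained sketch of this variant, assuming a made-up first-of-month timestamp with a non-midnight time (the second value below is hypothetical, not from the question):

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin

dates = pd.Series(pd.to_datetime([
    "2015-01-23 14:05:21",
    "2015-03-01 09:30:00",  # hypothetical first-of-month purchase
]))

# Strip the time first, then roll back to the month start.
# rollback leaves a date unchanged when it is already on the first.
month = dates.dt.normalize().map(MonthBegin().rollback)

print(month)
```

Normalizing before the rollback means the offset only ever sees midnight timestamps, so a first-of-month date is recognized as already being on the offset.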
Be careful not to use:
df['month'] = df['purchase_date'].map(MonthBegin(normalize=True).rollback)
because it gives incorrect results for the first day of the month due to a pandas bug: https://github.com/pandas-dev/pandas/issues/32616
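If you want to check whether your installed pandas is affected, a quick comparison along these lines may help. The timestamp is a made-up example, and I am deliberately not stating the output of the `normalize=True` call, since it may depend on the pandas version.

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin

# Hypothetical first-of-month timestamp with a non-midnight time
ts = pd.Timestamp("2015-03-01 09:30:00")

# Recommended order: normalize first, then roll back
safe = MonthBegin().rollback(ts.normalize())

# Variant warned about above: normalize inside the offset
risky = MonthBegin(normalize=True).rollback(ts)

print("normalize first:", safe)
print("normalize=True: ", risky)
print("results agree:  ", safe == risky)
```

If the last line prints False, stick with normalizing before the rollback.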