Python Pandas: pandas.to_datetime() is switching day & month when day is less than 13

别那么骄傲 2020-12-20 16:39

I wrote code that reads multiple files. However, in some of my files the datetime parsing swaps day & month whenever the day is less than 13, and any day that is from day 13 or abo
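The behaviour described in the question can be reproduced with two single strings. This is only a sketch: the sample values are hypothetical day-first strings (DD/MM/YYYY), since the question's real data is not shown. By default `pd.to_datetime` tries month-first, and it only falls back to day-first when the first number cannot be a month.

```python
import warnings

import pandas as pd

# Hypothetical day-first strings (DD/MM/YYYY); the question's real data is not shown.
ambiguous = "12/06/2015"    # could be 12 June or 6 December
unambiguous = "13/06/2015"  # 13 cannot be a month, so only 13 June is possible

with warnings.catch_warnings():
    # Some pandas versions warn when they fall back to a day-first reading;
    # the warning is silenced here to keep the demonstration quiet.
    warnings.simplefilter("ignore")
    first = pd.to_datetime(ambiguous)
    second = pd.to_datetime(unambiguous)

print(first.month, first.day)    # month-first reading: 6 December
print(second.month, second.day)  # only valid reading: 13 June
```

Days 1 to 12 are valid month numbers, which is why only those rows come out swapped.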

3 Answers
  • Well, I solved my problem, though in a memory-hungry way. I first split my tmp column into date and time columns, then split the date column into day, month and year. That way I could find the days that are less than 13 and swap them with the corresponding month.

    import pandas as pd

    # Parse the column, drop sub-second precision, and split it into date and time parts
    df['tmp'] = pd.to_datetime(df['tmp'], unit='ns')
    df['tmp'] = df['tmp'].apply(lambda x: x.replace(microsecond=0))
    df['date'] = [d.date() for d in df['tmp']]
    df['time'] = [d.time() for d in df['tmp']]
    df[['year', 'month', 'day']] = df['date'].apply(lambda x: pd.Series(x.strftime("%Y-%m-%d").split("-")))

    df['day'] = pd.to_numeric(df['day'], errors='coerce')
    df['month'] = pd.to_numeric(df['month'], errors='coerce')
    df['year'] = pd.to_numeric(df['year'], errors='coerce')

    # Swap day and month on every row whose day is less than 13
    mask = df['day'] < 13
    df.loc[mask, ['day', 'month']] = df.loc[mask, ['month', 'day']].to_numpy()

    # Rebuild the date from its parts, then convert date and time to strings in order to merge them
    df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
    df['date'] = df['date'].astype(str)
    df['time'] = df['time'].astype(str)

    # Merge time and date and place the result in our column
    df['tmp'] = pd.to_datetime(df['date'] + ' ' + df['time'])

    # Drop the added columns
    df.drop(columns=['date', 'year', 'month', 'day', 'time'], inplace=True)
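The same swap can probably be done without the helper columns. The sketch below is not the answerer's code: `swap_day_month` is a hypothetical helper, and the sample timestamps are made up to mimic a mis-parsed row. Like the approach above, it assumes every row with a day below 13 was parsed the wrong way round, which only holds if the source data was day-first throughout.

```python
import pandas as pd


def swap_day_month(parsed: pd.Series) -> pd.Series:
    """Swap day and month on every timestamp whose day is below 13.

    Hypothetical helper: it blindly assumes all such rows were mis-parsed.
    """
    fixed = [
        # Both fields are replaced in one call, so the time of day is kept.
        ts.replace(day=ts.month, month=ts.day) if ts.day < 13 else ts
        for ts in parsed
    ]
    return pd.Series(fixed, index=parsed.index, name=parsed.name, dtype="datetime64[ns]")


# Made-up example: 11 June was mis-read as 6 November; 13 June was read correctly.
wrong = pd.Series(
    pd.to_datetime(["2015-11-06 00:56:55", "2015-06-13 02:31:44"]),
    name="tmp",
)
fixed = swap_day_month(wrong)
print(fixed)
```

Parsing with the right day/month order in the first place avoids the need for any repair step.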
    
  • 2020-12-20 17:17

    You can use the dayfirst parameter in pd.to_datetime.

    pd.to_datetime(df.tmp, dayfirst=True)
    

    Output:

    0   2015-06-11 00:56:55
    1   2015-06-11 04:16:38
    2   2015-06-12 16:13:30
    3   2015-06-12 21:24:03
    4   2015-06-13 02:31:44
    5   2015-06-13 02:37:49
    Name: tmp, dtype: datetime64[ns]
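As a self-contained sketch of this answer, the snippet below parses a few strings with `dayfirst=True` and, as an alternative, with an explicit `format`. The raw strings are an assumption (DD/MM/YYYY HH:MM:SS), because the thread does not show the original input.

```python
import pandas as pd

# Assumed raw strings in DD/MM/YYYY HH:MM:SS form; the original input is not shown.
raw = pd.Series(
    ["11/06/2015 00:56:55", "12/06/2015 16:13:30", "13/06/2015 02:31:44"],
    name="tmp",
)

# Option 1: tell pandas that the day comes first in ambiguous strings.
by_flag = pd.to_datetime(raw, dayfirst=True)

# Option 2: spell out the layout, which leaves nothing to guess.
by_format = pd.to_datetime(raw, format="%d/%m/%Y %H:%M:%S")

print(by_flag)
```

When the layout is known and consistent, the explicit `format` is the stricter choice, since a string that does not match it raises an error instead of being reinterpreted.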
    
  • 2020-12-20 17:21

    I ran into the same issue. In my case the dates were the index column (called "Date"). The solution mentioned above, calling to_datetime() directly on the dataframe with "Date" as the index column, didn't work for me. I had to call read_csv() first without setting the index to "Date", then apply to_datetime() to that column, and only then set the index to "Date".

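    # Read without an index column so that "Date" stays a regular column
    df = pd.read_csv(file, parse_dates=True)
    # Parse day-first, then promote the column to the index
    df.Date = pd.to_datetime(df.Date, dayfirst=True)
    df = df.set_index('Date')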
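A runnable sketch of this sequence follows, using an in-memory CSV in place of `file`. The column names and values are made up for illustration. The second variant is not from the answer: it relies on the `dayfirst` argument of `read_csv`, which may let the index be parsed in one step.

```python
import io

import pandas as pd

# Made-up CSV content with day-first dates; stands in for the real file.
csv_text = "Date,value\n11/06/2015,1\n12/06/2015,2\n13/06/2015,3\n"

# The answer's sequence: read, parse day-first, then set the index.
df = pd.read_csv(io.StringIO(csv_text))
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)
df = df.set_index("Date")

# Possible one-step variant (an assumption, not part of the answer):
# let read_csv parse the index column with dayfirst=True.
alt = pd.read_csv(
    io.StringIO(csv_text),
    index_col="Date",
    parse_dates=True,
    dayfirst=True,
)

print(df.index)
```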
    