Python Pandas check that string is only “Date” or only “Time” or “Datetime”

Submitted by 只愿长相守 on 2020-07-21 04:09:19

Question


I am reading a CSV file using pandas:

str,date,float,time,datetime
a,10/11/19,1.1,10:30:00,10/11/19 10:30
b,10/11/19,1.2,10:00:00,10/11/19 10:30
c,10/11/19,1.3,11:10:11,10/11/19 10:30

df = pd.read_csv(file)

My business requirement is to tell which columns are pure date fields, which are pure time fields, and which are complete datetimes. For a particular column, my code is:

try:
    dt = pd.to_datetime(df[col])
    dates = [obj.date() for obj in dt]
    times = [obj.time() for obj in dt]

    if dates and (set(times) == set([datetime.time(0, 0)])):
        # Its a pure date field
    elif <something>:
        # Its a pure time field
    else:
        # Its a Datetime field
except:
    # its not a datefield

The problem with my code is that when a column contains only times, pd.to_datetime fills in today's date by default, so I cannot tell it apart from a datetime column. Is there an easy solution? Please help me fill in <something> in the code above.


Answer 1:


When pandas parses time-only strings it uses today's date by default, so one possible solution for detecting times is to compare the parsed dates from Series.dt.date with today's date from Timestamp.date, and use Series.all to check that every value in the column matches.

I also added another test for pure dates: check whether the values stay the same after the times are removed with Series.dt.floor:

import pandas as pd

df = pd.DataFrame({'a':['2019-01-01 12:23:10',
                        '2019-01-02 12:23:10'],
                   'b':['2019-01-01',
                        '2019-01-02'],
                   'c':['12:23:10',
                        '15:23:10'],
                   'd':['a','b']})
print (df)
                     a           b         c  d
0  2019-01-01 12:23:10  2019-01-01  12:23:10  a
1  2019-01-02 12:23:10  2019-01-02  15:23:10  b

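def check(col):
    try:
        dt = pd.to_datetime(df[col])

        # no time component: flooring to the day changes nothing
        if (dt.dt.floor('D') == dt).all():
            return ('Its a pure date field')
        # every value falls on today's date: only times were given
        elif (dt.dt.date == pd.Timestamp('now').date()).all():
            return ('Its a pure time field')
        else:
            return ('Its a Datetime field')
    # parsing failed, so the column holds no dates or times
    except Exception:
        return ('its not a datefield')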


print (check('a'))
print (check('b'))
print (check('c'))
print (check('d'))
Its a Datetime field
Its a pure date field
Its a pure time field
its not a datefield
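
As a rough illustration of why the two tests in check separate the three cases, here is a minimal sketch that applies them to values built directly as Timestamps, so no string parsing is involved. The sample values and the helper names all_midnight and all_on_today are assumptions made for this example, not part of the answer above.

```python
import pandas as pd

# "Today" is captured once so every comparison uses the same reference date.
today = pd.Timestamp("now").normalize()

# Hypothetical sample values, built as Timestamps rather than parsed from strings.
pure_dates = pd.Series([pd.Timestamp("2019-01-01"), pd.Timestamp("2019-01-02")])
datetimes = pd.Series([pd.Timestamp("2019-01-01 12:23:10"),
                       pd.Timestamp("2019-01-02 12:23:10")])
# Stand-in for time-only strings that a parser has placed on today's date.
times_today = pd.Series([today + pd.Timedelta(hours=12, minutes=23, seconds=10),
                         today + pd.Timedelta(hours=15, minutes=23, seconds=10)])


def all_midnight(s):
    """True when flooring to the day changes nothing (no time component)."""
    return bool((s.dt.floor("D") == s).all())


def all_on_today(s):
    """True when every value falls on the reference date captured above."""
    return bool((s.dt.date == today.date()).all())


for name, s in [("pure_dates", pure_dates),
                ("datetimes", datetimes),
                ("times_today", times_today)]:
    print(name, all_midnight(s), all_on_today(s))
```

Only pure_dates passes the first test and only times_today passes the second, which is the ordering check relies on. A genuine datetime column whose values all happen to fall on today would also pass the second test, so this test alone cannot distinguish that case.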

Another idea is to test for numeric columns first and return 'its not a datefield' for them, which prevents numeric values from being cast to datetimes. Also, if a datetime column may contain only today's dates (as in column f), comparing against today's date no longer identifies times, so the test for times is changed to Series.str.contains, matching the pattern HH:MM:SS or H:MM:SS:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':['2019-01-01 12:23:10',
                        '2019-01-02'],
                   'b':['2019-01-01',
                        '2019-01-02'],
                   'c':['12:23:10',
                        '15:23:10'],
                   'd':['a','b'],
                   'e':[1,2],
                   'f':['2019-11-13 12:23:10',
                        '2019-11-13']})
print (df)
                     a           b         c  d  e                    f
0  2019-01-01 12:23:10  2019-01-01  12:23:10  a  1  2019-11-13 12:23:10
1           2019-01-02  2019-01-02  15:23:10  b  2           2019-11-13

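def check(col):
    # numeric columns would be cast to datetimes, so reject them first
    if np.issubdtype(df[col].dtype, np.number):
        return ('its not a datefield')

    try:
        dt = pd.to_datetime(df[col])
        # no time component: flooring to the day changes nothing
        if (dt.dt.floor('D') == dt).all():
            return ('Its a pure date field')
        # every original string looks like HH:MM:SS or H:MM:SS
        elif df[col].str.contains(r"^\d{1,2}:\d{2}:\d{2}$").all():
            return ('Its a pure time field')
        else:
            return ('Its a Datetime field')
    # parsing failed, so the column holds no dates or times
    except Exception:
        return ('its not a datefield')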


print (check('a'))
print (check('b'))
print (check('c'))
print (check('d'))
print (check('e'))
print (check('f'))
Its a Datetime field
Its a pure date field
Its a pure time field
its not a datefield
its not a datefield
Its a Datetime field
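
The pattern test can also be tried on the sample CSV from the question. The following is only a sketch under some assumptions: the helper name looks_like_time is made up for this example, the CSV rows are inlined through io.StringIO instead of being read from a file, and pd.api.types.is_numeric_dtype is swapped in for the np.issubdtype check used above.

```python
import io

import pandas as pd

# The sample rows from the question, inlined so the sketch is self-contained.
CSV_TEXT = """str,date,float,time,datetime
a,10/11/19,1.1,10:30:00,10/11/19 10:30
b,10/11/19,1.2,10:00:00,10/11/19 10:30
c,10/11/19,1.3,11:10:11,10/11/19 10:30
"""

# Same pattern as in the answer: HH:MM:SS or H:MM:SS and nothing else.
TIME_PATTERN = r"^\d{1,2}:\d{2}:\d{2}$"


def looks_like_time(series):
    """Hypothetical helper: True when every value matches TIME_PATTERN."""
    # Numeric columns are rejected up front, as in the answer.
    if pd.api.types.is_numeric_dtype(series):
        return False
    # The raw strings are tested, so the result does not depend on which
    # default date a parser would attach to a time-only value.
    return bool(series.astype(str).str.contains(TIME_PATTERN).all())


df = pd.read_csv(io.StringIO(CSV_TEXT))
time_columns = [col for col in df.columns if looks_like_time(df[col])]
print(time_columns)
```

For this sample only the time column is reported. Because the check works on the original strings, it gives the same answer regardless of the day on which it runs.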


Source: https://stackoverflow.com/questions/58831943/python-pandas-check-that-string-is-only-date-or-only-time-or-datetime
