Finding start time and end time in a column

天大地大妈咪最大 submitted on 2021-02-10 07:06:12

Question


I have a data set that has employees clocking in and out. It looks like this (note two entries per employee):

Employee    Date   Time
Emp1       1/1/16  06:00
Emp1       1/1/16  13:00
Emp2       1/1/16  09:00
Emp2       1/1/16  17:00
Emp3       1/1/16  11:00
Emp3       1/1/16  18:00

I want to get the data to look like this:

Employee   Date   Start   End
Emp1       1/1/16 06:00   13:00
Emp2       1/1/16 09:00   17:00
Emp3       1/1/16 11:00   18:00

I would like to get it into a data frame format so that I can do some calculations.

So far I have tried:

df['start'] = np.where((df['employee']==df['employee']&df['date']==df['date']),df['time'].min())

I also tried:

df.groupby(['employee','date'])['time'].max()

How do I get two columns out of one?


Answer 1:


I would recommend merging Date and Time into a single DateTime column. That greatly simplifies the work. You can do something like this:

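# Combine the Date and Time strings and parse them into one timestamp column
df['DateTime'] = pd.to_datetime(df['Date'] + " " + df['Time'])
# Earliest and latest timestamp per employee
df.groupby('Employee')['DateTime'].agg(['min', 'max'])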
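
As a minimal, self-contained sketch of the combined-column approach, the snippet below rebuilds the sample data from the question and then derives a shift length from the result. The explicit `format` string assumes month/day/two-digit-year dates and 24-hour times, and the `shifts` and `hours` names are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Employee": ["Emp1", "Emp1", "Emp2", "Emp2", "Emp3", "Emp3"],
    "Date": ["1/1/16"] * 6,
    "Time": ["06:00", "13:00", "09:00", "17:00", "11:00", "18:00"],
})

# Assumption: dates are month/day/two-digit-year and times are 24-hour HH:MM
df["DateTime"] = pd.to_datetime(
    df["Date"] + " " + df["Time"], format="%m/%d/%y %H:%M"
)

# Earliest and latest timestamp per employee
shifts = df.groupby("Employee")["DateTime"].agg(["min", "max"])

# Illustrative follow-up calculation: hours between first and last entry
shifts["hours"] = (shifts["max"] - shifts["min"]).dt.total_seconds() / 3600
print(shifts)
```

Because `min` and `max` are real timestamps here, subtracting them gives a timedelta directly, which is the main convenience of the merged column.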

There are other options, depending on the content of your data. If you know that all the entries fall on the same day, you can simply do:

# First convert the Date and Time columns to date and time objects
df['Date'] = pd.to_datetime(df['Date']).dt.date
df['Time'] = pd.to_datetime(df['Time']).dt.time
df.groupby('Employee').agg(['min', 'max'])

There is no need to create a DateTime column in this case.

If you want the start and end times for each day, you can do:

# First convert the Date and Time columns to date and time objects
df['Date'] = pd.to_datetime(df['Date']).dt.date
df['Time'] = pd.to_datetime(df['Time']).dt.time
df.groupby(['Employee','Date'])['Time'].agg(['min', 'max'])
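
To get exactly the Employee / Date / Start / End layout asked for in the question, one possible variation on the per-day grouping uses pandas named aggregation. This is a sketch on the question's sample data, not part of the original answer. It assumes zero-padded 24-hour times and leaves Date as the original string, since it is only used as a grouping key. The `result` name is illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Employee": ["Emp1", "Emp1", "Emp2", "Emp2", "Emp3", "Emp3"],
    "Date": ["1/1/16"] * 6,
    "Time": ["06:00", "13:00", "09:00", "17:00", "11:00", "18:00"],
})

# Assumption: times are zero-padded 24-hour HH:MM strings
df["Time"] = pd.to_datetime(df["Time"], format="%H:%M").dt.time

# One row per employee and day, with the earliest and latest clock time
result = (
    df.groupby(["Employee", "Date"], as_index=False)
      .agg(Start=("Time", "min"), End=("Time", "max"))
)
print(result)
```

Passing `as_index=False` keeps Employee and Date as ordinary columns, so the result is a flat data frame that is ready for further calculations.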


Source: https://stackoverflow.com/questions/40747795/finding-start-time-and-end-time-in-a-column
