TypeError using timedelta, cannot sum times

Submitted by 魔方 西西 on 2021-02-11 04:55:25


I have data that looks like this:

    user                in               out location  flag     Time
0    ron  12/21/2021 10:11  12/21/2016 17:50     home     0  4:19:03
1    ron  12/21/2016 13:26  12/21/2016 13:52   office     2  0:25:28
2  april   12/21/2016 8:12  12/21/2016 17:27   office     0  8:15:03
3  april  12/21/2016 18:54  12/21/2016 22:56   office     0  4:02:36
4   andy   12/21/2016 8:57  12/21/2016 12:15     home     0  2:59:40

I want to sum or take the max value of time per user based on the flag. So I converted the column to timedelta.

    sample.loc[:, 'Time'] = pd.to_timedelta(sample['Time'])

However, when I try to test this by summing the entire column with the built-in sum(sample['Time']), I get the following error:

TypeError: unsupported operand type(s) for +: 'int' and 'Timedelta'

What am I missing here? I thought you could sum with Timedelta.


Python's built-in sum uses a start value of 0 by default, so the first thing it computes is 0 + Timedelta. That is where this error comes from: an int cannot be added to a Timedelta.
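
A minimal sketch of where the error comes from, assuming a small stand-in Series (the name times and its values are illustrative, not the asker's actual DataFrame):

```python
import pandas as pd

# Hypothetical stand-in for the 'Time' column from the question.
times = pd.to_timedelta(pd.Series(["4:19:03", "0:25:28", "8:15:03"]))

# sum(times) is evaluated as ((0 + times[0]) + times[1]) + ...,
# so the very first addition is int + Timedelta.
try:
    sum(times)
    failed = False
except TypeError as exc:
    failed = True
    print(exc)
```

The exception is raised on the first element, before any two Timedelta values are ever added together.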

This can be fixed in 2 ways:

  • Provide a different starting value to sum, perhaps an "empty" timedelta, as the second argument for sum:

    from datetime import timedelta
    sum(sample['Time'], timedelta())
  • Use Series.sum, that is sample['Time'].sum(), which will probably perform better anyway.
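
Both fixes can be sketched as follows. The DataFrame here is rebuilt by hand from the user, flag and Time columns shown in the question, and the per-user grouping at the end is only an assumption about what "per user based on the flag" means:

```python
from datetime import timedelta

import pandas as pd

# Rebuilt from the question's table (only the columns needed here).
sample = pd.DataFrame({
    "user": ["ron", "ron", "april", "april", "andy"],
    "flag": [0, 2, 0, 0, 0],
    "Time": ["4:19:03", "0:25:28", "8:15:03", "4:02:36", "2:59:40"],
})
sample["Time"] = pd.to_timedelta(sample["Time"])

# Fix 1: give the built-in sum a timedelta start value instead of 0.
total_builtin = sum(sample["Time"], timedelta())

# Fix 2: let pandas do the reduction.
total_series = sample["Time"].sum()

# Assumed follow-up: sum and max of Time per user and flag.
grouped = sample.groupby(["user", "flag"])["Time"]
per_user_sum = grouped.sum()
per_user_max = grouped.max()
```

Both totals should agree. The groupby variant is probably closer to what the question is ultimately after, since it avoids the built-in sum entirely.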



To get the total as a number of days from the values produced by pd.to_timedelta(), you need to do the following:


That is, you need to convert the 'Time' column to integers (nanoseconds) before calling sum(). Dividing by 8.64e+13, the number of nanoseconds in a day, converts the result from ns to days.
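
The conversion described above might look like the sketch below. It is not the answer's original code. It assumes the column is stored at nanosecond resolution (forced here with an explicit cast), and the names times and NS_PER_DAY are illustrative:

```python
import pandas as pd

# Hypothetical stand-in for the 'Time' column from the question.
times = pd.to_timedelta(pd.Series(["4:19:03", "0:25:28", "8:15:03"]))

# 24 h * 3600 s * 1e9 ns = 8.64e13 nanoseconds per day.
NS_PER_DAY = 8.64e13

# Make the nanosecond unit explicit, then view the values as integers.
as_ns = times.astype("timedelta64[ns]").astype("int64")
total_days = as_ns.sum() / NS_PER_DAY

# Cross-check without the integer detour: divide the Timedelta total by one day.
total_days_alt = times.sum() / pd.Timedelta(days=1)
```

Dividing the summed Timedelta by pd.Timedelta(days=1) gives the same number without hard-coding the nanosecond constant.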

