Given this base date:
base_date = "10/29 06:58 AM"
I want to find a tuple within the list that contains the closest date to the bas
I was looking up this problem and found some answers, most of which check all elements. I have my dates sorted (and assume most people do), so if you do as well, use numpy:
import numpy as np
# dates is a numpy array of np.datetime64 objects
dates = np.array([date1, date2, date3, ...], dtype=np.datetime64)
timestamp = np.datetime64('Your date')
np.searchsorted(dates, timestamp)
searchsorted uses binary search, which uses the fact the dates are sorted, and is thus very efficient. If you use pandas, this is possible:
dates = df.index # df is a DatetimeIndex-ed dataframe
timestamp = pd.to_datetime('your date here', format='its format')
np.searchsorted(dates, timestamp)
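The function returns the insertion index, i.e. the index of the first date at or after the searched date, which is not necessarily the closest one (if the searched date is included in dates, its index is returned; if that isn't wanted, pass side='right' as an argument to the function). If the searched date is later than every entry, the result equals len(dates), so check for that before indexing. To get the date, do this: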
dates[np.searchsorted(dates, timestamp)]
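Since np.searchsorted gives an insertion point (the first position whose date is at or after the timestamp), picking the nearest date in either direction takes one more comparison with the neighbour on the left. A minimal sketch, assuming dates is sorted ascending and non-empty; the helper name nearest_index and the sample values (with an arbitrarily chosen year) are made up for illustration:

```python
import numpy as np

def nearest_index(dates, timestamp):
    """Position of the element of sorted `dates` closest to `timestamp`."""
    i = np.searchsorted(dates, timestamp)  # first position with dates[i] >= timestamp
    if i == 0:
        return 0                    # timestamp is at or before the first date
    if i == len(dates):
        return len(dates) - 1       # timestamp is after the last date
    # compare the neighbours on either side of the insertion point
    left_gap = timestamp - dates[i - 1]
    right_gap = dates[i] - timestamp
    return i - 1 if left_gap <= right_gap else i

# hypothetical sample data, sorted ascending
dates = np.array(['2023-10-29T02:15', '2023-10-30T14:17', '2023-10-30T14:18'],
                 dtype='datetime64[m]')
timestamp = np.datetime64('2023-10-29T06:58')
print(dates[nearest_index(dates, timestamp)])
```

Ties go to the earlier date here; swap the comparison to `<` if the later one is preferred.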
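import datetime
fmt = '%m/%d %I:%M %p'  # %I (12-hour clock) is needed for %p to take effect
d = datetime.datetime.strptime(base_date, fmt)
def foo(x):
    # parse the date string held in the first element of the tuple
    return datetime.datetime.strptime(x[0], fmt)
# of the tuples dated after the base date, take the earliest one
min((x for x in list_date if foo(x) > d), key=foo)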
Consider putting the list of dates into a pandas index and then using the 'truncate' or 'get_loc' function.
import pandas as pd
##Initial inputs
list_date = [('10/30 02:18 PM', '-103', '-107'), ('10/29 02:15 AM', '-101', '-109'), ('10/30 02:17 PM', '+100', '-110')]
# reordered to show the method is insensitive to input order
base_date = "10/29 06:58 AM"
##Make a data frame with data
df=pd.DataFrame(list_date)
df.columns=['date','val1','val2']
dateIndex=pd.to_datetime(df['date'], format='%m/%d %I:%M %p')
df=df.set_index(dateIndex)
df=df.sort_index() #ascending, so the earliest comes on top
##Find the result
base_dateObj=pd.to_datetime(base_date, format='%m/%d %I:%M %p')
result=df.truncate(before=base_dateObj).iloc[0] #take the top value, i.e. the 1st at or after the base date
(result['date'],result['val1'], result['val2']) # result is ('10/30 02:17 PM', '+100', '-110')
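For the 'get_loc' route mentioned in this answer, one possible sketch uses Index.get_indexer, which accepts a method argument on a sorted, unique index: 'bfill' picks the next index value at or after the target, and 'nearest' picks the closest one in either direction. The variable names (target, pos_next, pos_nearest) are illustrative only; the data is the sample from this answer:

```python
import pandas as pd

list_date = [('10/30 02:18 PM', '-103', '-107'),
             ('10/29 02:15 AM', '-101', '-109'),
             ('10/30 02:17 PM', '+100', '-110')]
base_date = "10/29 06:58 AM"
fmt = '%m/%d %I:%M %p'

df = pd.DataFrame(list_date, columns=['date', 'val1', 'val2'])
df.index = pd.to_datetime(df['date'], format=fmt)
df = df.sort_index()  # get_indexer with a method needs a monotonic index

target = pd.DatetimeIndex([pd.to_datetime(base_date, format=fmt)])
# 'bfill' yields -1 when no index value lies at or after the target
pos_next = df.index.get_indexer(target, method='bfill')[0]
pos_nearest = df.index.get_indexer(target, method='nearest')[0]

next_row = tuple(df.iloc[pos_next]) if pos_next != -1 else None
nearest_row = tuple(df.iloc[pos_nearest])
print(next_row)
print(nearest_row)
```

The explicit -1 check matters because iloc[-1] would otherwise silently return the last row.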
import time
import sys
#The Function
def to_sec(date_string):
    return time.mktime(time.strptime(date_string, '%m/%d %I:%M %p'))
#The Test
base_date = "10/29 06:58 AM"
base_date_sec = to_sec(base_date)
result = None
difference = sys.maxsize  # sys.maxint no longer exists in Python 3
list_date = [
('10/30 02:18 PM', '-103', '-107'),
('10/30 02:17 PM', '+100', '-110'),
('10/29 02:15 AM', '-101', '-109') ]
for date_str in list_date:
    diff_sec = to_sec(date_str[0]) - base_date_sec
    # keep the smallest non-negative difference seen so far
    if diff_sec >= 0 and diff_sec < difference:
        result = date_str
        difference = diff_sec
print(result)
Linear search?
import sys
import time
base_date = "10/29 06:58 AM"
def str_to_my_time(my_str):
    return time.mktime(time.strptime(my_str, "%m/%d %I:%M %p"))
# assume year 1900...
base_dt = str_to_my_time(base_date)
list_date = [('10/30 02:18 PM', '-103', '-107'),
('10/30 02:17 PM', '+100', '-110'),
('10/29 02:15 AM', '-101', '-109')]
best_delta = sys.maxsize  # sys.maxint no longer exists in Python 3
best_match = None
for t in list_date:
    the_dt = str_to_my_time(t[0])
    delta_sec = the_dt - base_dt
    # keep the smallest non-negative delta seen so far
    if (delta_sec >= 0) and (delta_sec < best_delta):
        best_delta = delta_sec
        best_match = t
print(best_match, best_delta)
Producing:
('10/30 02:17 PM', '+100', '-110') 112740.0
Decorate, filter, find the closest date, undecorate:
>>> base_date = "10/29 06:58 AM"
>>> list_date = [
... ('10/30 02:18 PM', '-103', '-107'),
... ('10/30 02:17 PM', '+100', '-110'),
... ('10/29 02:15 AM', '-101', '-109')
... ]
>>> import datetime
>>> fmt = '%m/%d %I:%M %p'
>>> base_d = datetime.datetime.strptime(base_date, fmt)
>>> candidates = ((datetime.datetime.strptime(d, fmt), d, x, y) for d, x, y in list_date)
>>> candidates = min((dt, d, x, y) for dt, d, x, y in candidates if dt > base_d)
>>> print(candidates[1:])
('10/30 02:17 PM', '+100', '-110')
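"Closest" can mean either the next date at or after the base date (as in the answers here) or the nearest date in either direction. A small standard-library sketch of both readings, which also returns None instead of raising when nothing qualifies. The function names are hypothetical, and since the strings carry no year, the sketch pins one explicitly (1900, which is also strptime's default) on the assumption that all dates fall in the same year:

```python
from datetime import datetime

FMT = '%m/%d %I:%M %p'

def parse(s):
    # assumption: every date belongs to the same year, pinned here to 1900
    return datetime.strptime('1900 ' + s, '%Y ' + FMT)

def next_at_or_after(rows, base):
    """Earliest row dated at or after `base`, or None if there is none."""
    base_dt = parse(base)
    later = (r for r in rows if parse(r[0]) >= base_dt)
    return min(later, key=lambda r: parse(r[0]), default=None)

def nearest(rows, base):
    """Row whose date is closest to `base` in either direction, or None if `rows` is empty."""
    base_dt = parse(base)
    return min(rows, key=lambda r: abs(parse(r[0]) - base_dt), default=None)

list_date = [
    ('10/30 02:18 PM', '-103', '-107'),
    ('10/30 02:17 PM', '+100', '-110'),
    ('10/29 02:15 AM', '-101', '-109'),
]
base_date = "10/29 06:58 AM"

print(next_at_or_after(list_date, base_date))
print(nearest(list_date, base_date))
```

With the sample data the two readings disagree: the next date is 10/30 02:17 PM, while the nearest overall is 10/29 02:15 AM.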