I have a pandas dataframe containing a record of lightning strikes with timestamps and global positions in the following format:
Index Date Time
Depending on your data, this may or may not be useful. Some strikes may be "isolated" in time, i.e. separated from both the previous strike and the next one by more than the time threshold. You can use these strikes to split your data into groups, and then process each group using searchsorted along the lines suggested by ysearka. If your data ends up split into hundreds of groups, this might save time.
Here is what the code would look like:
import numpy as np
import pandas as pd
# first of all, convert to timestamp
df['DateTime'] = pd.to_datetime(df['Date'].astype(str) + 'T' + df['Time'])
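# calculate the time gap to the previous and to the following strike, and keep the smaller one
# (the first and last rows get NaT here, so they are never flagged as isolated)
df['time_separation'] = np.minimum(df['DateTime'].diff().values,
                                   -df['DateTime'].diff(-1).values)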
# using a specific threshold (0.08 s) for illustration
df['is_isolated'] = df['time_separation'] > pd.Timedelta("00:00:00.08")
# define groups
df['group'] = (df['is_isolated'] != df['is_isolated'].shift()).cumsum()
# put isolated strikes into a separate group so they can be skipped
df.loc[df['is_isolated'], 'group'] = -1
Here is the output, with the specific threshold I used:
        Lat      Lon                      DateTime  is_isolated  group
0   -7.1961 -60.7604 2016-01-01 00:00:00.996269200        False      1
1   -7.0518 -60.6911 2016-01-01 00:00:01.064620700        False      1
2  -25.3913 -57.2922 2016-01-01 00:00:01.110206600        False      1
3   -7.4842 -60.5129 2016-01-01 00:00:01.201857300         True     -1
4   -7.3939 -60.4992 2016-01-01 00:00:01.294275000         True     -1
5   -9.6386 -62.8448 2016-01-01 00:00:01.443149300        False      3
6  -23.7089 -58.8888 2016-01-01 00:00:01.522615700        False      3
7   -6.3513 -55.6545 2016-01-01 00:00:01.593241200        False      3
8  -23.8019 -58.9382 2016-01-01 00:00:01.673635000        False      3
9  -24.5724 -57.7229 2016-01-01 00:00:01.695785800        False      3
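
For the per-group step, here is a minimal sketch of how searchsorted could be applied inside each group. It is only my reading of the idea, and ysearka's actual code may differ. The helper name `window_bounds`, the `pairs` list, and the reuse of the same 0.08 s threshold as the search window are assumptions made for illustration. The timestamps are copied from the output above, and the spatial check on Lat/Lon is left out.

```python
import numpy as np
import pandas as pd

# Timestamps copied from the output above (Lat/Lon omitted for brevity).
df = pd.DataFrame({"DateTime": pd.to_datetime([
    "2016-01-01 00:00:00.996269200",
    "2016-01-01 00:00:01.064620700",
    "2016-01-01 00:00:01.110206600",
    "2016-01-01 00:00:01.201857300",
    "2016-01-01 00:00:01.294275000",
    "2016-01-01 00:00:01.443149300",
    "2016-01-01 00:00:01.522615700",
    "2016-01-01 00:00:01.593241200",
    "2016-01-01 00:00:01.673635000",
    "2016-01-01 00:00:01.695785800",
])})
threshold = pd.Timedelta(milliseconds=80)  # assumed: same value as above

# Same grouping logic as in the answer.
df["time_separation"] = np.minimum(df["DateTime"].diff().values,
                                   -df["DateTime"].diff(-1).values)
df["is_isolated"] = df["time_separation"] > threshold
df["group"] = (df["is_isolated"] != df["is_isolated"].shift()).cumsum()
df.loc[df["is_isolated"], "group"] = -1


def window_bounds(times, window):
    """Hypothetical helper: for sorted timestamps, return arrays (lo, hi)
    such that times[lo[i]:hi[i]] are all within `window` of times[i]."""
    values = times.to_numpy()
    delta = window.to_timedelta64()
    lo = np.searchsorted(values, values - delta, side="left")
    hi = np.searchsorted(values, values + delta, side="right")
    return lo, hi


# Collect pairs of strikes that are close in time, one group at a time,
# skipping the isolated strikes (group == -1).
pairs = []
for _, block in df[df["group"] != -1].groupby("group"):
    _, hi = window_bounds(block["DateTime"], threshold)
    labels = block.index.to_numpy()
    for i in range(len(block)):
        # Only look at later strikes so each pair is listed once.
        for j in range(i + 1, hi[i]):
            pairs.append((int(labels[i]), int(labels[j])))

print(pairs)
```

As long as the search window is no larger than the isolation threshold, splitting at isolated strikes should not lose any pairs, since an isolated strike has no neighbour within the threshold on either side. The candidate pairs can then be filtered by distance.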