I have a DF with bookingDt and arrivalDt columns. I need to find all the dates between these two dates.
Sample code:
You can do the following.
First, create a DataFrame that contains only dates, dates_df, with one row for every day between the first bookingDt and the last arrivalDt.
Then join the two DataFrames with a between condition:
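from pyspark.sql.functions import col
# Alias both sides so that col('df.…') and col('dates_df.…') can be resolved.
df.alias('df').join(
    dates_df.alias('dates_df'),
    on=col('dates_df.dates').between(col('df.bookingDt'), col('df.arrivalDt')),
).select('df.*', 'dates_df.dates')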
This might even run faster than the solution with explode, but you need to work out the start and end dates for dates_df yourself.
Size is not a concern: a dates_df covering 10 years has only about 3,650 rows.
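For the first step, here is a minimal sketch of one way to build dates_df. It assumes bookingDt and arrivalDt are DateType columns (so the aggregated bounds come back as Python date objects), that df is not empty, and that spark is an existing SparkSession. The helper names calendar_rows and build_dates_df are made up for illustration.

```python
from datetime import date, timedelta


def calendar_rows(start, end):
    """Return one single-field tuple per day from start to end, inclusive."""
    n_days = (end - start).days
    return [(start + timedelta(days=i),) for i in range(n_days + 1)]


def build_dates_df(spark, df):
    """Build dates_df spanning min(bookingDt) .. max(arrivalDt) of df.

    Assumes both columns are DateType and df has at least one row.
    """
    # Imported here so the pure-Python helper above works without pyspark.
    from pyspark.sql import functions as F

    bounds = df.agg(
        F.min('bookingDt').alias('start'),
        F.max('arrivalDt').alias('end'),
    ).first()
    # The column is named 'dates' to match the join condition above.
    return spark.createDataFrame(
        calendar_rows(bounds['start'], bounds['end']), ['dates']
    )
```

With this in place, dates_df = build_dates_df(spark, df) can be passed straight to the join. The calendar is generated on the driver, which should be fine for a few thousand rows.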