You are going on a one-way indirect flight trip that includes an unknown, very large number of transfers.
First of all, create some kind of subtrip structure that contains a part of your route.
For example, if your complete trip is a-b-c-d-e-f-g, a subtrip could be b-c-d, i.e. a connected subpath of your trip.
Now, create two hashtables that map a city to the subtrip structure the city is contained in. One hashtable is keyed by the city a subtrip starts with, the other by the city a subtrip ends with. That means a city can occur at most once in each of the hashtables.
As we will see later, not every city needs to be stored, but only the beginning and the end of each subtrip.
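As a rough illustration of these structures, here is a minimal Python sketch. The class name `Subtrip` and the dictionary names `starts` and `ends` are made up for this example, and using a `deque` is just one assumed way to get cheap insertion at both ends.

```python
from collections import deque


class Subtrip:
    """A connected piece of the route, stored as a sequence of cities."""

    def __init__(self, cities):
        # deque gives O(1) insertion at both the front and the back
        self.cities = deque(cities)

    @property
    def first(self):
        return self.cities[0]

    @property
    def last(self):
        return self.cities[-1]


starts = {}  # first city of a subtrip -> that subtrip
ends = {}    # last city of a subtrip  -> that subtrip

# Register the subtrip b-c-d from the example above.
trip = Subtrip(["b", "c", "d"])
starts[trip.first] = trip
ends[trip.last] = trip

# The inner city "c" is stored in neither table.
print(sorted(starts), sorted(ends))
```

Only the two endpoints of a subtrip ever become keys, so each table holds at most one entry per subtrip.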
Now, take the tickets one after another. Assume the current ticket goes from x to y (represented by (x,y)). Check whether x is the end of some subtrip s (since every city is visited only once, it cannot be the end of another subtrip already). If it is, just add the current ticket (x,y) at the end of the subtrip s. If there is no subtrip ending with x, check whether there is a subtrip t beginning with y. If so, add (x,y) at the beginning of t. If there is no such subtrip t either, just create a new subtrip containing just (x,y).
Dealing with subtrips should be done using some special "tricks":
- Creating a new subtrip s containing (x,y) should add x to the hashtable for "subtrip beginning cities" and add y to the hashtable for "subtrip ending cities".
- Adding a new ticket (x,y) at the beginning of the subtrip s=(y,...) should remove y from the hashtable of beginning cities and instead add x to the hashtable of beginning cities.
- Adding a new ticket (x,y) at the end of the subtrip s=(...,x) should remove x from the hashtable of ending cities and instead add y to the hashtable of ending cities.
With this structure, finding the subtrip corresponding to a city can be done in amortized O(1).
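The ticket-processing rule together with the hashtable updates could look roughly like this in Python. This is only a sketch: the function name `build_subtrips` is invented here, subtrips are assumed to be stored as `deque`s of cities, and tickets are assumed to be `(x, y)` tuples.

```python
from collections import deque


def build_subtrips(tickets):
    """First phase: grow subtrips ticket by ticket.

    Returns two dicts: first city -> subtrip and last city -> subtrip.
    """
    starts = {}
    ends = {}
    for x, y in tickets:
        if x in ends:
            # Some subtrip ends with x: append the ticket, the end moves to y.
            s = ends.pop(x)
            s.append(y)
            ends[y] = s
        elif y in starts:
            # Some subtrip begins with y: prepend the ticket, the start moves to x.
            t = starts.pop(y)
            t.appendleft(x)
            starts[x] = t
        else:
            # Neither case applies: open a new subtrip holding just (x, y).
            s = deque([x, y])
            starts[x] = s
            ends[y] = s
    return starts, ends


tickets = [("a", "b"), ("c", "d"), ("e", "f"), ("b", "c"), ("d", "e")]
starts, ends = build_subtrips(tickets)
for city in sorted(starts):
    print(city, list(starts[city]))
```

For this ticket order the first phase ends with three subtrips, a-b-c, c-d-e and e-f. Neighbouring subtrips share their junction city, which is what the later concatenation step relies on.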
After this is done for all tickets, we have some subtrips. Note that we have at most ⌈(n-1)/2⌉ = O(n) such subtrips after the procedure, since two consecutive tickets of the trip never both start a new subtrip.
Now, we just consider the subtrips one after another. If we have a subtrip s=(x,...,y), we look in our hashtable of ending cities to see whether there is a subtrip t=(...,x) ending with x. If so, we concatenate t and s to a new subtrip. If not, we know that s is our first subtrip; then we look whether there is another subtrip u=(y,...) beginning with y. If so, we concatenate s and u. We do this until just one subtrip is left (this subtrip is then our whole original trip).
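A possible sketch of this second phase in Python follows. It deviates from the description in one respect: instead of pointer concatenation it copies each subtrip into one result list, which still touches every city only once. The function name `join_subtrips` is hypothetical, and subtrips are assumed to be sequences of cities in which neighbouring subtrips share their junction city.

```python
def join_subtrips(subtrips):
    """Second phase: chain the subtrips into the complete trip."""
    pieces = [list(s) for s in subtrips]
    starts = {p[0]: p for p in pieces}   # first city -> subtrip
    end_cities = {p[-1] for p in pieces}

    # The first subtrip is the one whose first city ends no other subtrip.
    first = next(p for p in pieces if p[0] not in end_cities)

    trip = list(first)
    # Keep following the subtrip that begins where the trip currently ends.
    while trip[-1] in starts:
        nxt = starts[trip[-1]]
        trip.extend(nxt[1:])  # skip the shared junction city
    return trip


parts = [["c", "d", "e"], ["a", "b", "c"], ["e", "f"]]
full_trip = join_subtrips(parts)
print("-".join(full_trip))
```

Every hashtable lookup is amortized O(1), and every city is copied exactly once, so the total work stays linear in the number of tickets.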
I hope I didn't overlook something, but this algorithm should run in O(n):
- The first phase (building the subtrips from all O(n) tickets) can be done in O(n), if we implement adding tickets to a subtrip in O(1). This should be no problem if we have some nice pointer structure (e.g. implementing subtrips as linked lists). Changing two values in the hashtable is also (amortized) O(1). Thus, this phase consumes O(n) time.
- The second phase (concatenating the subtrips) also takes O(n). To see this, we just need to look at what is done in the second phase: hashtable lookups, which need amortized O(1), and subtrip concatenation, which can be done in O(1) with pointer concatenation or something similar.
Thus, the whole algorithm takes time O(n), which might be the optimal O-bound, since in the worst case every ticket needs to be looked at.