One-way flight trip problem

后端未结

关注

 18  1454

不要未来只要你来

You are going on a one-way indirect flight trip that includes ~~billions~~ an unknown very large number of transfers.

You are not stoppi

相关标签:

18条回答

一生所求

2020-12-12 17:47
Prerequisites

First of all, create some kind of subtrip structure that contains a part of your route.

For example, if your complete trip is a-b-c-d-e-f-g, a subtrip could be b-c-d, i.e. a connected subpath of your trip.

Now, create two hashtables that map a city to the subtrip structure the city is contained in. Thereby, one Hashtable stands for the city a subtrip is starting with, the other stands for the cities a subtrip is ending with. That means, one city can occur at most once in one of the hashtables.

As we will see later, not every city needs to be stored, but only the beginning and the end of each subtrip.

Constructing subtrips

Now, take the tickets just one after another. We assume the ticket to go from x to y (represented by (x,y)). Check, wheter x is the end of some subtrip s(since every city is visited only once, it can not be the end of another subtrip already). If x is the beginning, just add the current ticket (x,y) at the end of the subtrip s. If there is no subtrip ending with x, check whether there is a subtrip t beginning with y. If so, add (x,y) at the beginning of t. If there's also no such subtrip t, just create a new subtrip containing just (x,y).

Dealing with subtrips should be done using some special "tricks".
- Creating a new subtrip s containing (x,y) should add x to the hashtable for "subtrip beginning cities" and add y to the hashtable for "subtrip ending cities".
- Adding a new ticket (x,y) at the beginning of the subtrip s=(y,...), should remove y from the hashtable of beginning cities and instead add x to the hashtable of beginning cities.
- Adding a new ticket (x,y) at the end of the subtrip s=(...,x), should remove x from the hashtable of ending cities and instead add y to the hashtable of ending cities.
With this structure, subtrips corresponding to a city can be done in amortized O(1).

After this is done for all tickets, we have some subtrips. Note the fact that we have at most (n-1)/2 = O(n) such subtrips after the procedure.

Concatenating subtrips

Now, we just consider the subtrips one after another. If we have a subtrip s=(x,...,y), we just look in our hashtable of ending cities, if there's a subtrip t=(...,x) ending with x. If so, we concatenate t and s to a new subtrip. If not, we know, that s is our first subtrip; then, we look, if there's another subtrip u=(y,...) beginning with y. If so, we concatenate s and u. We do this until just one subtrip is left (this subtrip is then our whole original trip).

I hope I didnt overlook somtehing, but this algorithm should run in:
- constructing all subtrips (at most O(n)) can be done in O(n), if we implement adding tickets to a subtrip in O(1). This should be no problem, if we have some nice pointer structure or something like that (implementing subtrips as linked lists). Also changing two values in the hashtable is (amortized) O(1). Thus, this phase consumes O(n) time.
- concatenating the subtrips until just one is left can also be done in O(n). Too see this, we just need to look at what is done in the second phase: Hashtable lookups, that need amortized O(1) and subtrip concatenation that can be done in O(1) with pointer concatenation or something.
Thus, the whole algorithm takes time O(n), which might be the optimal O-bound, since at least every ticket might need to be looked at.
0 讨论(0)
发布评论:

提交评论
- 加载中...

無奈伤痛

2020-12-12 17:50

It's basically a dependency graph where every ticket represents a node and the src and dst airport represents directed links, so use a topological sort to determine the flight order.

EDIT: Although since this is an airline ticket and you know you actually made an itinerary you could physically perform, sort by departure date and time in UTC.

EDIT2: Assuming each airport you have a ticket to uses a three character code, you can use the algorithm described here (Find three numbers appeared only once) to determine the two unique airports by xoring all the airports together.

EDIT3: Here's some C++ to actually solve this problem using the xor method. The overall algorithm is as follows, assuming a unique encoding from airport to an integer (either assuming a three letter airport code or encoding the airport location in an integer using latitude and longitude):

First, XOR all the airport codes together. This should be equal to the initial source airport XOR the final destination airport. Since we know that the initial airport and the final airport are unique, this value should not be zero. Since it's not zero, there will be at least one bit set in that value. That bit corresponds to a bit that is set in one of the airports and not set in the other; call it the designator bit.

Next, set up two buckets, each with the XORed value from the first step. Now, for every ticket, bucket each airport according to whether it has the designator bit set or not, and xor the airport code with the value in the bucket. Also keep track for each bucket how many source airports and destination airports went to that bucket.

After you process all the tickets, pick one of the buckets. The number of source airports sent to that bucket should be one greater or less than the number of destination airports sent to that bucket. If the number of source airports is less than the number of destination airports, that means the initial source airport (the only unique source airport) was sent to the other bucket. That means the value in the current bucket is the identifier for the initial source airport! Conversely, if the number of destination airports is less than the number of source airports, the final destination airport was sent to the other bucket, so the current bucket is the identifier for the final destination airport!

struct ticket
{
    int src;
    int dst;
};

int get_airport_bucket_index(
    int airport_code, 
    int discriminating_bit)
{
    return (airport_code & discriminating_bit)==discriminating_bit ? 1 : 0;
}

void find_trip_endpoints(const ticket *tickets, size_t ticket_count, int *out_src, int *out_dst)
{
    int xor_residual= 0;

    for (const ticket *current_ticket= tickets, *end_ticket= tickets + ticket_count; current_ticket!=end_ticket; ++current_ticket)
    {
        xor_residual^= current_ticket->src;
        xor_residual^= current_ticket->dst;
    }

    // now xor_residual will be equal to the starting airport xor ending airport
    // since starting airport!=ending airport, they have at least one bit that is not in common
    // 

    int discriminating_bit= xor_residual & (-xor_residual);

    assert(discriminating_bit!=0);

    int airport_codes[2]= { xor_residual, xor_residual };
    int src_count[2]= { 0, 0 };
    int dst_count[2]= { 0, 0 };

    for (const ticket *current_ticket= tickets, *end_ticket= tickets + ticket_count; current_ticket!=end_ticket; ++current_ticket)
    {
        int src_index= get_airport_bucket_index(current_ticket->src, discriminating_bit);

        airport_codes[src_index]^= current_ticket->src;
        src_count[src_index]+= 1;

        int dst_index= get_airport_bucket_index(current_ticket->dst, discriminating_bit);
        airport_codes[dst_index]^= current_ticket->dst;
        dst_count[dst_index]+= 1;
    }

    assert((airport_codes[0]^airport_codes[1])==xor_residual);
    assert(abs(src_count[0]-dst_count[0])==1); // all airports with the bit set/unset will be accounted for as well as either the source or destination
    assert(abs(src_count[1]-dst_count[1])==1);
    assert((src_count[0]-dst_count[0])==-(src_count[1]-dst_count[1]));

    int src_index= src_count[0]-dst_count[0]<0 ? 0 : 1; 
    // if src < dst, that means we put more dst into the source bucket than dst, which means the initial source went into the other bucket, which means it should be equal to this bucket!

    assert(get_airport_bucket_index(airport_codes[src_index], discriminating_bit)!=src_index);

    *out_src= airport_codes[src_index];
    *out_dst= airport_codes[!src_index];

    return;
}

int main()
{
    ticket test0[]= { { 1, 2 } };
    ticket test1[]= { { 1, 2 }, { 2, 3 } };
    ticket test2[]= { { 1, 2 }, { 2, 3 }, { 3, 4 } };
    ticket test3[]= { { 2, 3 }, { 3, 4 }, { 1, 2 } };
    ticket test4[]= { { 2, 1 }, { 3, 2 }, { 4, 3 } };
    ticket test5[]= { { 1, 3 }, { 3, 5 }, { 5, 2 } };

    int initial_src, final_dst;

    find_trip_endpoints(test0, sizeof(test0)/sizeof(*test0), &initial_src, &final_dst);
    assert(initial_src==1);
    assert(final_dst==2);

    find_trip_endpoints(test1, sizeof(test1)/sizeof(*test1), &initial_src, &final_dst);
    assert(initial_src==1);
    assert(final_dst==3);

    find_trip_endpoints(test2, sizeof(test2)/sizeof(*test2), &initial_src, &final_dst);
    assert(initial_src==1);
    assert(final_dst==4);

    find_trip_endpoints(test3, sizeof(test3)/sizeof(*test3), &initial_src, &final_dst);
    assert(initial_src==1);
    assert(final_dst==4);

    find_trip_endpoints(test4, sizeof(test4)/sizeof(*test4), &initial_src, &final_dst);
    assert(initial_src==4);
    assert(final_dst==1);

    find_trip_endpoints(test5, sizeof(test5)/sizeof(*test5), &initial_src, &final_dst);
    assert(initial_src==1);
    assert(final_dst==2);

    return 0;
}

0 讨论(0)

不要未来只要你来

2020-12-12 17:51
Easiest way is with hash tables, but that doesn't have the best worst-case complexity (O(n²))

Instead:
1. Create a bunch of nodes containing (src, dst) O(n)
2. Add the nodes to a list and sort by src O(n log n)
3. For each (destination) node, search the list for the corresponding (source) node O(n log n)
4. Find the start node (for instance, using a topological sort, or marking nodes in step 3) O(n)
Overall: O(n log n)

(For both algorithms, we assume the length of the strings is negligible ie. comparison is O(1))
0 讨论(0)
发布评论:

提交评论
- 加载中...

星月不相逢

2020-12-12 17:51

I have written a small python program, uses two hash tables one for count and another for src to dst mapping. The complexity depends on the implementation of the dictionary. if dictionary has O(1) then complexity is O(n) , if dictionary has O( lg(n) ) like in STL map, then complexity is O( n lg(n) )

import random
# actual journey: a-> b -> c -> ... g -> h
journey = [('a','b'), ('b','c'), ('c','d'), ('d','e'), ('e','f'), ('f','g'), ('g','h')]
#shuffle the journey.
random.shuffle(journey)
print("shffled journey : ", journey )
# Hashmap to get the count of each place
map_count = {}
# Hashmap to find the route, contains src to dst mapping
map_route = {}

# fill the hashtable
for j in journey:
    source = j[0]; dest = j[1]
    map_route[source] = dest
    i = map_count.get(source, 0)
    map_count[ source ] = i+1
    i = map_count.get(dest, 0)
    map_count[ dest ] = i+1

start = ''
# find the start point: the map entry with count = 1 and 
# key exists in map_route.
for (key,val) in map_count.items():
    if map_count[key] == 1 and map_route.has_key(key):
        start = key
        break

print("journey started at : %s" % start)
route = [] # the route
n = len(journey)  # number of cities.
while n:
    route.append( (start, map_route[start]) )
    start = map_route[start]
    n -= 1

print(" Route : " , route )

0 讨论(0)

广开言路

2020-12-12 17:54
No need for hashes or something alike. The real input size here is not necessarily the number of tickets (say n), but the total 'size' (say N) of the tickets, the total number of char needed to code them.

If we have a alphabet of k characters (here k is roughly 42) we can use bucketsort techniques to sort an array of n strings of a total size N that are encoded with an alphabet of k characters in O(n + N + k) time. The following works if n <= N (trivial) and k <= N (well N is billions, isn't it)
1. In the order the tickets are given, extract all airport codes from the tickets and store them in a struct that has the code as a string and the ticket index as a number.
2. Bucketsort that array of structs according to their code
3. Run trough that sorted array and assign an ordinal number (starting from 0) to each newly encountered airline code. For all elements with the same code (they are consecutive) go to the ticket (we have stored the number with the code) and change the code (pick the right, src or dst) of the ticket to the ordinal number.
4. During this run through the array we may identify original source src0.
5. Now all tickets have src and dst rewritten as ordinal numbers, and the tickets may be interpreted as one list starting in src0.
6. Do a list ranking (= toplogical sort with keeping track of the distance from src0) on the tickets.
0 讨论(0)
发布评论:

提交评论
- 加载中...

孤独总比滥情好

2020-12-12 17:56

Create two data structures:

Route
{
  start
  end
  list of flights where flight[n].dest = flight[n+1].src
}

List of Routes

And then:

foreach (flight in random set)
{
  added to route = false;
  foreach (route in list of routes)
  {
    if (flight.src = route.end)
    {
      if (!added_to_route)
      {
        add flight to end of route
        added to route = true
      }
      else
      {
        merge routes
        next flight
      }
    }
    if (flight.dest = route.start)
    {
      if (!added_to_route)
      {
        add flight to start of route
        added to route = true
      }
      else
      {
        merge routes
        next flight
      }
    }
  }
  if (!added to route)
  {
    create route
  }
}

0 讨论(0)

上一页 1 2 3

One-way flight trip problem

Prerequisites

Constructing subtrips

Concatenating subtrips