One-way flight trip problem

不要未来只要你来

You are going on a one-way indirect flight trip that includes ~~billions~~ an unknown very large number of transfers.

You are not stoppi

18 answers

醉话见心

2020-12-12 17:40
If you assume a joinable list structure that can store everything (probably on disk):
1. Create 2 empty hash tables, S and D.
2. Grab the first element.
3. Look up its src in D.
4. If found, remove the associated node from D and link it to the current node.
5. If not found, insert the node into S keyed on src.
6. Repeat from step 3 the other way (src<->des, S<->D): look up its des in S, and if not found insert the node into D keyed on des.
7. Repeat from step 2 with the next node.
O(n) time. As for space, the birthday paradox (or something much like it) will keep your data set a lot smaller than the full set. In the bad-luck case where it still gets too large (worst case is O(n)), you can evict random runs from the hash table and insert them at the end of the processing queue. Your speed could go to pot, but as long as you can far exceed the threshold for expecting collisions (~O(sqrt(n))), you should expect to see your dataset (the tables and input queue combined) regularly shrink.
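
The numbered steps above could be sketched roughly as follows. This is only an in-memory illustration, not the on-disk structure the answer has in mind: the `Node` class and the `link_tickets` name are made up for the example, tickets are assumed to be `(src, des)` pairs, and no airport is assumed to be visited twice. The eviction trick for oversized tables is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical list node wrapping one ticket."""
    src: str
    des: str
    next: Optional["Node"] = None


def link_tickets(tickets: Iterable[Tuple[str, str]]) -> List[str]:
    S = {}  # unmatched nodes keyed on src (still waiting for the ticket before them)
    D = {}  # unmatched nodes keyed on des (still waiting for the ticket after them)
    for src, des in tickets:
        node = Node(src, des)
        # Steps 3-5: is there a ticket that arrives where this one departs?
        prev = D.pop(src, None)
        if prev is not None:
            prev.next = node
        else:
            S[src] = node
        # Step 6: the same check the other way round.
        nxt = S.pop(des, None)
        if nxt is not None:
            node.next = nxt
        else:
            D[des] = node
    # With a single unbroken trip, only the first ticket is left in S.
    (head,) = S.values()
    route = [head.src]
    while head is not None:
        route.append(head.des)
        head = head.next
    return route
```

For example, `link_tickets([("C", "D"), ("A", "B"), ("B", "C")])` walks the linked nodes from `A` through to `D`.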

无人及你

2020-12-12 17:43

Construct a hash table and add each airport to it.

<key,value> = <airport, count>

The count for an airport increases each time the airport appears as either the source or the destination of a ticket. So every airport ends with a count of 2 (1 for src and 1 for dst), except for the source and the destination of your trip, which each have a count of 1.

You need to look at each ticket at least once. So complexity is O(n).
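
A minimal sketch of the counting idea, assuming tickets are `(src, dst)` pairs and no airport is visited twice (the function name `trip_endpoints` is made up here). Note that the counts alone identify the two end airports but do not say which one is the start:

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def trip_endpoints(tickets: Iterable[Tuple[str, str]]) -> Set[str]:
    counts = Counter()
    for src, dst in tickets:
        counts[src] += 1  # airport seen as a source
        counts[dst] += 1  # airport seen as a destination
    # Intermediate airports are counted twice, the two ends only once.
    return {airport for airport, c in counts.items() if c == 1}
```

To tell the start from the end, one option would be to keep separate src and dst counts instead of a single one.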

被撕碎了的回忆

2020-12-12 17:43
Put the tickets into two hashes: to_end = src -> des; to_beg = des -> src

Pick any airport as a starting point S.
```
while(to_end[S] != null)
   S = to_end[S];
```
S is now your final destination. Repeat with the other map to find your starting point.

Without properly checking, this feels O(N), provided you have a decent Hash table implementation.
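
For illustration, here is how the two walks might look in Python, with plain dicts standing in for the two hashes. The function name `find_endpoints` is hypothetical, tickets are assumed to be a non-empty list of `(src, des)` pairs, and the trip is assumed never to revisit an airport (otherwise the loops would not terminate).

```python
from typing import List, Tuple


def find_endpoints(tickets: List[Tuple[str, str]]) -> Tuple[str, str]:
    to_end = {src: des for src, des in tickets}
    to_beg = {des: src for src, des in tickets}

    # Walk forward from an arbitrary airport until no ticket departs from it.
    s = tickets[0][0]
    while s in to_end:
        s = to_end[s]
    final = s

    # Walk backward from the same airport until no ticket arrives at it.
    s = tickets[0][0]
    while s in to_beg:
        s = to_beg[s]
    start = s

    return start, final
```

Each airport is stepped over at most once per walk, which is where the O(N) estimate comes from.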

悲&欢浪女

2020-12-12 17:44

Construct two hash tables (or tries), one keyed on src and the other on dst. Choose one ticket at random and look up its dst in the src-hash table. Repeat that process for the result until you hit the end (the final destination). Now look up its src in the dst-keyed hash table. Repeat the process for the result until you hit the beginning.

Constructing the hash tables takes O(n) and constructing the list takes O(n), so the whole algorithm is O(n).

EDIT: You only need to construct one hash table, actually. Let's say you construct the src-keyed hash table. Choose one ticket at random and like before, construct the list that leads to the final destination. Then choose another random ticket from the tickets that have not yet been added to the list. Follow its destination until you hit the ticket you initially started with. Repeat this process until you have constructed the entire list. It's still O(n) since worst case you choose the tickets in reverse order.

Edit: got the table names swapped in my algorithm.
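
The single-table variant from the EDIT could look something like this. It is a sketch under a few assumptions: tickets are `(src, dst)` pairs, no airport is visited twice, and instead of picking tickets at random it simply takes them in input order. The names `order_tickets`, `placed`, and `chains` are made up for the example.

```python
from typing import List, Tuple

Ticket = Tuple[str, str]


def order_tickets(tickets: List[Ticket]) -> List[Ticket]:
    by_src = {src: (src, dst) for src, dst in tickets}  # the src-keyed table
    placed = set()  # src airports of tickets already put into some chain
    chains = []     # each later chain belongs in front of the earlier ones

    for start in tickets:
        if start[0] in placed:
            continue
        chain = []
        t = start
        # Follow destinations until the trip ends or we reach a placed ticket.
        while t is not None and t[0] not in placed:
            chain.append(t)
            placed.add(t[0])
            t = by_src.get(t[1])
        chains.append(chain)

    # The first chain reaches the final destination, so reverse the chain order.
    return [t for chain in reversed(chains) for t in chain]
```

Every ticket is appended to exactly one chain, so the total work stays linear even when the tickets arrive in reverse order.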

感情败类

2020-12-12 17:45

Each airport is a node. Each ticket is an edge. Make an adjacency matrix to represent the graph. This can be done as a bit field to compress the edges. Your starting point will be the node that has no path into it (its column will be empty). Once you know this, you just follow the paths that exist.
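
A toy sketch of the bit-field matrix, using one Python integer per row (bit `j` of `rows[i]` set means there is a ticket from airport `i` to airport `j`). The function name and the sorted airport numbering are assumptions for the example. A full matrix needs on the order of V² bits for V airports, so this only illustrates the idea on small inputs.

```python
from typing import List, Tuple


def route_from_matrix(tickets: List[Tuple[str, str]]) -> List[str]:
    airports = sorted({a for ticket in tickets for a in ticket})
    index = {a: i for i, a in enumerate(airports)}

    rows = [0] * len(airports)  # one bit-field row per departure airport
    for src, dst in tickets:
        rows[index[src]] |= 1 << index[dst]

    # OR-ing all rows marks every column that has at least one bit set.
    has_incoming = 0
    for row in rows:
        has_incoming |= row

    # The start is the airport whose column is empty.
    cur = next(i for i in range(len(airports)) if not ((has_incoming >> i) & 1))
    route = [airports[cur]]
    while rows[cur]:
        # Each row holds at most one set bit; its position is the next airport.
        cur = rows[cur].bit_length() - 1
        route.append(airports[cur])
    return route
```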

Alternately, you could build a structure indexable by airport. For each ticket you look up its src and dst. If either is not found, then you need to add the new airport to your list. When both are found, you set the departure airport's exit pointer to point to the destination, and the destination's arrival pointer to point to the departure airport. When you are out of tickets, you must traverse the entire list to determine which airport does not have a path in.

Another way would be to have a variable-length list of mini-trips that you connect together as you encounter each ticket. Each time you add a ticket, you see if the ends of any existing mini-trip match either the src or dest of your ticket. If not, then your current ticket becomes its own mini-trip and is added to the list. If so, then the new ticket is tacked on to the end(s) of the existing trip(s) that it matches, possibly splicing two existing mini-trips together, in which case it would shorten the list of mini-trips by one.

广开言路

2020-12-12 17:47

A hash table may not work well at very large sizes (such as the billions in the original question): once the table no longer fits in memory, its random access pattern becomes expensive. You could instead use a binary search tree, which would give you complexity O(n log n).

The simplest way is with two passes: The first adds them all to the tree, indexed by src. The second walks the tree and collects the nodes into an array.
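
Python has no built-in binary search tree, so the sketch below swaps in a sorted list with `bisect`, which gives the same O(n log n) behaviour for this purpose. It reads "walking" as repeatedly looking up the next ticket by src, which is one possible interpretation. The names `route_sorted` and `_contains` are made up, and tickets are assumed to be `(src, dst)` pairs with no airport visited twice.

```python
import bisect
from typing import List, Tuple


def _contains(sorted_keys: List[str], key: str) -> bool:
    i = bisect.bisect_left(sorted_keys, key)
    return i < len(sorted_keys) and sorted_keys[i] == key


def route_sorted(tickets: List[Tuple[str, str]]) -> List[str]:
    # First pass: index every ticket by src (sorted list instead of a tree).
    by_src = sorted(tickets)
    srcs = [src for src, _ in by_src]
    dsts = sorted(dst for _, dst in tickets)

    # The trip starts at the one src that never appears as a dst.
    cur = next(src for src in srcs if not _contains(dsts, src))

    # Second pass: follow the tickets, one O(log n) lookup per step.
    route = [cur]
    while True:
        i = bisect.bisect_left(srcs, cur)
        if i == len(srcs) or srcs[i] != cur:
            break  # no ticket departs from here: final destination
        cur = by_src[i][1]
        route.append(cur)
    return route
```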

Can we do better? We can, if we really want to: we can do it in one pass. Represent each ticket as a node on a linked list. Initially, each node has a null value for its next pointer. For each ticket, enter both its src and dest in the index. If there's a collision, that means that we already have the adjacent ticket; connect the nodes and delete the match from the index. When you're done, you'll have made only one pass, and have an index that holds nothing but the trip's two end airports (which never find a match), and a linked list of all the tickets in order.

This method is significantly faster: it's only one pass, not two; and the store is significantly smaller (worst case: n/2 ; best case: 1; typical case: sqrt(n)), enough so that you might be able to actually use a hash instead of a binary search tree.
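
To make the store-size claim easier to experiment with, here is a small sketch of the one-pass method that also reports the peak size of the index. A dict stands in for the index, next pointers are kept in a list by ticket position, and the name `link_one_pass` is hypothetical. It assumes `(src, dst)` pairs and no airport visited twice. In this sketch the index ends with two entries, the trip's origin and final destination, and the origin entry is used to find the head of the list.

```python
from typing import List, Optional, Tuple

Ticket = Tuple[str, str]


def link_one_pass(tickets: List[Ticket]) -> Tuple[List[Ticket], int]:
    index = {}  # airport -> position of the ticket still waiting for a neighbour
    nxt: List[Optional[int]] = [None] * len(tickets)  # next pointers
    peak = 0

    for i, (src, dst) in enumerate(tickets):
        j = index.pop(src, None)
        if j is None:
            index[src] = i
        else:
            nxt[j] = i  # ticket j arrives where ticket i departs
        j = index.pop(dst, None)
        if j is None:
            index[dst] = i
        else:
            nxt[i] = j  # ticket j departs from where ticket i arrives
        peak = max(peak, len(index))

    # Of the two leftover entries, the head is the one stored under its own src.
    cur = next(i for airport, i in index.items() if tickets[i][0] == airport)
    ordered = []
    while cur is not None:
        ordered.append(tickets[cur])
        cur = nxt[cur]
    return ordered, peak
```

Running it on shuffled input of various sizes and printing `peak` is a quick way to see how the index behaves for a given ticket order.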
