Find the largest k numbers in k arrays stored across k machines

悲哀的现实 2020-12-28 19:34

This is an interview question. I have K machines, each of which is connected to one central machine. Each of the K machines has an array of 4-byte numbers in a file. You can use

7 answers
  •  谎友^ (original poster)
    2020-12-28 20:12

    • Find the k largest numbers on each machine. This is O(n*log(k)), where n is the number of elements on that machine.
    • Combine the results (on a centralized server if k is not huge; otherwise you can merge them in a tree hierarchy across the server cluster).
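
The first step above could be sketched roughly as follows. This is a minimal illustration, not the answerer's code: it assumes the numbers on one machine can be iterated as Python integers, and the name `local_top_k` is hypothetical. It keeps a min-heap of at most k elements, which is one common way to get the O(n*log(k)) bound.

```python
import heapq

def local_top_k(numbers, k):
    """Return the k largest values from `numbers`, in descending order.

    Hypothetical helper: keeps a min-heap of at most k elements, so each of
    the n inputs costs at most O(log k).
    """
    if k <= 0:
        return []
    heap = []  # min-heap; heap[0] is the smallest of the current top k
    for x in numbers:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest kept value, so swap it in
            heapq.heapreplace(heap, x)
    # Sort once so the central machine receives a descending list
    return sorted(heap, reverse=True)
```

Returning the list in descending order costs an extra O(k*log(k)) per machine, but it lets the combine step read each list from its head.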

    Update: to be clear, the combine step is not a sort; you just pick the top k numbers from the results. There are many ways to do this efficiently. For example, if each machine returns its k numbers in descending order, you can use a max-heap: push the head of each list, then repeatedly remove the largest element from the heap and push the next element of the list it came from. Doing this k times gives you the result. All this is O(k*log(k)).
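
The heap-based combine step described in the update might look like the sketch below. It assumes each machine has already sent its candidates as a list sorted in descending order; the function name `merge_top_k` is hypothetical, and Python's `heapq` is a min-heap, so values are negated to simulate a max-heap.

```python
import heapq

def merge_top_k(sorted_lists, k):
    """Pick the k largest values from several descending-sorted lists.

    Hypothetical helper: heap entries are (-value, list_index, position),
    so the smallest tuple corresponds to the largest remaining value.
    """
    # Seed the heap with the head of every non-empty list
    heap = [(-lst[0], i, 0) for i, lst in enumerate(sorted_lists) if lst]
    heapq.heapify(heap)

    result = []
    while heap and len(result) < k:
        neg_value, i, pos = heapq.heappop(heap)
        result.append(-neg_value)
        # Push the next element from the list the popped value came from
        nxt = pos + 1
        if nxt < len(sorted_lists[i]):
            heapq.heappush(heap, (-sorted_lists[i][nxt], i, nxt))
    return result
```

With k lists, the heap never holds more than k entries, so the k pops and pushes together cost O(k*log(k)), matching the bound stated above.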
