Given a file, find the ten most frequently occurring words as efficiently as possible

予麋鹿 2020-12-12 13:26

This is apparently an interview question (I found it in a collection of interview questions), but even if it's not, it's pretty cool.

We are told to do this efficien

15 answers
  •  陌清茗 (original poster)
    2020-12-12 14:00

    I think a trie is a reasonable choice of data structure here.

    In the trie, each node can store a count: the frequency of the word spelled by the characters on the path from the root to that node.

    Building the trie takes O(Ln) ~ O(n) time, where n is the number of words in the file and L is the number of characters in the longest word, which we can treat as a constant. To find the top 10 words, we can traverse the trie, which also costs O(n). So the whole problem can be solved in O(n) time.
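
    A minimal Python sketch of this idea, under some assumptions of mine: a "word" is a lowercase run of letters and apostrophes, and the names TrieNode, build_trie, top_k and tokenize are illustrative only. The answer says only to traverse the trie; keeping a min-heap of at most ten entries during that traversal is one possible way to pick the winners without sorting every word.

```python
import heapq
import re


class TrieNode:
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}  # maps a character to the child node
        self.count = 0      # how many times the word ending here was seen


def tokenize(text):
    # Assumption: words are runs of letters/apostrophes, case-insensitive.
    return re.findall(r"[a-z']+", text.lower())


def build_trie(words):
    # O(L) work per word, where L is the word length.
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            child = node.children.get(ch)
            if child is None:
                child = TrieNode()
                node.children[ch] = child
            node = child
        node.count += 1
    return root


def top_k(root, k=10):
    # Iterative depth-first traversal; the heap never holds more than k items.
    heap = []  # min-heap of (count, word)
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if node.count:
            entry = (node.count, prefix)
            if len(heap) < k:
                heapq.heappush(heap, entry)
            elif entry > heap[0]:
                heapq.heapreplace(heap, entry)
        for ch, child in node.children.items():
            stack.append((child, prefix + ch))
    # Most frequent first; ties broken alphabetically.
    ordered = sorted(heap, key=lambda item: (-item[0], item[1]))
    return [(word, count) for count, word in ordered]
```

    For a real file, the words could be fed in line by line (for example, by calling tokenize on each line) so that the whole file never has to sit in memory at once; only the trie does.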
