Sorted hash table (map, dictionary) data structure design

猫巷女王i 2020-12-29 00:14

Here's a description of the data structure:

It operates like a regular map with get, put, and remove methods, but has a

6 answers
  •  盖世英雄少女心
    2020-12-29 00:22

    For "O(1) get, put, and remove operations" you essentially need O(1) lookup, which implies a hash function (as you know). But what makes a hash function good usually conflicts with keeping the keys easy to sort. (If adjacent values mapped to the same bucket, the table would degenerate to O(N) lookups on lots of common data, which is the worst case a hash function is normally meant to avoid.)
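
A small sketch of that degenerate case, under stated assumptions: integer keys, 128 buckets, and a made-up order-preserving bucket function that scales the key into the bucket range. The names `order_preserving_bucket`, `modulo_bucket`, and `longest_chain` are hypothetical and only serve the illustration.

```python
from collections import Counter

N_BUCKETS = 128
KEY_RANGE = 1_000_000  # assumed upper bound on keys, for illustration only


def order_preserving_bucket(key):
    # Hypothetical order-preserving "hash": larger keys land in later buckets,
    # so walking the buckets in order would visit keys in sorted order.
    return key * N_BUCKETS // KEY_RANGE


def modulo_bucket(key):
    # A plain modulo, which sends adjacent integer keys to different buckets.
    return key % N_BUCKETS


def longest_chain(bucket_of, keys):
    # Size of the fullest bucket, i.e. the worst-case lookup cost.
    return max(Counter(bucket_of(k) for k in keys).values())


adjacent_keys = range(1000, 1100)  # 100 consecutive keys
print(longest_chain(order_preserving_bucket, adjacent_keys))  # 100
print(longest_chain(modulo_bucket, adjacent_keys))            # 1
```

With the order-preserving function, all 100 neighbouring keys share one bucket, so a lookup among them is a linear scan. The modulo version spreads them out but loses the ordering.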

    I can think of a way to get you 90% of the way there. Set up a hashtable alongside a parallel sorted index. The index has a clean part (ordered) and a dirty part (unordered). It maps keys to the values (or to references to the values stored in the hashtable, whichever suits you in terms of performance or memory use). When you add to the hashtable, the new entry is pushed onto the back of the dirty part. When you remove from the hashtable, the entry is nulled out or removed from the clean or dirty part of the index. Sorting the index sorts only the dirty entries, then merges them into the already sorted clean part. And obviously you can iterate over the index.

    As far as I can see, this gives you O(1) for get and put but not for remove, which also has to find the entry in the index. It is still fairly simple to implement with standard containers (at least as provided by C++, Java, or Python). It also meets the "second sort is cheaper" condition: you only sort the dirty index entries and then do an O(N) merge. The cost of all this is obviously extra memory for the index and extra indirection when using it.
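
A minimal Python sketch of the hashtable-plus-index idea described in this answer. It is one possible reading, not a definitive implementation: the class name `SortedIndexMap` and its method names are hypothetical, the index stores keys only (values stay in the dict), and `remove` deletes the key from the index eagerly rather than nulling it out.

```python
import bisect
import heapq


class SortedIndexMap:
    """Hash table plus a parallel key index with a sorted 'clean' part
    and an unsorted 'dirty' part (hypothetical name and API)."""

    def __init__(self):
        self._table = {}   # hash table: key -> value
        self._clean = []   # keys already in sorted order
        self._dirty = []   # keys added since the last sort, in arrival order

    def get(self, key, default=None):
        return self._table.get(key, default)       # average O(1)

    def put(self, key, value):
        if key not in self._table:
            self._dirty.append(key)                 # amortized O(1)
        self._table[key] = value

    def remove(self, key):
        if key not in self._table:
            return False
        del self._table[key]
        # Cleaning up the index is what makes remove slower than O(1).
        i = bisect.bisect_left(self._clean, key)
        if i < len(self._clean) and self._clean[i] == key:
            del self._clean[i]                      # O(N) shift in a list
        else:
            self._dirty.remove(key)                 # linear scan of dirty part
        return True

    def sort(self):
        # Sort only the dirty keys, then merge them into the clean part.
        if self._dirty:
            self._dirty.sort()
            self._clean = list(heapq.merge(self._clean, self._dirty))
            self._dirty = []

    def items(self):
        self.sort()
        return [(k, self._table[k]) for k in self._clean]


m = SortedIndexMap()
for k in (5, 1, 4):
    m.put(k, str(k))
print(m.items())  # [(1, '1'), (4, '4'), (5, '5')]
```

Calling `items()` again after a few more `put` calls only sorts the newly added keys before the linear merge, which is where the cheaper second sort comes from.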
