C++ STL Map vs Vector speed

日久生厌 2020-12-16 11:56

In the interpreter for my experimental programming language I have a symbol table. Each symbol consists of a name and a value (the value can be e.g.: of type string, int, fu

12 answers
  • 2020-12-16 12:14

    Normally you'd use a symbol table to look up a variable given its name as it appears in the source. In this case, you only have the name to work with, so there's nowhere to store the cached position of the variable in the symbol table. So I'd say a map is a good choice. The [] operator takes time proportional to the log of the number of elements in the map. If that turns out to be slow, you could use a hash map like std::tr1::unordered_map (std::unordered_map since C++11).

  • 2020-12-16 12:18

    std::map's operator[] is O(log n); see Wikipedia: http://en.wikipedia.org/wiki/Map_(C%2B%2B)

    Since you look up symbols often, a map is certainly the right choice. A hash map (std::unordered_map) might improve performance further.

  • 2020-12-16 12:18

    You say: "If the variable, still using vector, is found the first time, I can store its exact integer position in the vector with it."

    You can do the same with the map: search the variable using find and store the iterator pointing to it instead of the position.
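
    A minimal sketch of that idea, assuming int values and a hypothetical VarRef node that the interpreter keeps for each variable reference (neither name comes from the question):

```cpp
#include <map>
#include <string>

// Hypothetical symbol table: variable name -> integer value.
using SymbolTable = std::map<std::string, int>;

// Hypothetical node for one variable reference in the program. It remembers
// where the symbol lives after the first successful lookup.
struct VarRef {
    std::string name;
    SymbolTable::iterator cached;  // meaningful only when resolved is true
    bool resolved = false;
};

// Returns a pointer to the variable's value, or nullptr if it is undefined.
// Only the first successful call pays for the O(log n) search.
int* resolve(SymbolTable& table, VarRef& ref) {
    if (!ref.resolved) {
        auto it = table.find(ref.name);
        if (it == table.end()) return nullptr;
        ref.cached = it;
        ref.resolved = true;
    }
    return &ref.cached->second;
}
```

    The cached iterator stays valid while other symbols are inserted or erased; it only becomes invalid if that symbol itself is erased, so the cache has to be reset in that case.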

  • 2020-12-16 12:19

    A map will scale much better, which will be an important feature. Also, don't forget that with a map you can (unlike with a vector) take pointers and references to elements that stay valid when other elements are inserted or erased. So you can "cache" variables with a map just as validly as with a vector. A map is almost certainly the right choice here.
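
    A small sketch of the pointer-stability point, with hypothetical helper names and int values assumed for brevity:

```cpp
#include <map>
#include <string>

// Hypothetical helper: inserts `count` filler symbols named "tmp0", "tmp1", ...
void add_fillers(std::map<std::string, int>& table, int count) {
    for (int i = 0; i < count; ++i) {
        table["tmp" + std::to_string(i)] = i;
    }
}

// Returns true if a pointer taken before the insertions still refers to the
// same element afterwards. std::map guarantees this; a std::vector that had
// to reallocate would not.
bool pointer_survives_growth() {
    std::map<std::string, int> table;
    table["x"] = 7;
    int* cached = &table["x"];   // pointer to the mapped value
    add_fillers(table, 1000);    // the tree grows, but nodes do not move
    *cached = 8;                 // write through the old pointer
    return cached == &table["x"] && table["x"] == 8;
}
```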

  • 2020-12-16 12:20

    For looking up values by a string key, a map is the appropriate data type, as other users have mentioned.

    STL maps are usually implemented as self-balancing trees, such as red-black trees, and their operations take O(log n) time.

    My advice is to wrap the table manipulation code in functions,
    like table_has(name), table_put(name) and table_get(name).

    That way you can easily change the internal symbol table representation if you run into
    slow run-time performance, and you can embed caching in those routines later.
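
    One possible shape for such wrappers, as a sketch only. The explicit table argument, the value parameter on table_put, and the int value type are assumptions added for illustration:

```cpp
#include <string>
#include <unordered_map>

// Assumption: values are plain ints here; a real interpreter would store
// its own value type (string, int, function, ...).
using Value = int;

// The container is named in one place only, so switching to std::map or a
// sorted vector later does not touch any caller.
struct SymbolTable {
    std::unordered_map<std::string, Value> entries;
};

// Returns true if a symbol with this name exists.
bool table_has(const SymbolTable& table, const std::string& name) {
    return table.entries.find(name) != table.entries.end();
}

// Inserts the symbol, or overwrites its value if it already exists.
void table_put(SymbolTable& table, const std::string& name, Value value) {
    table.entries[name] = value;
}

// Returns a pointer to the stored value, or nullptr if the name is unknown.
Value* table_get(SymbolTable& table, const std::string& name) {
    auto it = table.entries.find(name);
    return it == table.entries.end() ? nullptr : &it->second;
}
```

    Callers only ever see the three functions, so a lookup cache could later be added inside table_get without changing the rest of the interpreter.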

  • 2020-12-16 12:22

    You effectively have a number of alternatives.

    Libraries exist:

    • Loki::AssocVector: the interface of a map implemented over a vector of pairs; faster than a map for small or frozen sets because of cache locality.
    • Boost.MultiIndex: provides both a list with fast lookup and an example of implementing an MRU (Most Recently Used) list, which caches the most recently accessed elements.

    Critique

    • Map lookup and retrieval take O(log N), but the items may be scattered throughout memory, which works against the CPU cache.
    • Vectors are more cache-friendly; however, unless you keep the vector sorted, find is O(N). Is that acceptable?
    • Why not use an unordered_map? It provides O(1) lookup and retrieval on average (though the constant may be high) and is certainly suited to this task. If you have a look at Wikipedia's article on hash tables, you'll see that many strategies are available, and you can certainly pick one that suits your particular usage pattern.
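
    To make the sorted-vector option concrete, here is a rough sketch of a map-like table over a sorted vector of pairs. It only illustrates the general idea and is not Loki's actual implementation; FlatTable and its members are hypothetical names, and int values are assumed:

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flat symbol table: (name, value) pairs kept sorted by name,
// so lookups can use binary search over contiguous memory.
class FlatTable {
public:
    // Inserts or overwrites. Insertion is O(n) because later elements shift.
    void put(const std::string& name, int value) {
        auto it = lower(name);
        if (it != items_.end() && it->first == name) {
            it->second = value;
        } else {
            items_.insert(it, std::make_pair(name, value));
        }
    }

    // Binary search, O(log n). Returns nullptr if the name is absent.
    // Unlike with std::map, the pointer may be invalidated by a later put().
    int* get(const std::string& name) {
        auto it = lower(name);
        return (it != items_.end() && it->first == name) ? &it->second : nullptr;
    }

private:
    using Items = std::vector<std::pair<std::string, int>>;

    // First element whose name is not less than `name`.
    Items::iterator lower(const std::string& name) {
        return std::lower_bound(
            items_.begin(), items_.end(), name,
            [](const Items::value_type& item, const std::string& key) {
                return item.first < key;
            });
    }

    Items items_;
};
```

    This trades cheap insertion for cache-friendly lookup, which is why it fits small or rarely changing symbol sets best.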