In the interpreter for my experimental programming language I have a symbol table. Each symbol consists of a name and a value (the value can be e.g.: of type string, int, fu
Normally you'd use a symbol table to look up the variable given its name as it appears in the source. In this case, you only have the name to work with, so there's nowhere to store the cached position of the variable in the symbol table. So I'd say a map is a good choice. The [] operator takes time proportional to the log of the number of elements in the map; if it turns out to be slow, you could use a hash map like std::tr1::unordered_map.
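To make the comparison concrete, here is a minimal sketch of the two containers behind one lookup helper. It assumes a C++11 compiler (so std::unordered_map instead of the tr1 version); Value, TreeTable, HashTable and lookup_or are hypothetical names, and int merely stands in for the interpreter's real value type.

```cpp
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the interpreter's value type.
using Value = int;

// Tree-based table: lookup cost grows with the log of the table size.
using TreeTable = std::map<std::string, Value>;

// Hash-based table: lookup is constant time on average.
using HashTable = std::unordered_map<std::string, Value>;

// Works with either table type and returns `fallback` for an unbound name.
// It uses find() rather than operator[], because operator[] would insert
// a default-constructed entry whenever the name is missing.
template <typename Table>
Value lookup_or(const Table& table, const std::string& name, Value fallback) {
    auto it = table.find(name);
    return it != table.end() ? it->second : fallback;
}
```

Because both containers share the find/end interface, switching from one to the other is a one-line change of the table type, which makes it cheap to measure before committing.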
std::map's operator[] is O(log n); see Wikipedia: http://en.wikipedia.org/wiki/Map_(C%2B%2B)
Since you look up symbols often, a map is certainly the right choice. A hash map (std::unordered_map) might improve performance further.
You say: "If the variable, still using vector, is found the first time, I can store its exact integer position in the vector with it."
You can do the same with the map: search for the variable using find and store the iterator pointing to it instead of the position.
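A minimal sketch of that idea, assuming each variable reference in the program is represented by some node that can hold the cached iterator. VariableRef, resolve and the int Value type are hypothetical names for illustration, not part of the original interpreter.

```cpp
#include <map>
#include <string>
#include <utility>

using Value = int;  // hypothetical stand-in for the interpreter's value type
using SymbolTable = std::map<std::string, Value>;

// Hypothetical node for one variable reference in the program.
struct VariableRef {
    std::string name;
    SymbolTable::iterator cached;  // only meaningful once `resolved` is true
    bool resolved;

    explicit VariableRef(std::string n)
        : name(std::move(n)), cached(), resolved(false) {}
};

// Returns a pointer to the variable's value, or nullptr if it is unbound.
// The O(log n) find() runs only on the first successful resolution.
Value* resolve(SymbolTable& table, VariableRef& ref) {
    if (!ref.resolved) {
        SymbolTable::iterator it = table.find(ref.name);
        if (it == table.end()) {
            return nullptr;
        }
        ref.cached = it;
        ref.resolved = true;
    }
    return &ref.cached->second;
}
```

The cached iterator stays valid while other symbols are inserted or erased; it only becomes invalid if that particular symbol is erased, so the cache would have to be reset in that case.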
A map will scale much better, which will be an important feature. Also, don't forget that with a map (unlike a vector) you can take pointers and references to elements, and they stay valid while other elements are inserted. So you could "cache" variables with a map just as validly as with a vector. A map is almost certainly the right choice here.
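As a small illustration of the pointer stability mentioned here, the following sketch hands out a pointer to a variable's slot. slot_for and the int Value type are hypothetical names chosen for the example.

```cpp
#include <map>
#include <string>

using Value = int;  // hypothetical stand-in for the interpreter's value type
using SymbolTable = std::map<std::string, Value>;

// Returns a pointer to the slot for `name`, creating the slot if needed.
// For std::map this pointer remains valid until that entry is erased,
// no matter how many other entries are inserted in the meantime.
Value* slot_for(SymbolTable& table, const std::string& name) {
    return &table[name];
}
```

The same pattern with a std::vector would be unsafe, because a push_back that triggers reallocation invalidates every pointer into the vector.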
For looking up values by a string key, a map is the appropriate data type, as other users have mentioned.
STL map implementations are usually built on self-balancing trees, such as red-black trees, and their operations take O(log n) time.
My advice is to wrap the table manipulation code in functions, like table_has(name), table_put(name) and table_get(name).
That way you can easily change the inner symbol table representation if you experience slow run-time performance, and you can embed cache functionality in those routines later.
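One possible shape for such wrappers is sketched below. The function names come from the advice above; the Table struct, the extra value parameter on table_put, the int Value type and the choice to throw on a missing name are all assumptions made for the example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

using Value = int;  // hypothetical stand-in for the interpreter's value type

// The representation lives in one place; callers only use the functions below.
struct Table {
    std::map<std::string, Value> entries;
};

// True if `name` is currently bound.
bool table_has(const Table& t, const std::string& name) {
    return t.entries.count(name) != 0;
}

// Binds `name` to `value`, overwriting any previous binding.
void table_put(Table& t, const std::string& name, const Value& value) {
    t.entries[name] = value;
}

// Returns the value bound to `name`; throws std::out_of_range if unbound.
const Value& table_get(const Table& t, const std::string& name) {
    std::map<std::string, Value>::const_iterator it = t.entries.find(name);
    if (it == t.entries.end()) {
        throw std::out_of_range("undefined symbol: " + name);
    }
    return it->second;
}
```

If profiling later shows the map to be a bottleneck, only the member type of Table and these three bodies need to change; the rest of the interpreter keeps calling the same functions.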
You effectively have a number of alternatives.
Libraries exist:
- a vector of pairs, faster than a map for small or frozen sets because of cache locality.
Critique:
- Map lookup takes O(log N), but the items may be scattered throughout memory, thus not playing well with caching strategies.
- A plain vector has O(N) performance on find; is that acceptable?
- Why not use an unordered_map? They provide O(1) lookup and retrieval (though the constant may be high) and are certainly suited to this task. If you have a look at Wikipedia's article on Hash Tables you'll realize that there are many strategies available and you can certainly pick one that will suit your particular usage pattern.
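For the vector-of-pairs alternative, a rough sketch is given below. It keeps the vector sorted by name so that lookups can use binary search instead of a linear scan; that sorting step is an assumption of this example, and FlatTable, Entry and the int Value type are hypothetical names rather than any particular library's API.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using Value = int;  // hypothetical stand-in for the interpreter's value type
using Entry = std::pair<std::string, Value>;

// A symbol table stored as a vector of (name, value) pairs sorted by name.
// Entries are contiguous in memory, which is friendly to the CPU cache.
class FlatTable {
public:
    // Inserts or overwrites a binding. Insertion is O(n) because later
    // entries are shifted, so this suits small or rarely modified tables.
    void put(const std::string& name, const Value& value) {
        std::vector<Entry>::iterator it =
            std::lower_bound(entries_.begin(), entries_.end(), name, by_name);
        if (it != entries_.end() && it->first == name) {
            it->second = value;
        } else {
            entries_.insert(it, Entry(name, value));
        }
    }

    // Binary search, O(log n). Returns nullptr if `name` is unbound.
    // The pointer is invalidated by any later put() that inserts an entry.
    const Value* get(const std::string& name) const {
        std::vector<Entry>::const_iterator it =
            std::lower_bound(entries_.begin(), entries_.end(), name, by_name);
        return (it != entries_.end() && it->first == name) ? &it->second
                                                           : nullptr;
    }

private:
    // Ordering used by lower_bound: compares an entry's name with a key.
    static bool by_name(const Entry& entry, const std::string& name) {
        return entry.first < name;
    }

    std::vector<Entry> entries_;
};
```

Unlike the map, pointers returned by get cannot be cached across insertions here, so this layout trades the caching trick for better memory locality.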