I'll start by illustrating a simple use case example:
Consider the problem of a social security ID database, where in C++ code is modelled as a
std::unordered_map
where its key is the social security ID of a person and its value is astd::string
with the full-name of that person (e.g.,std::unordered_map<int, std::string> DB;
).Consider also, that there's a request for printing this database sorted in ascending order based on the person's ID (i.e.,
std::unordered_map
's key).Naively, one would think to use
std::sort
in order to sort thestd::unordered_map
according to the requested criteria and then print it, like the example code below:
std::sort(DB.begin(), DB.end());
for(auto p : DB) std::cout << "ID(" << p.first
<< ") - "
<< p.second
<< std::endl;
- However, this is not the case, because use of
std::sort
with a range of either astd::unordered_map
or astd::unordered_set
will raise a compiler error.
Questions:
- Why STL's unordered containers cannot be sorted by
std::sort
? - Is there a legitimate and efficient way to sort either a
std::unordered_map
or astd::unordered_set
?
unordered
containers store internally hashed data and thus it's not possible to order them after the hash has been generated.
In order to sort the data you can use an additional non-hashed container (e.g. map or set) and either use them along with the unordered version (so you can use the normal one to sort the data and the unordered one to have fast per-item access) or you can do something like
std::map<int, int> ordered(unordered.begin(), unordered.end());
for(auto it = ordered.begin(); it != ordered.end(); ++it)
std::cout << it->second;
I recommend not to do the above often (unordered containers have slow sequential access)
Sorting only makes sense for sequence containers, which are containers whose elements are determined by the order in which they were added to the container. The dynamic sequence containers in the standard library are vector, deque, list and forward_list.
Maps and sets, on the other hand, are associative containers, in which elements are identified by their value. Thus it makes no sense to ask for an "ordering", since the container elements aren't arranged in any kind of sequence. (It's true that an ordered map can be iterated in a comparison order on the key, but that order emerges from the container; it is not provided by the user.)
1.Why STL's unordered containers cannot be sorted by
std::sort
?
Because unordered containers are already "sorted", albeit not directly by their keys, but by (typically) hash_function
(key) %
bucket_count
()
(also accessible as bucket
(key)
). This "sort" order isn't cosmetic - it's the whole basis on which hash tables are able to find elements quickly. If std::sort
were allowed to re-order the elements by key instead, then the container would no longer be able to function as a hash table: elements couldn't be reliably found or erased, insertions might put duplicates in the container etc..
2.Is there a legitimate and efficient way to sort either a
std::unordered_map
or astd::unordered_set
?
In the general case, only by first copying the elements to a sortable or sorted container such as std::vector
or std::set
(the former will usually be faster, but benchmark both if you really care):
std::unordered_set<T> my_set = ...;
std::vector<T> my_vec{my_set.begin(), my_set.end(), my_set.size()};
std::sort(my_vec.begin(), my_vec.end());
In your case with std::unordered_map<int, std::string> DB;
, I'd suggest copying only the int
keys to a vector
for sorting, then during iteration look up each key in the unordered_map
: that will avoid a lot of string
copying.
(It is sometimes possible to orchestrate an unordered container with ordering by key (e.g. hash function returns key, container presized so max bucket index >= max key value) but anyone considering such abuse would be better off using a vector
.)
来源:https://stackoverflow.com/questions/24212356/why-stl-unordered-map-and-unordered-set-cannot-be-sorted-by-stl-algorithms