How to measure Memory Usage of std::unordered_map

Submitted by 五迷三道 on 2019-12-18 18:54:39

Question


We know that hash-table-based container implementations like std::unordered_map use a lot of memory, but I don't know how much.

Setting aside space-complexity notation, and not considering whether a container element is a pointer to a larger object:

Is there any way to figure out how many bytes are being used by such a container at run time?

Is there a way to tell, at runtime, how much memory any container uses?


Answer 1:


If you want a rough size, I think bucket_count() and max_load_factor() are enough; they give the current bucket count and the maximum load factor.

Rationale:

  • If load_factor <= 1, then bucket_count() >= the number of items in the map, so bucket_count() is an upper bound on the element count.

  • If load_factor > 1, then bucket_count() * max_load_factor() bounds the number of items in the map. Note this is a maximum, not the real size.

So a rough estimate (in elements, not bytes) could look like this:

  #include <cstddef>

  // Upper bound on the number of elements the map can hold before it
  // rehashes (an element count, not bytes).
  template <class Map>
  std::size_t rough_count(const Map& mymap) {
    std::size_t n = mymap.bucket_count();
    float m = mymap.max_load_factor();
    if (m > 1.0f) {
      return static_cast<std::size_t>(n * m);
    } else {
      return n;
    }
  }

If you want a more accurate figure, you need to walk all the buckets and count how many elements are in each:

  std::size_t count = 0;
  for (std::size_t i = 0; i < mymap.bucket_count(); ++i) {
    std::size_t bucket_size = mymap.bucket_size(i);
    if (bucket_size == 0) {
      count++;               // an empty bucket still occupies one slot
    } else {
      count += bucket_size;  // one slot per element stored in the bucket
    }
  }
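
To convert that slot count to bytes, a hypothetical conversion is shown below; the per-slot cost is an assumption, since the standard does not pin down the node layout:

  // Assumed costs: one head pointer per bucket, plus one node
  // (next pointer + stored value) per counted slot.
  std::size_t approx_bytes =
      mymap.bucket_count() * sizeof(void*) +
      count * (sizeof(void*) + sizeof(decltype(mymap)::value_type));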



Answer 2:


There's no portable way to tell how many bytes are in use. What you can find out is:

  • size() indicates how many data elements have been inserted into the container
  • bucket_count() indicates how many buckets the underlying hash table has, each of which can be expected to host a linked list of the associated elements

Now:

  • bytes actually used for element storage will be m.size() * sizeof(M::value_type)

  • bytes used for the hash table buckets depend on the way the internal lists are stored - std::unordered_map::bucket_size has constant complexity, so we might reasonably guess there'll be a size and a head pointer per bucket, making m.bucket_count() * (sizeof(size_t) + sizeof(void*)) a reasonable guess (though it may be that bucket_size is only amortised constant, with load_factor() bounded and no size stored per bucket - I'd prefer implementing it that way myself)

  • since each inserted element is part of a list, it'll need a next pointer, so we can add another m.size() * sizeof(void*)

  • each memory allocation may be rounded up to a size convenient for the memory allocation library's management - e.g. the next power of two, which approaches 100% worst-case inefficiency and 50% on average - so let's add 50% just for the list nodes, as the buckets are likely powers of two given size_t and a pointer: 50% * size() * (sizeof(void*) + sizeof(M::value_type)); the sketch after this list pulls these terms together

  • especially in debug mode, there may be any amount of implementation-specific housekeeping and error-detection data
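
Pulling those terms together, a minimal sketch of the estimate (assuming the node-based layout, per-bucket bookkeeping, and ~50% rounding overhead described above, none of which the standard guarantees) might look like:

  #include <cstddef>

  template <class Map>
  std::size_t estimated_bytes(const Map& m) {
    std::size_t elems   = m.size() * sizeof(typename Map::value_type);              // element payload
    std::size_t buckets = m.bucket_count() * (sizeof(std::size_t) + sizeof(void*)); // per-bucket size + head pointer
    std::size_t links   = m.size() * sizeof(void*);                                 // per-node next pointers
    std::size_t slack   = m.size() * (sizeof(void*) + sizeof(typename Map::value_type)) / 2; // ~50% rounding
    return elems + buckets + links + slack;
  }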

You can explore this further by creating a number of large tables and seeing how top or Process Manager reports different memory usage.
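
Another option is to instrument the container's allocator so it reports actual allocations at runtime. A minimal counting-allocator sketch (the CountingAllocator type and global counter here are illustrative, not from the original answer):

  #include <cstddef>
  #include <functional>
  #include <iostream>
  #include <new>
  #include <unordered_map>
  #include <utility>

  // Bytes currently held by the container through this allocator.
  static std::size_t g_allocated = 0;

  template <class T>
  struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}
    T* allocate(std::size_t n) {
      g_allocated += n * sizeof(T);
      return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) {
      g_allocated -= n * sizeof(T);
      ::operator delete(p);
    }
  };

  template <class T, class U>
  bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
  template <class T, class U>
  bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

  int main() {
    std::unordered_map<int, int, std::hash<int>, std::equal_to<int>,
                       CountingAllocator<std::pair<const int, int>>> m;
    for (int i = 0; i < 100000; ++i) m[i] = i;
    std::cout << "bytes allocated by the map: " << g_allocated << '\n';
  }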



Source: https://stackoverflow.com/questions/25375202/how-to-measure-memory-usage-of-stdunordered-map
