performance of emplace is worse than check followed by emplace

I have a std::unordered_map with a value_type that does not have a default constructor so I cannot do the following

auto k = get_key();
auto& v = my_map[k];

I ended up writing a helper function

value_type& get_value(key_type& key)
{
    return std::get<0>(my_map.emplace(
                              std::piecewise_construct,
                              std::forward_as_tuple(key),
                              std::forward_as_tuple(args_to_construct_value)
                      ))->second;
}

but the performance was markedly worse (i.e. the value_type's constructor showed up in perf) than the following version.

value_type& get_value(key_type& key)
{
    auto it = my_map.find(key);
    if (it == my_map.end())
        return std::get<0>(my_map.emplace(
                                  std::piecewise_construct,
                                  std::forward_as_tuple(key),
                                  std::forward_as_tuple(args_to_construct_value)
                          ))->second;
    else
        return it->second;
}

I read from std::unordered_map::emplace object creation that emplace needs to construct the object in order to see if exists. But emplace is checking to see if this key value pair exists in the map before it returns.

Am I using emplace the wrong way? Is there a better pattern I should follow that:

Won't construct my value_type every lookup (as in my first method)
Won't do the check for to see if value_type exists in my map twice (as in my second method)

Thanks

Your code is unfortunately optimal for the standard library as it currently is.

The problem is that the emplace operation is designed to avoid copying, not to avoid unnecessary construction of the mapped type. In practical terms, what happens is that the implementation allocates and constructs a node, containing the map value_type i.e. pair<const Key, T>, and then hashes the key to determine whether the constructed node can be linked into the container; if this collides then the node is deleted.

As long as hash and equal_to are not too expensive, your code shouldn't do too much extra work.

An alternative is to use a custom allocator that intercepts 0-argument construction of your mapped type; the problem is that detecting such construction is pretty fiddly:

#include <unordered_map>
#include <iostream>

using Key = int;
struct D {
    D() = delete;
    D(D const&) = delete;
    D(D&&) = delete;
    D(std::string x, int y) { std::cout << "D(" << x << ", " << y << ")\n"; }
};
template<typename T>
struct A {
    using value_type = T;
    using pointer = T*;
    using const_pointer = T const*;
    using reference = T&;
    using const_reference = T const&;
    template<typename U> struct rebind { using other = A<U>; };
    value_type* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* c, std::size_t n) { std::allocator<T>().deallocate(c, n); }
    template<class C, class...Args> void construct(C* c, Args&&... args) { std::allocator<T>().construct(c, std::forward<Args>(args)...); }
    template<class C> void destroy(C* c) { std::allocator<T>().destroy(c); }

    std::string x; int y;
    A(std::string x, int y): x(std::move(x)), y(y) {}
    template<typename U> A(A<U> const& other): x(other.x), y(other.y) {}
    template<class C, class...A> void construct(C* c, std::piecewise_construct_t pc, std::tuple<A...> a, std::tuple<>) {
        ::new((void*)c)C(pc, a, std::tie(x, y)); }
};

int main() {
    using UM = std::unordered_map<Key, D, std::hash<Key>, std::equal_to<Key>, A<std::pair<const Key, D>>>;
    UM um(0, UM::hasher(), UM::key_equal(), UM::allocator_type("hello", 42));
    um[5];
}

You could use boost::optional<T> in order to be able to default construct the mapped type and then assign an initialized T to it later.

#include <cassert>
#include <unordered_map>
#include <boost/optional.hpp>

struct MappedType
{
  explicit MappedType(int) {}
};

int main()
{
  std::unordered_map<int, boost::optional<MappedType>> map;
  boost::optional<MappedType>& opt = map[0];
  assert(!opt.is_initialized());
  opt = MappedType(2);
  assert(opt.is_initialized());
  MappedType& v = opt.get();
}

Sophit

James, you've mostly answered your own question.

You're doing nothing wrong in either implementation. emplace simply does more work than find, especially when the key already exists in your unordered_map.

If your get_value helper function mostly receives duplicates, then calling emplace every time will cause a performance hot spot as you've observed.

来源：https://stackoverflow.com/questions/24209592/performance-of-emplace-is-worse-than-check-followed-by-emplace

标签

c++

performance

c++11

unordered-map

emplace