Vector storage in C++

Submitted by 核能气质少年 on 2019-12-20 11:01:04

Question


I wish to store a large vector of d-dimensional points (d fixed and small: <10).

If I define a Point as vector<int>, I think a vector<Point> would store in each position a pointer to a Point.

But if I define a Point as a fixed-size object such as std::tuple<int,int,...,int> or std::array<int, d>, will the program store all points in contiguous memory, or will the additional level of indirection remain?

If arrays do avoid the additional indirection, could this have a large impact on performance (by exploiting cache locality) while scanning the vector<Point>?


Answer 1:


If you define your Point as having contiguous data storage (e.g. struct Point { int a; int b; int c; } or using std::array), then std::vector<Point> will store the Points in contiguous memory locations, so your memory layout will be:

p0.a, p0.b, p0.c, p1.a, p1.b, p1.c, ..., p(N-1).a, p(N-1).b, p(N-1).c

On the other hand, if you define Point as a vector<int>, then a vector<Point> has the layout of vector<vector<int>>, which is not contiguous, as vector stores pointers to dynamically allocated memory. So you have contiguity for single Points, but not for the whole structure.

The first layout is typically much more efficient to scan than the second, because modern CPUs cache and prefetch contiguous memory well.
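The difference between the two layouts can be observed directly by checking where the coordinates live. The following is a minimal sketch, not part of the original answer: the helper names (inside_buffer, coordinates_stored_inline, check_layouts) and the dimension 3 are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if address `p` lies inside the buffer owned by `outer`.
template <typename T>
bool inside_buffer(const std::vector<T>& outer, const void* p) {
    const auto begin = reinterpret_cast<std::uintptr_t>(outer.data());
    const auto end = begin + outer.size() * sizeof(T);
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    return addr >= begin && addr < end;
}

// True if every point keeps its coordinates inside the outer vector's own
// buffer, i.e. there is no second level of indirection.
template <typename Point>
bool coordinates_stored_inline(const std::vector<Point>& points) {
    for (const Point& p : points) {
        if (!inside_buffer(points, p.data())) return false;
    }
    return true;
}

void check_layouts() {
    // Fixed-size points: the ints sit directly in the outer buffer.
    std::vector<std::array<int, 3>> flat(4, std::array<int, 3>{{1, 2, 3}});
    assert(coordinates_stored_inline(flat));

    // Dynamic points: each inner vector owns a separate heap block.
    std::vector<std::vector<int>> nested(4, std::vector<int>{1, 2, 3});
    assert(!coordinates_stored_inline(nested));
}
```

Calling check_layouts() passes both assertions: the std::array coordinates are found inside the outer buffer, while the std::vector<int> coordinates are not.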




Answer 2:


vector will store whatever your type contains in contiguous memory. So yes, if that's an array or a tuple, or probably even better, a custom type, it will avoid indirection.

Performance-wise, as always, you have to measure it. Don't speculate. At least as far as scanning is concerned.

However, there will definitely be a huge performance gain when you create those points in the first place, because you'll avoid unnecessary memory allocations for every vector that stores a point. And memory allocations are usually very expensive in C++.
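As a starting point for such a measurement, here is a rough sketch that scans both layouts and times each scan with std::chrono. It is an illustration under assumptions, not a rigorous benchmark: the names (kDim, ScanResult, timed_scan, compare_layouts), the dimension 3, and the fill pattern are all made up for this example, and a single timed run is noisy.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

constexpr std::size_t kDim = 3;  // hypothetical dimension, d < 10

struct ScanResult {
    long long sum;     // checksum of all coordinates
    long long micros;  // elapsed wall-clock time in microseconds
};

// Sums all coordinates of all points and times the scan.
template <typename Points>
ScanResult timed_scan(const Points& points) {
    const auto start = std::chrono::steady_clock::now();
    long long sum = 0;
    for (const auto& p : points) {
        for (int c : p) sum += c;
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto us =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    return ScanResult{sum, static_cast<long long>(us.count())};
}

// Builds n fixed-size points (one allocation in total) and scans them.
ScanResult scan_array_points(std::size_t n) {
    std::vector<std::array<int, kDim>> points(n);
    for (std::size_t i = 0; i < n; ++i) points[i].fill(static_cast<int>(i % 7));
    return timed_scan(points);
}

// Builds n dynamic points (one extra allocation per point) and scans them.
ScanResult scan_vector_points(std::size_t n) {
    std::vector<std::vector<int>> points(n);
    for (std::size_t i = 0; i < n; ++i) points[i].assign(kDim, static_cast<int>(i % 7));
    return timed_scan(points);
}

void compare_layouts(std::size_t n) {
    const ScanResult flat = scan_array_points(n);
    const ScanResult nested = scan_vector_points(n);
    assert(flat.sum == nested.sum);  // same data, so same checksum
    std::printf("array points:  %lld us\n", flat.micros);
    std::printf("vector points: %lld us\n", nested.micros);
}
```

For meaningful numbers, build with optimizations enabled, use a large n, and repeat the measurement several times. The timings depend on the machine and the allocator.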




Answer 3:


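For the stated range of d (<10), defining Point as vector<int> will almost double the total memory used by std::vector<Point> and will bring almost no advantage. Each inner vector carries its own bookkeeping (typically a few pointers) in addition to a separately allocated block for the coordinates.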
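The overhead can be estimated with sizeof. The sketch below is an illustration only: the function names are hypothetical, and the figure for the vector case is a lower bound, since it ignores allocator bookkeeping and any unused capacity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Bytes per point when Point = std::array<int, D>.
template <std::size_t D>
constexpr std::size_t array_point_bytes() {
    return sizeof(std::array<int, D>);
}

// Lower bound on bytes per point when Point = std::vector<int>:
// the vector object itself plus the D ints in its heap block.
template <std::size_t D>
constexpr std::size_t vector_point_min_bytes() {
    return sizeof(std::vector<int>) + D * sizeof(int);
}

// Ratio of the two footprints for dimension D.
template <std::size_t D>
double overhead_ratio() {
    static_assert(D > 0, "a point needs at least one coordinate");
    return static_cast<double>(vector_point_min_bytes<D>()) /
           static_cast<double>(array_point_bytes<D>());
}

void report_overhead() {
    assert(overhead_ratio<6>() > 1.0);  // the vector layout is never smaller
    std::printf("d=3: x%.2f\n", overhead_ratio<3>());
    std::printf("d=6: x%.2f\n", overhead_ratio<6>());
    std::printf("d=9: x%.2f\n", overhead_ratio<9>());
}
```

On common 64-bit implementations sizeof(std::vector<int>) is typically 24 bytes, which equals the payload of six ints, so the ratio is around 2 for d near 6. The exact values are implementation-specific.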




Answer 4:


As the dimension is fixed, I'd suggest going with a class template that takes the dimension as a template parameter. Something like this:

#include <cstddef>
#include <type_traits>

template <typename R, std::size_t N> class ndpoint
{
  static_assert(std::is_arithmetic<R>::value, "R must be an arithmetic type");

public:
  using elem_t = R;

  static constexpr std::size_t DIM = N;

  // the coordinates are zero-initialized (see elems_ below)
  ndpoint() = default;

  // member-wise copy is all that is needed for a plain array of R
  ndpoint(const ndpoint& other) = default;
  ndpoint& operator=(const ndpoint& other) = default;

  // construct from individual coordinates, e.g. ndpoint<int, 3> p(1, 2, 3);
  template <typename... coordt> ndpoint(coordt... x) : elems_ {static_cast<R>(x)...} {
  }

  // construct from any source which defines the [](size_t i) operator
  template <typename PointType> ndpoint(const PointType& other) : elems_ {} {
    *this = other;
  }

  // this will allow you to assign from any source which defines the
  // [](size_t i) operator; the source must provide at least N elements
  template <typename PointT> ndpoint& operator=(const PointT& other) {
    for(std::size_t i=0; i<N; i++) {
      this->elems_[i]=static_cast<R>(other[i]);
    }
    return *this;  // the original version was missing this return
  }

  const R& operator[](std::size_t i) const { return this->elems_[i]; }

  R& operator[](std::size_t i) { return this->elems_[i]; }

private:
  R elems_[N] = {};  // stored inline, so a vector of ndpoint is one flat block
};

Then use a std::vector<ndpoint<...>> for a collection of points for best performance.




Answer 5:


The only way to be 100% sure how your data is laid out is to implement your own memory handling.

However, there are many libraries that implement matrices and matrix operations that you can check out. Some have documented information about contiguous memory, reshape etc. (e.g. OpenCV Mat).

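Note that in general you cannot assume that the coordinates of consecutive Points are packed back to back. The elements of an array are stored contiguously, but the compiler may add padding inside or at the end of each element for alignment, and separately allocated blocks carry allocator headers. For example, consider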

struct Point {
   char x,y,z;
};

Point array_of_points[3];

Now if you try to 'reshape', that is, iterate across Point elements relying on the last member of one point being immediately followed by the first member of the next, then it may fail, because the language does not guarantee that the following holds:

(char *)(&array_of_points[0].z) + 1 == (char *)(&array_of_points[1].x)
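Whether a given struct is padded can be checked rather than guessed. The sketch below is an illustration under assumptions: the helper names (element_stride, has_no_padding, check_packing) and the Mixed struct are hypothetical, and the sizes it reports are ABI-specific.

```cpp
#include <cassert>
#include <cstddef>

struct Point { char x, y, z; };         // the struct from the example above
struct Mixed { char tag; int value; };  // hypothetical struct that is usually padded

// Byte distance between two consecutive elements of an array of T.
template <typename T>
std::size_t element_stride() {
    T items[2] = {};
    const char* first = reinterpret_cast<const char*>(&items[0]);
    const char* second = reinterpret_cast<const char*>(&items[1]);
    return static_cast<std::size_t>(second - first);
}

// True if T is exactly as large as the sum of its members' sizes,
// which the caller passes in as member_bytes.
template <typename T>
constexpr bool has_no_padding(std::size_t member_bytes) {
    return sizeof(T) == member_bytes;
}

void check_packing() {
    // Array elements are always exactly sizeof(T) apart.
    assert(element_stride<Point>() == sizeof(Point));
    assert(element_stride<Mixed>() == sizeof(Mixed));

    // Mixed is larger than its members wherever int needs alignment > 1.
    assert(!has_no_padding<Mixed>(sizeof(char) + sizeof(int)));
}
```

A check such as has_no_padding<Point>(3 * sizeof(char)) can be placed in a static_assert next to any code that treats an array of points as one flat sequence of coordinates, so that a padded layout fails at compile time instead of silently reading padding bytes.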


Source: https://stackoverflow.com/questions/40302857/vector-storage-in-c
