Question
I have an issue with a C++ program, and I think it is a memory problem. My program creates some enormous std::vector objects (I use reserve to allocate the memory up front). With a vector size of 1,000,000 it works, but if I increase the size to about ten million, the program freezes my PC and I can do nothing except wait for a crash (or for the program to finish, if I'm lucky). My vector contains a structure called Point, which in turn contains a vector of double.
I used valgrind to check for a memory leak, but according to it there is no problem. Is using a vector of objects inadvisable? Are there system parameters I should check? Or is the vector simply too big for the computer?
What do you think about this?
Answer 1:
Disclaimer
Note that this answer assumes a few things about your machine; the exact memory usage and error potential depend on your environment. And of course it is even easier to crash when you don't compute on 2d-Points but on, e.g., 4d-Points, which are common in computer graphics, or on even larger Points for other numeric purposes.
About your problem
That's quite a lot of memory to allocate:
#include <iostream>
#include <vector>

struct Point {
    std::vector<double> coords;
};

int main () {
    std::cout << sizeof(Point) << std::endl;
}
This prints 12 (on a 32-bit build; expect 24 on most 64-bit builds), which is the size in bytes of an empty Point. If you have 2-dimensional points, add another 2*sizeof(double) = 16 bytes of heap storage to that per element, i.e. you now have a total of at least 28 bytes per Point, before any allocator overhead.
With tens of millions of elements, you request hundreds of millions of bytes of data, e.g. for 20 million elements, you request at least 560 million bytes. While this does not exceed the maximum index into an std::vector, it is possible that the OS does not have that much contiguous memory free for you.
Also, your vector's memory needs to be copied every time the vector has to grow beyond its capacity. This happens for example when you push_back, so when you already have a 400 MiB vector, upon the next push_back you might have your old version of the vector plus the newly allocated 400 MiB * X memory (where X is the growth factor of your standard library implementation), so you may easily exceed 1000 MiB temporarily, et cetera.
Optimizations (high level; preferred)
Do you actually need to store all of the data the whole time? Can you use a similar algorithm which does not require so much storage? Can you refactor your code so that storage is reduced? Can you swap some data out to disk when you know it will take some time until you need it again?
Optimizations (low level)
If you know the number of elements before creating your outer vector, use the std::vector constructor that takes an initial size:
std::vector<Foo> foo(12); // initialized to hold 12 elements
Of course you can optimize a lot for memory; e.g. if you know you always only have 2d-Points, just have two doubles as members: that is 16 bytes per Point, with no separate heap allocation. When you do not really need the precision of double, use float: 16 bytes -> 8 bytes. That is less than half of what the vector-based Point costs:
// struct Point { std::vector<double> coords; }; <-- old
struct Point { float x, y; }; // <-- new
If this is still not enough, an ad-hoc solution could be std::deque, or another non-contiguous container: no temporary memory "doubling" because no resizing is needed, and no need for the OS to find you one such contiguous block of memory.
You can also use compression mechanisms, or indexed data, or fixed point numbers. But it depends on your exact circumstances.
struct Point { signed char x, y; }; // <-- or even this? examine a proper type
struct Point { short x_index, y_index; };
Answer 2:
Without seeing your code, this is just speculation, but I suspect it is in large part due to your attempt to allocate a massive amount of contiguous memory. std::vector is guaranteed to store its elements in contiguous memory, so if you try to allocate a large amount of space, the operating system has to try to find a block of memory that large that it can use. This may not be a problem for 2 MB, but if you are suddenly trying to allocate 200 MB or 2 GB of contiguous memory ...
Additionally, any time you add a new element to the vector and it is forced to resize, all of the existing elements must be copied (or moved) into the newly allocated space. If you have 9 million elements and adding the 9,000,001st element requires a resize, that is 9 million elements that have to be moved. As your vector gets larger, this copy takes longer.
Try using std::deque instead. It will basically allocate pages (each of which is contiguous), but each page can be allocated wherever it fits.
Source: https://stackoverflow.com/questions/20475295/freeze-in-c-program-using-huge-vector