How much memory do I need to have for 100 million records

Submitted by 痞子三分冷 on 2019-12-11 09:36:58

Question


How much memory do I need to load 100 million records into memory? Suppose each record needs 7 bytes. Here is my calculation:

each record = <int> <short> <byte>
4  +  2  + 1 = 7 bytes

needed memory in GB = 7 * 100 * 1,000,000 / 1,000,000,000 = 0.7 GB

Do you see any problem with this calculation?


Answer 1:


With 100,000,000 records, you need to allow for overhead. Exactly what and how much overhead you'll have will depend on the language.

In C/C++, for example, fields in a structure or class are aligned onto specific boundaries. Details may vary depending on the compiler, but in general ints must begin at an address that is a multiple of 4, shorts at a multiple of 2, and chars can begin anywhere.

So assuming that your 4+2+1 means an int, a short, and a char, then if you arrange them in that order, the fields themselves take 7 bytes. But the next instance of the structure must begin at a 4-byte boundary, so you'll have 1 pad byte at the end of each record, and each record effectively takes 8 bytes. A struct as a whole is normally aligned to the strictest alignment among its members (4 bytes here), and memory returned by malloc is typically aligned to 8 or 16 bytes, though in this case that doesn't matter.
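To see the padding without writing any C, here is a minimal sketch using Python's ctypes module, which lays out a Structure the way the platform's C compiler would. The field names (id, kind, flag) are made up for illustration, and the sizes in the comments are what mainstream platforms typically report, not a guarantee.

```python
import ctypes


class Record(ctypes.Structure):
    # Hypothetical field names; only the types come from the question.
    _fields_ = [
        ("id", ctypes.c_int32),    # 4 bytes, starts on a multiple of 4
        ("kind", ctypes.c_int16),  # 2 bytes, starts on a multiple of 2
        ("flag", ctypes.c_int8),   # 1 byte, can start anywhere
    ]


# Sum of the raw field sizes: the 4 + 2 + 1 from the question.
payload = sum(ctypes.sizeof(ctype) for _, ctype in Record._fields_)

# Size of one record including any trailing pad (typically 8, not 7).
record_size = ctypes.sizeof(Record)

print("payload bytes:", payload)
print("sizeof(Record):", record_size)
print("alignment:", ctypes.alignment(Record))
print("offsets:", Record.id.offset, Record.kind.offset, Record.flag.offset)
print("100 million records:", record_size * 100_000_000, "bytes")
```

On such a platform the array of 100 million records comes to 800,000,000 bytes rather than 700,000,000.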

Every time you allocate memory there's some overhead for the allocation block. The allocator has to be able to keep track of how much memory was allocated and sometimes where the next block is. If you allocate 100,000,000 records as one big "new" or "malloc", then this overhead should be trivial. But if you allocate each one individually, then each record will have the overhead. Exactly how much that is depends on the compiler and runtime library, but on one system I used I think it was 8 bytes per allocation. If that's the case, then here you'd need 16 bytes for each record: 8 bytes for block header, 7 for data, 1 for pad. So it could easily take double what you expect.
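The arithmetic for the two allocation strategies can be sketched as follows. This is only an estimate under assumed numbers: the 8-byte header is the figure recalled above for one system, the 4-byte alignment comes from the int field, and bytes_needed is a hypothetical helper, not a real API. Real allocators vary and may round sizes up further.

```python
def bytes_needed(count, payload=7, alignment=4, header=8, one_block=True):
    """Estimate memory for `count` records (all sizes are assumptions)."""
    # Round the payload up to the next multiple of the alignment (7 -> 8).
    padded = -(-payload // alignment) * alignment
    if one_block:
        # One big allocation: a single header for the whole array.
        return count * padded + header
    # One allocation per record: every record pays for its own header.
    return count * (padded + header)


n = 100_000_000
print(f"one block:      {bytes_needed(n):,} bytes")
print(f"per-record new: {bytes_needed(n, one_block=False):,} bytes")
```

With these assumptions the single block needs about 0.8 GB and the per-record allocations about 1.6 GB.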

Other languages will have different overhead. The easiest thing to do is probably to find out empirically: Look up what the system call is to find out how much memory you're using, then check this value, allocate a million instances, check it again and see the difference.
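As one example of measuring empirically, here is a rough sketch in Python using the standard tracemalloc module instead of an operating system call. The Record class and its field names are hypothetical, and the example uses 100,000 instances rather than a million just to keep it quick. It only counts memory allocated through Python's allocator, so treat the result as an approximation.

```python
import tracemalloc


class Record:
    # __slots__ avoids a per-instance dict, the leanest plain Python object.
    __slots__ = ("id", "kind", "flag")

    def __init__(self, id, kind, flag):
        self.id = id
        self.kind = kind
        self.flag = flag


def measure_per_record(count=100_000):
    """Return the average number of traced bytes per record."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    # Keep the list alive so the records are still allocated when we measure.
    records = [Record(i, i % 1000, i % 100) for i in range(count)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) / count


print(f"about {measure_per_record():.1f} bytes per record")
```

The figure will be many times larger than 7 bytes, because each instance carries an object header and holds references to separately allocated integer objects.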




Answer 2:


If you really need just 7 bytes per structure, then you are almost right.

For memory measurements, we usually use a factor of 1024, so you would need

700,000,000 / 1024² ≈ 667.57 MiB, or 700,000,000 / 1024³ ≈ 0.652 GiB
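The unit conversion is easy to check in a few lines. This is just the arithmetic from the question and this answer, assuming exactly 7 bytes per record with no padding or allocator overhead; the constant names are made up.

```python
RECORD_BYTES = 4 + 2 + 1       # int + short + byte
RECORD_COUNT = 100_000_000

total = RECORD_BYTES * RECORD_COUNT

gb = total / 1000**3    # decimal gigabytes
mib = total / 1024**2   # binary mebibytes
gib = total / 1024**3   # binary gibibytes

print(f"{total:,} bytes = {gb:.3f} GB = {mib:.2f} MiB = {gib:.3f} GiB")
# prints: 700,000,000 bytes = 0.700 GB = 667.57 MiB = 0.652 GiB
```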


Source: https://stackoverflow.com/questions/14614689/how-much-memory-do-i-need-to-have-for-100-million-records
