How to fast initialize with 1 really big array

匿名 (未验证) 提交于 2019-12-03 03:08:02

问题:

I have enermous array:

int* arr = new int[BIGNUMBER]; 

How to fullfil it with 1 number really fast. Normally I would do

for(int i = 0; i < BIGNUMBER; i++)     arr[i] = 1 

but I think it would take long.

Can I use memcpy or similar?

回答1:

You could try using the standard function std::uninitialized_fill_n:

#include <memory>  // ...  std::uninitialized_fill_n(arr, BIGNUMBER, 1); 

In any case, when it comes to performance, the rule is to always make measurements to back up your assumptions - especially if you are going to abandon a clear, simple design to embrace a more complex one because of an alleged performance improvement.

EDIT:

Notice that - as Benjamin Lindley mentioned in the comments - for trivial types std::uninitialized_fill_n does not bring any advantage over the more obvious std::fill_n. The advantage would exist for non-trivial types, since std::uninitialized_fill would allow you to allocate a memory region and then construct objects in place.

However, one should not fall into the trap of calling std::uninitialized_fill_n for a memory region that is not uninitialized. The following, for instance, would give undefined behavior:

my_object* v = new my_object[BIGNUMBER]; std::uninitialized_fill_n(my_object, BIGNUMBER, my_object(42)); // UB! 


回答2:

Alternative to a dynamic array is std::vector<int> with the constructor that accepts an initial value for each element:

std::vector<int> v(BIGNUMBER, 1); // 'BIGNUMBER' elements, all with value 1. 

as already stated, performance would need measured. This approach provides the additional benefit that the memory will be freed automatically.



回答3:

Some possible alternatives to Andy Prowl's std::uninitialized_fill_n() solution, just for posterity:

  • If you are lucky and your value is composed of all the same bytes, memset will do the trick.
  • Some implementations offer a 16-bit version memsetw, but that's not everywhere.
  • GCC has an extension for Designated Initializers that can fill ranges.
  • I've worked with a few ARM systems that had libraries that had accelerated CPU and DMA variants of word-fill, hand coded in assembly -- you might look and see if your platform offers any of this, if you aren't terribly concerned about portability.
  • Depending on your processor, even looking into loops around SIMD intrinsics may provide a boost; some of the SIMD units have load/store pipelines that are optimized for moving data around like this. On the other hand you may take severe penalties for moving between register types.

Last but definitely not least, to echo some of the commenters: you should test and see. Compilers tend to be pretty good at recognizing and optimizing patterns like this -- you probably are just trading off portability or readability with anything other than the simple loop or uninitialized_fill_n.

You may be interested in prior questions:



回答4:

Under Linux/x86 gcc with optimizations turned on, your code will compile to the following:

rax = arr rdi = BIGNUMBER  400690: c7 04 90 01 00 00 00    movl   $0x1,(%rax,%rdx,4) 

Move immediate int(1) to rax + rdx

400697: 48 83 c2 01             add    $0x1,%rdx 

Increment register rdx

40069b: 48 39 fa                cmp    %rdi,%rdx 

Cmp rdi to rdx

40069e: 75 f0                   jne    400690 <main+0xa0> 

If BIGNUMBER has been reached jump back to start.

It takes about 1 second per gigabyte on my machine, but most of that I bet is paging in physical memory to back the uninitialized allocation.



回答5:

Just unroll the loop by, say, 8 or 16 times. Functions like memcpy are fast, but they're really there for convenience, not to be faster than anything you could possibly write:

for (i = 0; i < BIGNUMBER-8; i += 8){   a[i+0] = 1; // this gets rid of the test against BIGNUMBER, and the increment, on 7 out of 8 items.   a[i+1] = 1; // the compiler should be able to see that a[i] is being calculated repeatedly here   ...   a[i+7] = 1; } for (; i < BIGNUMBER; i++) a[i] = 1; 

The compiler might be able to unroll the loop for you, but why take the chance?



回答6:

Use memset or memcpy

memset(arr, 0, BIGNUMER);  


回答7:

Try using memset?

memset(arr, 1, BIGNUMBER); 

http://www.cplusplus.com/reference/cstring/memset/



回答8:

memset(arr, 1, sizeof(int) * BIGNUMBER);



标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!