This question has been bothering me for some time. The possibilities I am considering are
Does a
In most cases memcpy will be the fastest, as it is the lowest-level option and may be implemented in hand-tuned machine code on a given platform. (However, if your array contains objects that are not trivially copyable, memcpy may not do the correct thing, so it may be safer to stick with std::copy.)
However, it all depends on how well the standard library is implemented on the given platform. As the standard does not say how fast operations must be, there is no way to know in a “portable” sense what will be fastest.
Profiling your application will show which method is fastest on a given platform, but it will only tell you about the platform you tested on.
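If you want a quick measurement before reaching for a real profiler, something along these lines may be enough. This is only a rough sketch: `average_ns`, `CopyTimings` and `compare_copies` are names invented here, and an optimizing compiler may simplify or remove copies whose results are never read, so treat the numbers as indicative at best.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: runs copy_fn `repeats` times and returns the
// average wall-clock time per run in nanoseconds.
template <typename F>
double average_ns(F copy_fn, int repeats) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i) {
        copy_fn();
    }
    const auto stop = clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / repeats;
}

// Hypothetical result type holding one timing per copy method.
struct CopyTimings {
    double memcpy_ns;
    double std_copy_ns;
};

// Times memcpy against std::copy on an int array of n elements.
CopyTimings compare_copies(std::size_t n, int repeats) {
    if (n == 0 || repeats <= 0) {
        return CopyTimings{0.0, 0.0};  // nothing to measure
    }
    std::vector<int> src(n, 42), dst(n, 0);
    CopyTimings t;
    t.memcpy_ns = average_ns(
        [&] { std::memcpy(dst.data(), src.data(), n * sizeof(int)); }, repeats);
    t.std_copy_ns = average_ns(
        [&] { std::copy(src.begin(), src.end(), dst.begin()); }, repeats);
    return t;
}
```

Calling `compare_copies(1 << 20, 100)` and printing both fields gives one data point for one compiler, one set of flags and one machine, which is exactly the limitation described above.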
However, when you profile your application you will most likely find that the issues are in your design rather than your choice of array copy method. (E.g. why do you need to copy large arrays so much?)