I like some features of D, but would be interested if they come with a runtime penalty?
To compare, I implemented a simple program that computes scalar products of m
dmd is the reference implementation of the language and thus most work is put into the frontend to fix bugs rather than optimizing the backend.
"in" is faster in your case cause you are using dynamic arrays which are reference types. With ref you introduce another level of indirection (which is normally used to alter the array itself and not only the contents).
Vectors are usually implemented with structs where const ref makes perfect sense. See smallptD vs. smallpt for a real-world example featuring loads of vector operations and randomness.
Note that 64-Bit can also make a difference. I once missed that on x64 gcc compiles 64-Bit code while dmd still defaults to 32 (will change when the 64-Bit codegen matures). There was a remarkable speedup with "dmd -m64 ...".