For example:
a) int [x][y][z]
vs
b) int[x*y*z]
On current CPUs, non-cached memory access is hundreds of times slower than arithmetic (see this presentation and read What every programmer should know about memory). Option a) will result in about 3 memory lookups, whereas option b) will result in about 1 memory lookup. Also, the CPU's prefetching algorithms might not work as well. So option b) can be faster in some situations (when it's a hot spot and the array does not fit into the CPU's cache). How much faster? That will depend on the application.
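To make the lookup counts concrete, here is a minimal sketch of the two layouts; the sizes X, Y, Z and the index order are my own illustration, not taken from the question:

    // Minimal sketch of the two layouts; sizes and names are illustrative only.
    public class ArrayLayouts {
        static final int X = 100, Y = 100, Z = 100;

        public static void main(String[] args) {
            int[][][] a = new int[X][Y][Z]; // option a): array of arrays of arrays
            int[] b = new int[X * Y * Z];   // option b): one flat array

            int x = 1, y = 2, z = 3;

            // a) three dependent memory lookups: a -> a[x] -> a[x][y] -> the element
            int valueA = a[x][y][z];

            // b) some index arithmetic, then a single memory lookup
            int valueB = b[(x * Y + y) * Z + z];

            System.out.println(valueA + " " + valueB);
        }
    }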
Personally, I would first use option a), because it results in simpler code. If a profiler then shows that array access to be a bottleneck, I would convert it to option b), with a pair of helper methods for reading and writing array values (that way the messy index code stays restricted to those two methods).
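As a rough sketch of what I mean by those helper methods (the class name and the row-major index formula here are just one way to do it, not necessarily what my benchmark code uses):

    // A sketch of confining the messy index code to two helper methods; the class name
    // and the row-major index formula are illustrative, not copied from the benchmark.
    public class Int3DArray {
        private final int sizeY, sizeZ;
        private final int[] values;

        public Int3DArray(int sizeX, int sizeY, int sizeZ) {
            this.sizeY = sizeY;
            this.sizeZ = sizeZ;
            this.values = new int[sizeX * sizeY * sizeZ];
        }

        // All index arithmetic lives in these two methods.
        public int get(int x, int y, int z) {
            return values[(x * sizeY + y) * sizeZ + z];
        }

        public void set(int x, int y, int z, int value) {
            values[(x * sizeY + y) * sizeZ + z] = value;
        }
    }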
I made a benchmark comparing 3-dimensional int arrays (the "Multi" column) to equivalent 1-dimensional int arrays (the "Single" column). The code is here and the tests are here. I ran it on 64-bit jdk1.6.0_18, Windows 7 x64, Core 2 Quad Q6600 @ 3.0 GHz, 4 GB DDR2, with the JVM options -server -Xmx3G -verbose:gc -XX:+PrintCompilation (I have removed the debug output from the following results). The results were:
Out of 20 repeats, the minimum time in milliseconds is reported.
Array dimensions: 100x100x100 (1000000)
                  Multi    Single
    Seq Write         1         1
    Seq Read          1         1
    Random Read      99        90   (of which generating random numbers 59 ms)

Array dimensions: 200x200x200 (8000000)
                  Multi    Single
    Seq Write        14        13
    Seq Read         11         8
    Random Read    1482      1239   (of which generating random numbers 474 ms)

Array dimensions: 300x300x300 (27000000)
                  Multi    Single
    Seq Write        53        46
    Seq Read         34        24
    Random Read    5915      4418   (of which generating random numbers 1557 ms)

Array dimensions: 400x400x400 (64000000)
                  Multi    Single
    Seq Write       123       111
    Seq Read         71        55
    Random Read   16326     11144   (of which generating random numbers 3693 ms)
This shows the 1-dimensional array to be faster, though the differences are so small that for 99% of applications they won't be noticeable.
I also did some measurements to estimate the overhead of generating the random numbers in the Random Read benchmark, by replacing preventOptimizingAway += array.get(x, y, z); with preventOptimizingAway += x * y * z; and adding the measurements to the above results table by hand. Generating the random numbers takes 1/3 or less of the total time of the Random Read benchmark, so memory access dominates the benchmark as expected. It would be interesting to repeat this benchmark with arrays of 4 or more dimensions. That would probably make the speed difference bigger, because the multidimensional array's topmost levels will fit into the CPU's cache, and only the lower levels will require a memory lookup.
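For reference, here is roughly what the Random Read measurement and the overhead estimate look like; the loop structure and the use of java.util.Random are a sketch, only the two preventOptimizingAway lines match the benchmark described above, and Int3DArray is the hypothetical wrapper sketched earlier:

    // Sketch of the Random Read measurement; the loop structure and Random usage are
    // assumptions, only the preventOptimizingAway lines come from the actual benchmark.
    import java.util.Random;

    public class RandomReadSketch {
        static long preventOptimizingAway;

        static long randomRead(Int3DArray array, int sizeX, int sizeY, int sizeZ, int reads) {
            Random rnd = new Random(42);
            long start = System.currentTimeMillis();
            for (int i = 0; i < reads; i++) {
                int x = rnd.nextInt(sizeX);
                int y = rnd.nextInt(sizeY);
                int z = rnd.nextInt(sizeZ);
                preventOptimizingAway += array.get(x, y, z); // the measured variant
                // Overhead estimate: replace the line above with
                // preventOptimizingAway += x * y * z;
                // to time only the random number generation.
            }
            return System.currentTimeMillis() - start;
        }
    }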