When writing a Java program, do I have influence on how the CPU will utilize its cache to store my data? For example, if I have an array that is accessed a lot, does it help
If the data you're crunching is primarily or wholly made up of primitives (eg. in numeric problems), I would advise the following.
Allocate a flat structure of fixed size arrays-of-primitives at initialisation-time, and make sure the data therein is periodically compacted/defragmented (0->n where n is the smallest max index possible given your element count), to be iterated over using a for-loop. This is the only way to guarantee contiguous allocation in Java, and compaction further serves to improves locality of reference. Compaction is beneficial, as it reduces the need to iterate over unused elements, reducing the number of conditionals: As the for loop iterates, the termination occurs earlier, and less iteration = less movement through the heap = fewer chances for a cache miss. While compaction creates an overhead in and of itself, this may be done only periodically (with respect to your primary areas of processing) if you so choose.
Even better, you can interleave values in these pre-allocated arrays. For instance, if you are representing spatial transforms for many thousands of entities in 2D space, and are processing the equations of motion for each such, you might have a tight loop like
int axIdx, ayIdx, vxIdx, vyIdx, xIdx, yIdx;
//Acceleration, velocity, and displacement for each
//of x and y totals 6 elements per entity.
for (axIdx = 0; axIdx < array.length; axIdx += 6)
{
ayIdx = axIdx+1;
vxIdx = axIdx+2;
vyIdx = axIdx+3;
xIdx = axIdx+4;
yIdx = axIdx+5;
//velocity1 = velocity0 + acceleration
array[vxIdx] += array[axIdx];
array[vyIdx] += array[ayIdx];
//displacement1 = displacement0 + velocity
array[xIdx] += array[vxIdx];
array[yIdx] += array[vxIdx];
}
This example ignores such issues as rendering of those entities using their associated (x,y)... rendering always requires non-primitives (thus, references/pointers). If you do need such object instances, then you can no longer guarantee locality of reference, and will likely be jumping around all over the heap. So if you can split your code into sections where you have primitive-intensive processing as shown above, then this approach will help you a lot. For games at least, AI, dynamic terrain, and physics can be some of the most processor-intensives aspect, and are all numeric, so this approach can be very beneficial.