To expand on the other answers that this is due to cache-misses for the second program, and assuming that you are using Linux, *BSD, or MacOS, then Cachegrind may give you enlightenment. It's part of valgrind, and will run your program, without changes, and print the cache usage statistics. It does run very slowly though.
http://valgrind.org/docs/manual/cg-manual.html