I've written two equivalent methods:
static bool F<T>(T a, T b) where T : class
{
    return a == b;
}
static bool F2(A a, A b)
{
    return a == b;
}
I've done performance analysis in a professional capacity several times in my career, and have a couple of observations.
I once worked on a compiler team that had a big, audacious performance goal. One build introduced an optimization that eliminated several instructions for a particular sequence. It should have improved performance, but instead the performance of one benchmark fell dramatically. We were running on hardware with a direct-mapped cache. It turned out that, with the new optimization in place, the code for the loop and the function called in the inner loop mapped to the same cache line, whereas with the previously generated code they did not. In other words, that benchmark was really a memory benchmark, entirely dependent on cache hits and misses, whereas the authors thought they had written a computational benchmark.
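To make the collision concrete, here is a minimal sketch of how a direct-mapped cache picks a line for an address. The cache geometry (64-byte lines, 512 lines) and both addresses are made up for illustration; they are not the numbers from the machine in the story. The point is only that two pieces of code whose addresses differ by a multiple of the cache size land on the same line and keep evicting each other.

```csharp
using System;

public static class DirectMappedCacheSketch
{
    // Hypothetical geometry, chosen only for illustration.
    const ulong LineSize = 64;    // bytes per cache line
    const ulong LineCount = 512;  // lines in the cache (32 KiB total)

    // In a direct-mapped cache every address has exactly one possible line.
    public static ulong LineIndex(ulong address) => (address / LineSize) % LineCount;

    public static void Main()
    {
        ulong loopAddress = 0x10040UL;                            // hypothetical address of the loop body
        ulong calleeAddress = loopAddress + LineSize * LineCount; // hypothetical callee, one cache size away

        // Same index means the two code blocks evict each other on every iteration.
        Console.WriteLine(LineIndex(loopAddress) == LineIndex(calleeAddress)); // prints "True"
    }
}
```

Shift either address by one line (64 bytes in this sketch) and the conflict disappears, which is why a change of a few instructions in generated code can swing a benchmark like this one so sharply.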