I have a C++ snippet below with a run-time for loop:
for (int i = 0; i < I; i++)
    for (int j = 0; j < J; j++)
        A( row(i,j), column(i,j)
I'm not a fan of template meta-programming, so you may want to take this answer with a pinch of salt. But before I invested any time in this problem, I'd ask myself the following:
Is the for loop a bottleneck?
In many compilers/CPUs, the "looped" version can give better performance due to cache effects: unrolled code is larger and can put more pressure on the instruction cache.
Remember: Measure first, optimise later - if at all.
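
As a rough illustration of "measure first", here is a minimal timing sketch using std::chrono. The function fill_looped is a hypothetical stand-in for the loop in the question (the real A, row and column are not shown there), and average_ns_per_call is a name made up for this example. It assumes a row-major matrix stored in a flat std::vector.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the loop body in the question:
// fills a rows x cols matrix stored row-major in a flat vector.
inline void fill_looped(std::vector<double>& a, int rows, int cols) {
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            a[static_cast<std::size_t>(i) * cols + j] = static_cast<double>(i + j);
}

// Runs fill_looped `repeats` times and returns the average
// wall-clock time per call in nanoseconds.
inline double average_ns_per_call(int rows, int cols, int repeats) {
    assert(rows > 0 && cols > 0 && repeats > 0);
    std::vector<double> a(static_cast<std::size_t>(rows) * cols);

    double checksum = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        fill_looped(a, rows, cols);
        checksum += a[a.size() - 1];  // read a result so the work is used
    }
    const auto stop = std::chrono::steady_clock::now();

    // Store the checksum in a volatile so the compiler cannot drop it.
    volatile double sink = checksum;
    (void)sink;

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Calling average_ns_per_call(I, J, 100000) from main for the looped version, and an equivalent function for a hand-unrolled version, built with the same optimisation flags, gives a first comparison. A profiler on the real program is still the better guide to whether this loop matters at all.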