Can somebody please demonstrate for me a more efficient Cartesian product algorithm than the one I am using currently (assuming there is one)? I've looked around SO and go
If cache locality (or local memory required to maintain the j's) is a problem, you can make your algorithm more cache-friendly by bisecting the input arrays recursively. Something like:
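    cartprod(is,istart,ilen, js,jstart,jlen) {
        if(ilen <= IMIN && jlen <= JMIN) { // base case: both ranges are small
            // iterate over the sub-ranges only, not over the whole arrays
            for(int i = istart; i < istart+ilen; i++) {
                for(int j = jstart; j < jstart+jlen; j++) {
                    // pair is[i] and js[j]
                }
            }
            return;
        }
        int ilen2 = ilen>>1; // size of the lower half of each range
        int jlen2 = jlen>>1;
        if(ilen > IMIN && jlen > JMIN) { // divide in 4
            // consecutive calls share one of their two halves
            cartprod(is,istart,ilen2, js,jstart,jlen2);
            cartprod(is,istart+ilen2,ilen-ilen2, js,jstart,jlen2);
            cartprod(is,istart+ilen2,ilen-ilen2, js,jstart+jlen2,jlen-jlen2);
            cartprod(is,istart,ilen2, js,jstart+jlen2,jlen-jlen2);
            return;
        }
        // other cases: only one range is still large, so divide that one in 2
        // (assumes IMIN >= 1 and JMIN >= 1, so both halves are non-empty)
        if(ilen > IMIN) {
            cartprod(is,istart,ilen2, js,jstart,jlen);
            cartprod(is,istart+ilen2,ilen-ilen2, js,jstart,jlen);
        } else {
            cartprod(is,istart,ilen, js,jstart,jlen2);
            cartprod(is,istart,ilen, js,jstart+jlen2,jlen-jlen2);
        }
    }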
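Note that this access pattern automatically makes fairly good use of every level of the cache hierarchy: the recursion eventually reaches sub-ranges small enough to fit in each level, without the code having to know any cache size. This kind of technique is called cache-oblivious algorithm design.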
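For anyone who wants to run the recursion, here is a minimal Python sketch of the same idea. A few things in it are my own assumptions rather than part of the pseudocode above: the `visit` callback and the `cartesian_pairs` wrapper are hypothetical names, the cutoffs of 4 are arbitrary, and the case where only one range is still large is handled by bisecting just that range. Python will not show the cache benefit (its lists hold pointers to boxed objects), so treat this only as a check that the blocked traversal visits every pair exactly once.

```python
from itertools import product

# Hypothetical cutoffs; both must be >= 1 so that every split makes progress.
IMIN = 4
JMIN = 4

def cartprod(xs, istart, ilen, ys, jstart, jlen, visit):
    """Call visit(x, y) once per pair from xs[istart:istart+ilen] and
    ys[jstart:jstart+jlen], in a recursively blocked order."""
    if ilen <= IMIN and jlen <= JMIN:
        # Base case: both sub-ranges are small, so pair them directly.
        for i in range(istart, istart + ilen):
            for j in range(jstart, jstart + jlen):
                visit(xs[i], ys[j])
        return

    ilen2 = ilen >> 1
    jlen2 = jlen >> 1
    if ilen > IMIN and jlen > JMIN:
        # Both ranges are large: split into four quadrants. Consecutive
        # calls share one of their two halves.
        cartprod(xs, istart, ilen2, ys, jstart, jlen2, visit)
        cartprod(xs, istart + ilen2, ilen - ilen2, ys, jstart, jlen2, visit)
        cartprod(xs, istart + ilen2, ilen - ilen2,
                 ys, jstart + jlen2, jlen - jlen2, visit)
        cartprod(xs, istart, ilen2, ys, jstart + jlen2, jlen - jlen2, visit)
    elif ilen > IMIN:
        # Assumed handling: only xs is still large, so bisect xs alone.
        cartprod(xs, istart, ilen2, ys, jstart, jlen, visit)
        cartprod(xs, istart + ilen2, ilen - ilen2, ys, jstart, jlen, visit)
    else:
        # Assumed handling: only ys is still large, so bisect ys alone.
        cartprod(xs, istart, ilen, ys, jstart, jlen2, visit)
        cartprod(xs, istart, ilen, ys, jstart + jlen2, jlen - jlen2, visit)

def cartesian_pairs(xs, ys):
    """Hypothetical convenience wrapper: collect all pairs into a list."""
    pairs = []
    cartprod(xs, 0, len(xs), ys, 0, len(ys),
             lambda x, y: pairs.append((x, y)))
    return pairs

if __name__ == "__main__":
    xs = list(range(10))
    ys = list(range(100, 110))
    # Same set of pairs as itertools.product, just in a different order.
    assert sorted(cartesian_pairs(xs, ys)) == sorted(product(xs, ys))
```

The pairs come out in blocked order rather than row by row, so compare against a plain nested loop by sorting both results, as the check at the bottom does.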