Best block size value for block matrix-matrix multiplication
Question: I want to do block matrix-matrix multiplication with the following C code. In this approach, blocks of size BLOCK_SIZE are loaded into the fastest cache in order to reduce memory traffic during the calculation.

    void bMMikj(double **A, double **B, double **C, int m, int n, int p, int BLOCK_SIZE) {
        int i, j, jj, k, kk;
        register double jjTempMin = 0.0, kkTempMin = 0.0;
        for (jj = 0; jj < n; jj += BLOCK_SIZE) {
            jjTempMin = min(jj + BLOCK_SIZE, n);
            for (kk = 0; kk < n; kk += BLOCK_SIZE) {
                kkTempMin = min(kk
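
The blocking idea described in the question can be sketched as a complete, self-contained function. This is only an illustration under stated assumptions, not the asker's original routine: it uses flat row-major arrays instead of `double **`, it assumes A is m x n, B is n x p and C is m x p, and the names `blocked_matmul` and `min_int` are hypothetical. The block end indices are kept as `int`, since they only serve as loop bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the smaller of two ints. */
static int min_int(int a, int b) { return a < b ? a : b; }

/*
 * Computes C += A * B with blocking over the j and k dimensions.
 * Assumptions (not taken from the original post):
 *   - A is m x n, B is n x p, C is m x p, all stored row-major in flat arrays;
 *   - the caller has already initialised C (for example to zero).
 */
static void blocked_matmul(const double *A, const double *B, double *C,
                           int m, int n, int p, int block_size)
{
    assert(block_size > 0);

    for (int jj = 0; jj < p; jj += block_size) {
        int j_end = min_int(jj + block_size, p);      /* end of the column block */
        for (int kk = 0; kk < n; kk += block_size) {
            int k_end = min_int(kk + block_size, n);  /* end of the inner-dimension block */
            /* ikj order inside the block: the innermost loop walks
               one row of B and one row of C contiguously. */
            for (int i = 0; i < m; i++) {
                for (int k = kk; k < k_end; k++) {
                    double a_ik = A[(size_t)i * n + k];
                    for (int j = jj; j < j_end; j++) {
                        C[(size_t)i * p + j] += a_ik * B[(size_t)k * p + j];
                    }
                }
            }
        }
    }
}
```

With this layout, the data reused across the `i` loop is one `block_size` x `block_size` tile of B, so a plausible starting point is a block size whose tile of doubles (`block_size * block_size * 8` bytes) fits comfortably in the cache level being targeted. The best value still has to be found by timing on the actual machine.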