I have some big arrays passed from MATLAB to C++ (so I have to take them as they are) that need casting and permuting (row-major vs. column-major issues).
The arr
In terms of the algorithm you're using, I think you're always going to end up with three nested loops.
Two things to think about:
k * size_proj[1], i * size_proj[1], size_proj[0] * size_proj[1] (and j * size_proj[0] * size_proj[1] is used twice)
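As a rough illustration of hoisting those repeated products out of the inner loop, here is a minimal sketch. It is not your original code: the function name permute_and_cast, the double-to-float cast, and the exact layout (size_proj[2] slices, each stored column-major as size_proj[0] x size_proj[1] and written back row-major) are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post).
// Assumes src holds size_proj[2] slices, each a size_proj[0] x size_proj[1]
// block stored column-major (MATLAB order), and that the result should be
// the same slices stored row-major as float.
std::vector<float> permute_and_cast(const double* src, const std::size_t size_proj[3]) {
    const std::size_t n0 = size_proj[0];
    const std::size_t n1 = size_proj[1];
    const std::size_t n2 = size_proj[2];
    const std::size_t slice = n0 * n1;           // size_proj[0] * size_proj[1], computed once

    std::vector<float> dst(slice * n2);
    for (std::size_t j = 0; j < n2; ++j) {
        const std::size_t base = j * slice;      // j * size_proj[0] * size_proj[1], once per slice
        for (std::size_t i = 0; i < n0; ++i) {
            const std::size_t row = base + i * n1;   // i * size_proj[1], once per row
            for (std::size_t k = 0; k < n1; ++k) {
                // Column-major read (i + k * n0), row-major write (i * n1 + k).
                dst[row + k] = static_cast<float>(src[base + k * n0 + i]);
            }
        }
    }
    return dst;
}
```

An optimizing compiler may well do this hoisting for you, so it is worth measuring before and after rather than assuming a speedup.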