Efficiency of Multithreaded Loops
Greetings noble community,

I want to have the following loop:

    for(i = 0; i < MAX; i++)
        A[i] = B[i] + C[i];

This will run in parallel on a shared-memory quad-core computer using threads. The two alternatives below are being considered for the code to be executed by these threads, where tid is the id of the thread: 0, 1, 2 or 3 (for simplicity, assume MAX is a multiple of 4).

Option 1:

    for(i = tid; i < MAX; i += 4)
        A[i] = B[i] + C[i];

Option 2:

    for(i = tid*(MAX/4); i < (tid+1)*(MAX/4); i++)
        A[i] = B[i] + C[i];

My question is whether one is more efficient than the other, and why? The