Suppose I have the following serial C code:
int add(int* a, int* b, int n) { for(i=0; i
I am assuming you are working with an n-by-n array stored in row-major order. Try the following:
__global__ void calc(int *A, int *B, int n) { int i= blockIdx.x * blockDim.x + threadIdx.x; int j= blockIdx.y * blockDim.y + threadIdx.y; if (i
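Since both snippets above are cut off, here is a rough host-side sketch in plain C of the index mapping the kernel relies on. It rests on assumptions that are not in the original: that the serial loop adds B into A element by element (the name `add` suggests this, but the loop body is missing), and that `i` is the row and `j` the column. The helpers `idx`, `add_serial`, `add_one_thread` and `add_emulated` are hypothetical names made up for illustration. The nested loops in `add_emulated` only stand in for `blockIdx` and `threadIdx`; on a GPU those iterations would run as separate threads.

```c
#include <stddef.h>

/* Row-major offset of element (row, col) in an n-by-n array. */
static size_t idx(int row, int col, int n) {
    return (size_t)row * (size_t)n + (size_t)col;
}

/* ASSUMPTION: the serial code does A[i][j] += B[i][j];
   the original loop body is truncated, so this is a guess. */
static void add_serial(int *A, const int *B, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            A[idx(i, j, n)] += B[idx(i, j, n)];
}

/* Work done by one (hypothetical) thread at global coordinates (i, j).
   The bounds check matters because the grid is rounded up and may
   contain more threads than there are array elements. */
static void add_one_thread(int *A, const int *B, int n, int i, int j) {
    if (i < n && j < n)
        A[idx(i, j, n)] += B[idx(i, j, n)];
}

/* Host-side emulation of a 2D launch with block-by-block threads per block.
   bx/by play the role of blockIdx.x/y, tx/ty of threadIdx.x/y. */
static void add_emulated(int *A, const int *B, int n, int block) {
    int grid = (n + block - 1) / block;  /* round up so every element is covered */
    for (int by = 0; by < grid; ++by)
        for (int bx = 0; bx < grid; ++bx)
            for (int ty = 0; ty < block; ++ty)
                for (int tx = 0; tx < block; ++tx) {
                    int i = bx * block + tx;  /* blockIdx.x * blockDim.x + threadIdx.x */
                    int j = by * block + ty;  /* blockIdx.y * blockDim.y + threadIdx.y */
                    add_one_thread(A, B, n, i, j);
                }
}
```

With n = 3 and a block size of 2, the emulated grid covers 4-by-4 coordinates, so some "threads" fall outside the array and are skipped by the bounds check. Both `add_serial` and `add_emulated` should leave A with the same contents.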