CUDA C programming with 2 video cards
I am very new to CUDA programming and was reading the 'CUDA C Programming Guide' provided by nvidia. ( http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Programming_Guide.pdf ) In the page 25, it has the following C code that does the matrix multiplication. Can you please tell me how can I make that code run on two devices? (if I have two nvida CUDA capable cards installed in my computer). Can you please show me with an example. // Matrices are stored in row-major order: // M(row, col) = *(M.elements + row * M.stride + col) typedef struct { int width; int height; int