GPU Shared Memory Bank Conflict
Question

I am trying to understand how bank conflicts occur. Suppose I have an array of 256 elements in global memory and a single block of 256 threads, and I want to copy the array into shared memory, so that every thread copies one element:

    shared_a[threadIdx.x] = global_a[threadIdx.x]

Does this simple operation result in a bank conflict?

Now suppose the array is larger than the number of threads, so I am using the following to copy from global memory to shared memory:

    tid = threadIdx.x