Why is my rather trivial CUDA program erring with certain arguments?

僤鯓⒐⒋嵵緔 提交于 2019-12-13 00:38:09

问题


I made a simple CUDA program for practice. It simply copies over data from one array to another:

import pycuda.driver as cuda
import pycuda.autoinit
import numpy as np
from pycuda.compiler import SourceModule

# Global constants
N = 2**20 # size of array a
a = np.linspace(0, 1, N)
e = np.empty_like(a)
block_size_x = 512

# Instantiate block and grid sizes.
block_size = (block_size_x, 1, 1)
grid_size = (N / block_size_x, 1)

# Create the CUDA kernel, and run it.
mod = SourceModule("""
  __global__ void D2x_kernel(double* a, double* e, int N) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    if (tid > 0 && tid < N - 1) {
      e[tid] = a[tid];
    }
  }
""")
func = mod.get_function('D2x_kernel')
func(a, cuda.InOut(e), np.int32(N), block=block_size, grid=grid_size)
print str(e) 

However, I get this error: pycuda._driver.LogicError: cuLaunchKernel failed: invalid value

When I get rid of the second argument double* e in my kernel function and invoke the kernel without the argument e, the error goes away. Why is that? What does this error mean?


回答1:


Your a array does not exist in device memory, so I suspect that PyCUDA is ignoring (or otherwise handling) the first argument to your kernel invocation and only passing in e and N...so you get an error because the kernel was expecting three arguments and it has only received two. Removing double* e from your kernel definition might eliminate the error message you're getting, but your kernel still won't work properly.

A quick fix to this should be to wrap a in a cuda.In() call, which instructs PyCUDA to copy a to the device before launching the kernel. That is, your kernel launch line should be:

func(cuda.In(a), cuda.InOut(e), np.int32(N), block=block_size, grid=grid_size)

Edit: Also, do you realize that your kernel is not copying the first and last elements of a to e? Your if (tid > 0 && tid < N - 1) statement is preventing that. For the entire array, it should be if (tid < N).



来源:https://stackoverflow.com/questions/13186200/why-is-my-rather-trivial-cuda-program-erring-with-certain-arguments

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!