I need to dynamically allocate some arrays inside the kernel function. How can a I do that?
My code is something like that:
__global__ func(float *gr
Maybe you should test
cudaMalloc(&foo,sizeof(int) * ARRAY_SIZE * ITERATIONS); cudaFree(foo);
instead
for (int i = 0; i < ITERATIONS; ++ i) { if (i == 1) cuda_malloc_timer.start(); // let it warm up one cycle int * foo; cudaMalloc(&foo, sizeof(int) * ARRAY_SIZE); cudaFree(foo); }