问题
I want to measure the performance of different devices viz CPU and GPUs. This is my kernel code:
__kernel void dataParallel(__global int* A)
{
sleep(10);
A[0]=2;
A[1]=3;
A[2]=5;
int pnp;//pnp=probable next prime
int pprime;//previous prime
int i,j;
for(i=3;i<10;i++)
{
j=0;
pprime=A[i-1];
pnp=pprime+2;
while((j<i) && A[j]<=sqrt((float)pnp))
{
if(pnp%A[j]==0)
{
pnp+=2;
j=0;
}
j++;
}
A[i]=pnp;
}
}
However the sleep()
function doesnt work. I am getting the following error in buildlog:
<kernel>:4:2: warning: implicit declaration of function 'sleep' is invalid in C99
sleep(10);
builtins: link error: Linking globals named '__gpu_suld_1d_i8_trap': symbol multiply defined!
Is there any other way to implement the function. Also is there a way to record the time taken to execute this code snippet.
P.S. I have included #include <unistd.h>
in my host code.
回答1:
You dont need to use sleep in your kernel to measure the execution time.
There are two ways to measure the time. 1. Use opencl inherent profiling look here: cl api
get timestamps in your hostcode and compare them before and after execution. example:
double start = getTimeInMS(); //The kernel starts here clEnqueueNDRangeKernel(command_queue, kernel, 1, NULL, &tasksize, &local_size_in, 0, NULL, NULL) //wait for kernel execution clFinish(command_queue); cout << "kernel execution time " << (getTimeInMS() - start) << endl;
Where getTimeinMs() is a function that returns a double value of miliseconds: (windows specific, override with other implementation if you dont use windows)
static inline double getTimeInMS(){
SYSTEMTIME st;
GetLocalTime(&st);
return (double)st.wSecond * (double)1000 + (double)st.wMilliseconds;}
Also you want to:
#include <time.h>
For Mac it would be (could work on Linux as well, not sure):
static inline double getTime() {
struct timeval starttime;
gettimeofday(&starttime, 0x0);
return (double)starttime.tv_sec * (double)1000 + (double)starttime.tv_usec / (double)1000;}
来源:https://stackoverflow.com/questions/31203240/implement-sleep-in-opencl-c