gpu

OpenCL struct values correct on CPU but not on GPU

白昼怎懂夜的黑 submitted on 2019-12-11 02:47:30
Question: I have a struct in a file which is included by both the host code and the kernel:

typedef struct {
    float x, y, z, dir_x, dir_y, dir_z;
    int radius;
} WorklistStruct;

I'm building this struct in my C++ host code and passing it via a buffer to the OpenCL kernel. If I choose a CPU device for the computation, I get the following result:

printf("item:[%f,%f,%f][%f,%f,%f]%d,%d\n", item.x, item.y, item.z, item.dir_x, item.dir_y, item.dir_z, item.radius, sizeof(float));

Host: item:[20.169043,7
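The excerpt is cut off, but a common cause of "correct on CPU, wrong on GPU" with struct buffers is a layout mismatch: the host compiler and the OpenCL device compiler may pad or align the struct differently, so the GPU reads shifted fields. One defensive pattern (a sketch of an assumption, not necessarily this question's actual fix) is to declare the host side with the cl_* types and explicit padding, and to assert the size so a mismatch is caught at compile time:

```cpp
// Hedged sketch: make the host struct layout explicit so it can match the
// layout the OpenCL C compiler produces for the same field list.
#include <CL/cl.h>

typedef struct {
    cl_float x, y, z;
    cl_float dir_x, dir_y, dir_z;
    cl_int   radius;
    cl_int   _pad;   // explicit padding instead of compiler-inserted padding
} WorklistStruct;    // keep an equivalent float/int (+ padding) definition in the .cl file

static_assert(sizeof(WorklistStruct) == 32,
              "host struct layout must match the kernel's struct layout");
```

A quick way to verify the device side is to have a small kernel write sizeof(WorklistStruct) back to the host and compare the two values.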

OpenGL: render time limit on linux

橙三吉。 submitted on 2019-12-11 02:19:28
Question: I'm implementing a computation algorithm with OpenGL and Qt. All computations are executed in a fragment shader. Sometimes, when I try to execute heavy computations (ones that take more than 5 seconds on the GPU), OpenGL aborts the computation before it ends. I suppose this is a mechanism like TDR on Windows. I think I should split the input data into several parts, but I need to know how long a computation is allowed to run. How can I obtain the render time limit on Linux (it would be great if there were a crossplatform
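The question is truncated, but since the goal is to stay under a driver watchdog, one practical approach is to measure how long each chunk of work actually takes with an OpenGL timer query and size the chunks conservatively, rather than trying to query the (driver-specific) limit itself. A hedged C++ sketch, assuming a GL 3.3+ context and an extension loader such as GLEW; drawChunk is a placeholder for one slice of the computation:

```cpp
#include <GL/glew.h>   // assumption: GLEW (or another loader) and a current GL >= 3.3 context

// Sketch: time one slice of the fragment-shader work on the GPU so the input
// can be split into chunks that stay well below the watchdog threshold.
double timeChunkMs(void (*drawChunk)())
{
    GLuint query = 0;
    glGenQueries(1, &query);

    glBeginQuery(GL_TIME_ELAPSED, query);
    drawChunk();                                      // placeholder draw call(s)
    glEndQuery(GL_TIME_ELAPSED);

    GLuint64 ns = 0;
    glGetQueryObjectui64v(query, GL_QUERY_RESULT, &ns);  // waits for the GPU to finish
    glDeleteQueries(1, &query);

    return ns / 1.0e6;   // milliseconds; keep chunks far below the abort threshold
}
```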

Build Successful but not running on simulator

那年仲夏 submitted on 2019-12-11 02:05:09
Question: I downloaded the code of Brad Larson from here. When I run it, the build succeeds but it does not run in the simulator. Please point me in the right direction. I checked that the method in the app delegate file, - (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions { ....... }, is not called. Thanks. Answer 1: The scheme you're trying to run is probably not set correctly (it wasn't for me when I downloaded the code either). Just change the scheme to the one you

Race Condition in CUDA programs

戏子无情 submitted on 2019-12-11 01:36:51
Question: I have two pieces of code: one written in C and the corresponding operation written in CUDA. Please help me understand how __syncthreads() works in the context of the following programs. As I understand it, __syncthreads() only synchronizes the threads within a single block.

The C program:

{ for(i=1;i<10000;i++) { t=a[i]+b[i]; a[i-1]=t; } }

The equivalent CUDA program:

__global__ void kernel0(int *b, int *a, int *t, int N) { int b0=blockIdx.x; int t0=threadIdx.x; int tid=b0*blockDim.x
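Reading the C loop carefully: iteration i reads a[i], which no earlier iteration has written (it is iteration i+1 that overwrites it), so in the sequential code every output a[i-1] is computed from the original a and b. The hazard when parallelizing in place is therefore write-after-read between neighbouring elements, which __syncthreads() can only guard within one block. Writing results into a separate buffer (note the kernel already receives a t pointer) removes the hazard entirely. A small plain-C++ illustration of that reasoning; the function name and use of std::vector are mine, not from the question:

```cpp
#include <vector>

// Each output depends only on the ORIGINAL inputs, so writing into a separate
// buffer t makes every iteration independent - safe to parallelize, whether
// across CPU threads or CUDA blocks, with no synchronization needed.
void shift_add(const std::vector<int>& a, const std::vector<int>& b,
               std::vector<int>& t)        // t[i-1] plays the role of a[i-1]
{
    for (std::size_t i = 1; i < a.size(); ++i)
        t[i - 1] = a[i] + b[i];            // never reads anything this loop writes
}
```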

Hardware Accelerated Image Scaling in windows using C++

坚强是说给别人听的谎言 submitted on 2019-12-11 00:36:20
Question: I have to scale a bitmap image (e.g. 1280 x 720 to 1920 x 180 and vice versa). I use this scaling while capturing video from the screen. Software-based scaling consumes a lot of CPU and is slow as well. Is there any hardware-accelerated API or library to perform the scaling? Some methods are discussed in the thread "How to use hardware video scalers?", but with no final conclusion. Support needed: Windows 7 onwards. Answer 1: If you have an IDirect3DTexture9 of the image to be scaled, you can use
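The answer is cut off right after "you can use". A natural continuation (my assumption, not the quoted author's words) is IDirect3DDevice9::StretchRect, which performs filtered surface-to-surface scaling on the GPU. A hedged C++ sketch; StretchRect has surface-type restrictions (e.g. render targets), so treat this as an outline rather than a drop-in implementation:

```cpp
#include <d3d9.h>

// Sketch: scale the contents of srcTex into dstTex on the GPU via StretchRect.
HRESULT ScaleOnGpu(IDirect3DDevice9* device,
                   IDirect3DTexture9* srcTex, IDirect3DTexture9* dstTex)
{
    IDirect3DSurface9* srcSurf = nullptr;
    IDirect3DSurface9* dstSurf = nullptr;
    srcTex->GetSurfaceLevel(0, &srcSurf);
    dstTex->GetSurfaceLevel(0, &dstSurf);

    // NULL rects mean "whole surface"; linear filtering performs the resize.
    HRESULT hr = device->StretchRect(srcSurf, nullptr, dstSurf, nullptr,
                                     D3DTEXF_LINEAR);
    srcSurf->Release();
    dstSurf->Release();
    return hr;
}
```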

how to find out the RAM and GPU information of my visitors?

為{幸葍}努か submitted on 2019-12-11 00:28:09
Question: I want to know how much RAM my visitors have and all the information available about their GPU. Is there any way to achieve this via JavaScript, or maybe ActionScript (Flash)? Answer 1: JavaScript, browser extensions and plugins are so heavily sandboxed that they have limited to no access to the system, for security purposes. Only limited hardware can be accessed directly (with the user's consent), such as the camera and microphone through JavaScript's getUserMedia or Flash. The nearest you can get is to have

Using gpu::GpuMat in OpenCV C++

好久不见. submitted on 2019-12-10 23:46:58
Question: I would like to know how I can modify a gpu::GpuMat. In fact, I would like to know whether it is possible to use a gpu::GpuMat like a cv::Mat. I would like to do something like this:

cv::namedWindow("Result");
cv::Mat src_host = cv::imread("lena.jpg", CV_LOAD_IMAGE_GRAYSCALE);
cv::gpu::GpuMat dst, src;
src.upload(src_host);
for (unsigned int y = 0; y < src.rows; y++){
    for (unsigned int x = 0; x < src.cols; x++){
        src.at<uchar>(y,x) = 0;
    }
}
cv::Mat result_host;
dst.download(result_host);
cv:
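GpuMat data lives in device memory, so per-element access with at<>() as in the loop above is not available on the host; the data has to be modified either with the cv::gpu functions or after downloading it into a cv::Mat. A brief sketch using the same OpenCV 2.x cv::gpu API as the question (in OpenCV 3+ the module became cv::cuda):

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>   // OpenCV 2.x GPU module, as in the question

int main()
{
    cv::Mat src_host = cv::imread("lena.jpg", CV_LOAD_IMAGE_GRAYSCALE);

    cv::gpu::GpuMat src;
    src.upload(src_host);

    // Option 1: operate on the device directly - no per-pixel host loop.
    src.setTo(cv::Scalar(0));            // zero every pixel on the GPU

    // Option 2: download, edit with cv::Mat::at<>(), then upload again.
    cv::Mat host;
    src.download(host);
    host.at<uchar>(0, 0) = 255;          // ordinary host-side element access
    src.upload(host);

    return 0;
}
```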

Using multiple_gpu_model on keras - causing resource exhaustion

邮差的信 submitted on 2019-12-10 23:36:34
Question: I built my network the following way:

# Build U-Net model
inputs = Input((IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS))
s = Lambda(lambda x: x / 255) (inputs)
width = 64
c1 = Conv2D(width, (3, 3), activation='relu', padding='same') (s)
c1 = Conv2D(width, (3, 3), activation='relu', padding='same') (c1)
p1 = MaxPooling2D((2, 2)) (c1)
c2 = Conv2D(width*2, (3, 3), activation='relu', padding='same') (p1)
c2 = Conv2D(width*2, (3, 3), activation='relu', padding='same') (c2)
p2 = MaxPooling2D((2, 2)) (c2)
c3

How to give an option to select graphics adapter in a DirectX 11 application?

我怕爱的太早我们不能终老 submitted on 2019-12-10 22:57:37
Question: I think I know how it should work, only it does not. I have a Lenovo laptop with a GeForce 860M and an integrated Intel card. I can launch my application from outside with either GPU, and everything works fine: the selected GPU becomes the adapter with index 0, it has the laptop screen as its output, and so on. However, if I try to use the adapter with index 1 (if I run the app normally, that is the NVIDIA; if I run it with the NVIDIA GPU, that is the Intel), IDXGIAdapter::EnumOutputs does not find anything, so I
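The excerpt stops mid-sentence, but the behaviour it describes is typical of Optimus laptops: the display outputs are attached to the integrated adapter, so EnumOutputs on the other adapter may legitimately return nothing, and the application can still render on the chosen adapter and present through the swap chain. Selecting the adapter itself is straightforward: enumerate with IDXGIFactory1 and pass the chosen adapter to D3D11CreateDevice with D3D_DRIVER_TYPE_UNKNOWN. A sketch; the helper name and the use of ComPtr are my choices, not from the question:

```cpp
#include <dxgi.h>
#include <d3d11.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Sketch: create the D3D11 device on the adapter the user selected by index.
// With an explicit adapter, the driver type must be D3D_DRIVER_TYPE_UNKNOWN.
HRESULT CreateDeviceOnAdapter(UINT adapterIndex,
                              ComPtr<ID3D11Device>& device,
                              ComPtr<ID3D11DeviceContext>& context)
{
    ComPtr<IDXGIFactory1> factory;
    HRESULT hr = CreateDXGIFactory1(IID_PPV_ARGS(&factory));
    if (FAILED(hr)) return hr;

    ComPtr<IDXGIAdapter1> adapter;
    hr = factory->EnumAdapters1(adapterIndex, &adapter);
    if (FAILED(hr)) return hr;                 // e.g. index out of range

    D3D_FEATURE_LEVEL obtained;
    return D3D11CreateDevice(adapter.Get(), D3D_DRIVER_TYPE_UNKNOWN, nullptr, 0,
                             nullptr, 0,       // default feature levels
                             D3D11_SDK_VERSION, &device, &obtained, &context);
}
```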

Profiling GPU usage in C#

…衆ロ難τιáo~ submitted on 2019-12-10 21:14:54
Question: I am writing a C# application that is GPU-accelerated using EMGU's GpuInvoke method. I would like to profile my code and look at the load on the GPU and the amount of GPU memory I'm using, but I'm having trouble finding a good way to do that. It seems like it should be simple, but I can't figure out what I'm missing. Thank you. Answer 1: Some options:
- using Performance Monitor (perfmon.exe), which is the easiest tool to use
- using tools like GPU-Z
- using a performance kit from the GPU hardware vendor