nvidia

Compute Prof's fields for incoherent and coherent gst/gld? (CUDA/OpenCL)

时光总嘲笑我的痴心妄想 submitted on 2019-12-08 06:40:56
Question: I am using Compute Prof 3.2 and a GeForce GTX 280, so I believe I have compute capability 1.3. This file seems to show that I should be able to see these fields, since I am using a 1.x compute device. However, I don't see them, and the User Guide for the 3.2 toolkit says I can't see them, but calls them gst_uncoalesced and gst_coalesced. To sum up, I am confused about how I should figure out from the profiler whether I am making non-coalesced reads from global memory. It doesn't look like Fermi cards

Webgl flickering in Chrome on Windows x64 with nvidia GPU

跟風遠走 submitted on 2019-12-08 02:50:16
Question: I see a weird flickering of some rendered geometry in Chrome on Windows 10 x64 with nVidia chips. I've also tested it in Chrome for Linux, Firefox for both platforms, Android, and with an Intel GPU. It works fine everywhere except on the one platform mentioned. A minimal example looks like this: Vertex shader: precision mediump float; smooth out vec2 pointCoord; const vec2 vertexCoord[] = vec2[]( vec2(0.0, 0.0), vec2(1.0, 0.0), vec2(1.0, 1.0), vec2(0.0, 0.0), vec2(1.0, 1.0), vec2(0.0, 1.0) ); void main

Compile and build .cl file using NVIDIA's nvcc Compiler?

亡梦爱人 submitted on 2019-12-08 02:49:44
Question: Is it possible to compile a .cl file using NVIDIA's nvcc compiler? I am trying to set up Visual Studio 2010 to write OpenCL code on the CUDA platform. But when I select the CUDA C/C++ Compiler to compile and build the .cl file, it gives me errors like nvcc does not exist. What is the issue? Answer 1: You should be able to use nvcc to compile OpenCL codes. Normally, I would suggest using a filename extension of .c for a C-compliant code, and .cpp for a C++ compliant code(*), however nvcc has filename extension

Use Vulkan VkImage as a CUDA cuArray

房东的猫 submitted on 2019-12-08 01:34:44
Question: What is the correct way of using a Vulkan VkImage as a CUDA cuArray? I've been trying to follow some examples, however I get a CUDA_ERROR_INVALID_VALUE on a call to cuExternalMemoryGetMappedMipmappedArray(). To provide the information in an ordered way: I'm using CUDA 10.1. The base code comes from https://github.com/SaschaWillems/Vulkan; in particular I'm using the 01 - Vulkan Gears demo, enriched with the saveScreenshot method from 09 - Capturing screenshots. Instead of saving the snapshot image to a

What can I do against 'CUDA driver version is insufficient for CUDA runtime version'?

泄露秘密 submitted on 2019-12-08 01:32:21
Question: When I go to /usr/local/cuda/samples/1_Utilities/deviceQuery and execute
moose@pc09 /usr/local/cuda/samples/1_Utilities/deviceQuery $ sudo make clean
rm -f deviceQuery deviceQuery.o
rm -rf ../../bin/x86_64/linux/release/deviceQuery
moose@pc09 /usr/local/cuda/samples/1_Utilities/deviceQuery $ sudo make
"/usr/local/cuda-7.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch
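
The error in the title generally means that the installed driver supports an older CUDA version than the runtime the program was built against. As a minimal sketch (in Python for brevity), the check below compares two version integers. It assumes the values were obtained from cudaDriverGetVersion() and cudaRuntimeGetVersion(), which encode a version as 1000 * major + 10 * minor. The helper names decode_cuda_version and driver_is_sufficient are hypothetical and not part of any CUDA API.

```python
def decode_cuda_version(value):
    """Split a CUDA version integer (1000 * major + 10 * minor) into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def driver_is_sufficient(driver_version, runtime_version):
    """True when the driver supports at least the CUDA version of the runtime.

    Both arguments are assumed to be the integers reported by
    cudaDriverGetVersion() and cudaRuntimeGetVersion().
    """
    return driver_version >= runtime_version


# Hypothetical values: a driver that supports CUDA 6.5 with a CUDA 7.0 runtime.
driver, runtime = 6050, 7000
if not driver_is_sufficient(driver, runtime):
    d_major, d_minor = decode_cuda_version(driver)
    r_major, r_minor = decode_cuda_version(runtime)
    print(f"driver supports CUDA {d_major}.{d_minor}, runtime needs {r_major}.{r_minor}")
```

When the driver reports the older version, the usual options are to update the driver or to build against an older toolkit.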

Handling Ctrl+C exception with GPU

断了今生、忘了曾经 submitted on 2019-12-08 00:48:31
I am working with some GPU programs (using CUDA 4.1 and C), and sometimes (rarely) I have to kill the program midway using Ctrl+C to handle some exception. Earlier I tried using the cudaDeviceReset() function, but this reply by talonmies shook my trust in cudaDeviceReset(), and hence I started handling such exceptions the old-fashioned way, that is, restarting the computer. As the project size grows, this method is becoming a headache. I would appreciate it if anyone has come up with a better solution.
harrism: I think this question is more fundamental -- it is really an app design issue and not a CUDA
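
One common way to treat Ctrl+C as an application design matter is to let the signal handler only set a flag, and to have the main loop check that flag between units of work so that cleanup runs on a normal code path. The sketch below shows this pattern in Python for brevity; the same structure maps to a C signal handler that sets a flag. It is an illustration and not the approach described in the truncated reply above. The callables launch and cleanup are hypothetical stand-ins for launching GPU work and for releasing device resources.

```python
import signal

# Set by the handler; polled by the main loop between work items.
stop_requested = False


def request_stop(signum, frame):
    """SIGINT handler: only record the request, do no cleanup here."""
    global stop_requested
    stop_requested = True


def run(work_items, launch, cleanup):
    """Process items until finished or until Ctrl+C is seen.

    Returns the number of items that were launched. `cleanup` always runs,
    so resources are released on both the normal and the interrupted path.
    """
    global stop_requested
    stop_requested = False
    previous = signal.signal(signal.SIGINT, request_stop)
    launched = 0
    try:
        for item in work_items:
            if stop_requested:
                break
            launch(item)
            launched += 1
    finally:
        cleanup()
        # Restore whatever handler was installed before.
        signal.signal(signal.SIGINT, previous)
    return launched
```

Because the handler does nothing but set a flag, the interruption is only noticed between work items, so a single long-running item still has to finish before the program exits.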

Disable Nvidia watchdog with OpenCL on Mac OS X 10.7.4

眉间皱痕 submitted on 2019-12-08 00:44:58
Question: I have an OpenCL program which runs fine for small problems, but for larger problems it exceeds the 8-10 s time limit for running kernels on Nvidia hardware. Although I have no monitors attached to the GPU I am computing on (Nvidia GTX580), the kernel is always terminated once it has run for around 8-10 s. The preliminary research I did on this problem indicates that the Nvidia watchdog should only enforce the time limit if a monitor is connected to the graphics card. However I do not

Yocto for Nvidia Jetson fails because of GCC 7 - cannot compute suffix of object files

こ雲淡風輕ζ submitted on 2019-12-08 00:33:31
Question: I am trying to use Yocto with meta-tegra ( https://github.com/madisongh/meta-tegra ) to build a minimal system for the Nvidia Jetson Nano. I need to use CUDA (current version 10 for Nano) with OpenCV on this platform. CUDA 10 only supports GCC 7, not GCC 8. GCC 7 has been deprecated and removed from the OpenEmbedded Warrior release in favor of GCC 8.3. My error has to do with trying to use GCC 7 with the Warrior release of OE: configure: error: cannot compute suffix of object files: cannot compile

My GPU has 2 multiprocessors with 48 CUDA cores each. What does this mean?

点点圈 submitted on 2019-12-07 20:41:39
Question: My GPU has 2 multiprocessors with 48 CUDA cores each. Does this mean that I can execute 96 thread blocks in parallel? Answer 1: No, it doesn't. From chapter 4 of the CUDA C programming guide: The number of blocks and warps that can reside and be processed together on the multiprocessor for a given kernel depends on the amount of registers and shared memory used by the kernel and the amount of registers and shared memory available on the multiprocessor. There are also a maximum number of resident
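
The dependence on registers and shared memory quoted above can be made concrete with a rough estimate: the number of resident blocks per multiprocessor is the smallest of the limits imposed by each resource. The sketch below is a simplified illustration in Python. It ignores allocation granularity, and the per-multiprocessor limits in EXAMPLE_LIMITS are assumed values chosen for illustration, not figures taken from the question. The function name resident_blocks_per_sm is hypothetical.

```python
def resident_blocks_per_sm(threads_per_block, regs_per_thread, smem_per_block, limits):
    """Rough upper bound on the blocks that can be resident on one multiprocessor.

    Simplified: ignores register and shared-memory allocation granularity,
    so real hardware may fit fewer blocks than this estimate.
    """
    candidates = [
        limits["max_blocks"],
        limits["max_threads"] // threads_per_block,
    ]
    regs_per_block = regs_per_thread * threads_per_block
    if regs_per_block > 0:
        candidates.append(limits["registers"] // regs_per_block)
    if smem_per_block > 0:
        candidates.append(limits["shared_mem_bytes"] // smem_per_block)
    return min(candidates)


# Assumed per-multiprocessor limits, for illustration only.
EXAMPLE_LIMITS = {
    "max_blocks": 8,
    "max_threads": 1536,
    "registers": 32768,
    "shared_mem_bytes": 49152,
}

# A hypothetical kernel: 256 threads per block, 20 registers per thread,
# 4096 bytes of shared memory per block.
blocks = resident_blocks_per_sm(256, 20, 4096, EXAMPLE_LIMITS)
print(blocks)      # prints 6
print(2 * blocks)  # prints 12 (two multiprocessors)
```

With these assumed numbers, both the thread limit and the register limit cap the result at 6 blocks per multiprocessor, which shows that the count is set by resource usage and not by the number of CUDA cores.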

How much registers per thread does OpenCL kernel use on Nvidia GPU?

怎甘沉沦 submitted on 2019-12-07 18:18:35
Question: My first question is how to get register usage information for OpenCL kernel code on an Nvidia GPU, in the way the nvcc compiler reports it for CUDA kernel code with the nvcc --ptxas-options=-v flag. I was able to get the same information on an AMD GPU for an OpenCL kernel from the .isa file generated while running the program, after exporting GPU_DUMP_DEVICE_KERNEL=3 . I tried the same thing on an Nvidia GPU, but no .isa file was generated. My second question is why the Nvidia GPU does not generate an .isa file. After googling I