nvidia | 易学教程

External calls are not supported - CUDA

The objective is to call a device function defined in another file. When I compile the global kernel, it shows the following error: *External calls are not supported (found non-inlined call to _Z6GoldenSectionCUDA)*. Problematic code (not the full code, only where the problem arises): cat norm.h # ifndef NORM_H_ # define NORM_H_ # include<stdio.h> __device__ double invcdf(double prob, double mean, double stddev); #endif cat norm.cu # include <norm.h> __device__ double invcdf(double prob, double mean, double stddev) { return (mean + stddev*normcdfinv(prob)); } cat test.cu # include <norm.h> #

Why does vkGetPhysicalDeviceMemoryProperties return multiple identical memory types?

Question: So, I'm gathering some info about my device in Vulkan during initialization and find a unique (or rather, quite similar) set of memory types returned by vkGetPhysicalDeviceMemoryProperties: Device Name: GeForce GTX 1060 3GB Device ID: 7170 Device Type: 2 Device Vendor ID: 4318 Device API Version: 4194369 (1.0.65) Device Driver Version: 1636843520 (390.65) Device Heaps: 0 -> Size: 3133145088 Flags: 1 1 -> Size: 8523874304 Flags: 0 Device Memory: 0 -> Index: 1 Flags: 0 1 -> Index: 1 Flags: 0 2
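The raw integers in the excerpt above (4194369 and 1636843520) are packed version numbers. The following is a minimal sketch of how they could be unpacked; the function names are hypothetical. The API version layout follows the Vulkan 1.0 `VK_VERSION_MAJOR`/`MINOR`/`PATCH` macros. The driver version layout is vendor-specific, and the bit split used here is an assumption that happens to reproduce the 390.65 shown in the excerpt.

```python
def decode_vulkan_version(packed):
    """Unpack a Vulkan 1.0-style version: major in bits 22-31,
    minor in bits 12-21, patch in bits 0-11."""
    major = (packed >> 22) & 0x3FF
    minor = (packed >> 12) & 0x3FF
    patch = packed & 0xFFF
    return (major, minor, patch)


def decode_nvidia_driver_version(packed):
    """Unpack a driver version, ASSUMING the NVIDIA-style layout of
    10 bits major (bits 22-31) and 8 bits minor (bits 14-21)."""
    major = (packed >> 22) & 0x3FF
    minor = (packed >> 14) & 0xFF
    return (major, minor)


print(decode_vulkan_version(4194369))            # (1, 0, 65)
print(decode_nvidia_driver_version(1636843520))  # (390, 65)
```

Other vendors may pack `driverVersion` differently, so the second function should not be applied blindly to non-NVIDIA devices.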

CUDA fails when trying to use both onboard iGPU and Nvidia discrete card. How can i use both discrete nvidia and integrated (onboard) intel gpu? [closed]

Question: I recently had some trouble making my PC (Ivy Bridge) use the onboard GPU (Intel HD 4000 iGPU) for normal screen display while I run my CUDA programs for computations on the discrete Nvidia GT 640 I have in my machine. The problem was that under iGPU display, CUDA would be unable to spot the Nvidia card,

Strategies for timing CUDA Kernels: Pros and Cons?

Question: When timing CUDA kernels, the following doesn't work because the kernel doesn't block the CPU program execution while it executes: start timer; kernel<<<g,b>>>(); end timer. I've seen three basic ways of (successfully) timing CUDA kernels: (1) Two CUDA eventRecords. float responseTime; //result will be in milliseconds cudaEvent_t start; cudaEventCreate(&start); cudaEventRecord(start); cudaEventSynchronize(start); cudaEvent_t stop; cudaEventCreate(&stop); kernel<<<g,b>>>(); cudaEventRecord(stop)

Is it possible to run CUDA on AMD GPUs?

I'd like to extend my skill set into GPU computing. I am familiar with raytracing and realtime graphics (OpenGL), but the next generation of graphics and high performance computing seems to be in GPU computing or something like it. I currently use an AMD HD 7870 graphics card on my home computer. Could I write CUDA code for this? (My intuition is no, but since Nvidia released the compiler binaries I might be wrong.) A second, more general question is: where do I start with GPU computing? I'm certain this is an often-asked question, but the best I saw was from '08 and I figure the field has

Calculation on GPU leads to driver error “stopped responding”

I have this little nonsense script here which I am executing in MATLAB R2013b: clear all; n = 2000; times = 50; i = 0; tCPU = tic; disp 'CPU::' A = rand(n, n); B = rand(n, n); disp '::Go' for i = 0:times CPU = A * B; end tCPU = toc(tCPU); tGPU = tic; disp 'GPU::' A = gpuArray(A); B = gpuArray(B); disp '::Go' for i = 0:times GPU = A * B ; end tGPU = toc(tGPU); fprintf('On CPU: %.2f sec\nOn GPU: %.2f sec\n', tCPU, tGPU); Unfortunately, after execution I receive a message from Windows saying: "Display driver stopped working and has recovered." I assume this means that Windows did not get

CUDA GPU selected by position, but how to set default to be something other than device 0?

Question: I've recently installed a second GPU (Tesla K40) on my machine at home, and my searches have suggested that the first PCI slot becomes the default GPU chosen for CUDA jobs. A great link explaining it can be found here: Default GPU Assignment. My original GPU is a TITAN X, also CUDA enabled, but it's really best for single precision calculations and the Tesla is better for double precision. My question for the group is whether there is a way to set up my default CUDA programming device to be the
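One common way to change which GPU a CUDA program sees as device 0, without touching the program itself, is to set the `CUDA_VISIBLE_DEVICES` and `CUDA_DEVICE_ORDER` environment variables before the program starts. The sketch below only builds such an environment; the helper name `cuda_env` is hypothetical, and the index `1` is an assumption about where the second card would be enumerated on a given machine.

```python
import os


def cuda_env(device_index, base_env=None):
    """Return a copy of the environment that exposes a single GPU to CUDA.

    Inside the launched process the chosen GPU appears as device 0.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Enumerate devices by PCI bus ID so that indices are stable.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Hide every GPU except the requested one.
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env


# Hypothetical usage: launch a CUDA binary that should only see GPU 1.
# import subprocess
# subprocess.run(["./my_cuda_app"], env=cuda_env(1))
```

Within a program, calling `cudaSetDevice` is the alternative; the environment-variable route is useful when the code cannot be changed.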

Cuda Random Number Generation

I was wondering what the best way is to generate one pseudo-random number between 0 and 49k that would be the same for each thread, using curand or something else. I prefer to generate the random numbers inside the kernel because I will have to generate one at a time, but about 10k times. I could use floats between 0.0 and 1.0, but I have no idea how to make my PRN available to all threads, because most posts and examples show how to get a different PRN for each thread. Thanks. Probably you just need to study the curand documentation, especially for the device API. The key to getting

Testing GPU with tensorflow matrix multiplication

Question: As many machine learning algorithms rely on matrix multiplication (or at least can be implemented using matrix multiplication), to test my GPU I plan to create matrices a and b, multiply them, and record the time it takes for the computation to complete. Here is code that will generate two matrices of dimensions 300000,20000 and multiply them: import tensorflow as tf import numpy as np init = tf.global_variables_initializer() sess = tf.Session() sess.run(init) #a = np.array([[1, 2, 3], [4, 5, 6]]) #b
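Before running a test of this size, it may help to estimate how much memory one operand needs. The sketch below is a back-of-the-envelope calculation, assuming 4-byte (float32) elements; the excerpt does not state the dtype, and the helper name `matrix_bytes` is hypothetical.

```python
def matrix_bytes(rows, cols, bytes_per_element=4):
    """Memory needed to store a dense rows x cols matrix.

    bytes_per_element=4 ASSUMES float32; use 8 for float64.
    """
    return rows * cols * bytes_per_element


size = matrix_bytes(300_000, 20_000)
print(f"{size / 1e9:.1f} GB per matrix")  # 24.0 GB per matrix
```

Under that assumption a single 300000 x 20000 operand already exceeds the memory of most consumer GPUs, so a much smaller size is a more practical starting point for a timing test.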

Error compiling CUDA from Command Prompt

I'm trying to compile a CUDA test program on Windows 7 via Command Prompt. I'm using this command: nvcc test.cu But all I get is this error: nvcc fatal : Cannot find compiler 'cl.exe' in PATH What may be causing this error? You will need to add the folder containing the "cl.exe" file to your PATH environment variable. For example: C:\Program Files\Microsoft Visual Studio 10.0\VC\bin Edit: OK, go to My Computer -> Properties -> Advanced System Settings -> Environment Variables. Here, look for "PATH" in the list and add the path above (or wherever your cl.exe is located). Solve this problem
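A quick way to confirm whether the PATH change took effect is to look the compiler up the same way a shell would. This is a minimal sketch using Python's standard `shutil.which`; the helper name `find_compiler` is hypothetical, and it only reports what the current process's PATH contains.

```python
import shutil


def find_compiler(name="cl", path=None):
    """Return the full path of the executable, or None if it is not on PATH.

    On Windows, shutil.which also tries the extensions listed in PATHEXT,
    so "cl" resolves to "cl.exe" when it is present.
    """
    return shutil.which(name, path=path)


location = find_compiler()
if location is None:
    print("cl.exe is not on PATH; nvcc will not find it either")
else:
    print("found compiler at", location)
```

A Command Prompt opened before the environment variable was edited keeps the old PATH, so the check should be run from a newly opened window.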