gpu

Should I gray scale the image?

眉间皱痕 submitted on 2019-12-08 05:04:42
Question: I'm categorizing 30 types of clothes from images using the R-CNN Object Detection Library from TensorFlow: https://github.com/tensorflow/models/tree/master/research/object_detection Does color matter when we collect images for training and testing? If I include only purple and blue shirts, I guess it won't recognize red shirts? Should I grayscale all images to detect the types of clothes? :)
Answer 1: Yes, colour does matter. The underlying visual feature extraction is based on a convolutional neural
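The answer is cut off above, but its visible point is that the convolutional feature extractor learns directly from the RGB values, so a training set containing only purple and blue shirts will not generalize well to red ones. Rather than grayscaling everything, one common alternative is to randomize colour during training. A minimal sketch using tf.image (my own illustration; the augmentation parameters are arbitrary examples, not from the answer):

import tensorflow as tf

def augment_colors(image):
    """Randomly perturb colour so the detector does not overfit to the
    specific shirt colours present in the training set."""
    image = tf.image.random_hue(image, max_delta=0.08)
    image = tf.image.random_saturation(image, lower=0.6, upper=1.4)
    image = tf.image.random_brightness(image, max_delta=0.1)
    return image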

Simulating pipeline program with CUDA

对着背影说爱祢 submitted on 2019-12-08 04:45:09
Question: Say I have two arrays A and B, and a kernel1 that does some calculation on both arrays (vector addition, for example) by breaking the arrays into different chunks and writing the partial results to C. kernel1 keeps doing this until all elements in the arrays are processed.

unsigned int i = blockIdx.x*blockDim.x + threadIdx.x;
unsigned int gridSize = blockDim.x*gridDim.x;
// iterate through each chunk of gridSize in both A and B
while (i < N) {
    C[i] = A[i] + B[i];
    i += gridSize;
}

Say,
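The snippet above is the standard grid-stride loop. For a runnable point of reference, here is the same pattern rendered in Python with Numba's CUDA support (my own sketch, not the poster's code; the launch configuration is an arbitrary example):

import numpy as np
from numba import cuda

@cuda.jit
def kernel1(a, b, c):
    # grid-stride loop: each thread walks the arrays in steps of the total grid size
    i = cuda.grid(1)
    stride = cuda.gridsize(1)
    while i < c.shape[0]:
        c[i] = a[i] + b[i]
        i += stride

n = 1 << 20
a = np.arange(n, dtype=np.float32)
b = np.ones(n, dtype=np.float32)

d_a = cuda.to_device(a)
d_b = cuda.to_device(b)
d_c = cuda.device_array_like(a)
kernel1[128, 256](d_a, d_b, d_c)   # far fewer threads than elements; the loop covers the rest
c = d_c.copy_to_host()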

kubernetes scheduling for expensive resources

﹥>﹥吖頭↗ submitted on 2019-12-08 03:52:23
We have a Kubernetes cluster, and now we want to expand it with GPU nodes (which would be the only nodes in the cluster that have GPUs). We'd like to prevent Kubernetes from scheduling pods on those nodes unless the pods require GPUs. Not all of our pipelines can use GPUs; the absolute majority are still CPU-heavy only. The servers with GPUs can be very expensive (for example, an Nvidia DGX can run as much as $150k per server). If we just add DGX nodes to the Kubernetes cluster, Kubernetes would schedule non-GPU workloads there too, which would be a waste of resources (e.g. other jobs that
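The question is cut off above, but the usual way to get this behaviour is to taint the GPU nodes so that only pods carrying a matching toleration (i.e. the GPU jobs) can be scheduled there. As a sketch, here is the relevant fragment of such a pod spec written out as a plain Python dict; the taint key dedicated=gpu:NoSchedule and the image name are illustrative assumptions, not from the question:

# Pod spec fragment (as a Python dict) for a pod that is allowed onto the
# tainted GPU nodes and requests one GPU via the NVIDIA device plugin.
gpu_pod_spec = {
    "tolerations": [
        {"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}
    ],
    "containers": [
        {
            "name": "trainer",
            "image": "my-training-image:latest",   # placeholder image
            "resources": {"limits": {"nvidia.com/gpu": 1}},
        }
    ],
}
# CPU-only pods carry no toleration, so the NoSchedule taint keeps them off the DGX nodes.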

Installing cuDNN for Theano without root access

岁酱吖の submitted on 2019-12-08 03:36:56
Question: Can I install cuDNN locally without root access? I don't have root access on the Linux machine I am using (the distro is openSUSE), but I already have CUDA 7.5 installed. I am using Theano and I need cuDNN to improve the speed of operations on the GPU. I downloaded cudnn-7.5-linux-x64-v5.1 from Nvidia, and as per the instructions I need to copy the cuDNN archive contents into the CUDA installation folder, i.e. cuda/lib64/ and cuda/include/. But that would require root access. Is it
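The excerpt stops mid-question, but for illustration: cuDNN does not have to live inside the CUDA installation directory; Theano only needs to be told where the headers and libraries are. A minimal sketch, assuming a Theano version that exposes the dnn.include_path and dnn.library_path config flags, with a hypothetical user-local extraction path:

import os

# Point Theano at a cuDNN copy extracted under $HOME instead of the system
# CUDA directory. The path below is hypothetical.
cudnn_root = os.path.expanduser("~/cudnn")   # contains include/ and lib64/

os.environ["THEANO_FLAGS"] = ",".join([
    "device=gpu",
    "floatX=float32",
    "dnn.include_path=" + os.path.join(cudnn_root, "include"),
    "dnn.library_path=" + os.path.join(cudnn_root, "lib64"),
])
# Note: LD_LIBRARY_PATH usually has to include ~/cudnn/lib64 *before* the Python
# process starts (e.g. set in ~/.bashrc) so the dynamic linker can find libcudnn.so.

import theano   # imported after the flags are set so they take effect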

Why does the floatX flag impact whether the GPU is used in Theano?

三世轮回 submitted on 2019-12-08 02:30:12
Question: I am testing Theano with the GPU using the script provided in the tutorial for that purpose:

# Start gpu_test.py
# From http://deeplearning.net/software/theano/tutorial/using_gpu.html#using-gpu
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker
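The excerpt is truncated, but in the linked tutorial this script ends by inspecting the compiled graph, which is also where floatX comes in: the old Theano GPU backend only supports float32, so with floatX=float64 the shared variable x is stored as float64 and the exp op stays on the CPU. A sketch of the usual check, reconstructed from the tutorial rather than from the excerpt (it assumes f from the script above):

import numpy
import theano.tensor as T

# Inspect which ops ended up in the compiled graph for f (defined above):
# a plain Elemwise means the CPU was used, a Gpu* op means the GPU was used.
if numpy.any([isinstance(node.op, T.Elemwise) for node in f.maker.fgraph.toposort()]):
    print("Used the CPU")
else:
    print("Used the GPU")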

Use Vulkan VkImage as a CUDA cuArray

房东的猫 submitted on 2019-12-08 01:34:44
Question: What is the correct way of using a Vulkan VkImage as a CUDA cuArray? I've been trying to follow some examples, but I get a CUDA_ERROR_INVALID_VALUE on a call to cuExternalMemoryGetMappedMipmappedArray(). To provide the information in an ordered way: I'm using CUDA 10.1. The base code comes from https://github.com/SaschaWillems/Vulkan; in particular I'm using the 01 - Vulkan Gears demo, enriched with the saveScreenshot method from 09 - Capturing screenshots. Instead of saving the snapshot image to a

Distributed tensorflow replicated training example: grpc_tensorflow_server - No such file or directory

瘦欲@ submitted on 2019-12-08 00:18:38
Question: I am trying to build a distributed TensorFlow implementation by following the instructions in this blog: Distributed TensorFlow by Leo K. Tam. My aim is to perform replicated training as mentioned in that post. I have completed the steps up to installing TensorFlow, and I can successfully run the following command and get results:

sudo bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu

The next thing I want to implement is to launch the gRPC server on one of the nodes by the
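The excerpt cuts off before the actual failure, but the error in the title typically comes from trying to run a standalone grpc_tensorflow_server binary that is not shipped with a normal TensorFlow install. A hedged sketch of the usual alternative, starting the gRPC server in-process with the TF 1.x API (host names and task indices are placeholders):

import tensorflow as tf

# Describe the cluster once; the same ClusterSpec is used on every node.
cluster = tf.train.ClusterSpec({
    "ps":     ["node0:2222"],
    "worker": ["node1:2222", "node2:2222"],
})

# On node1 this would be job_name="worker", task_index=0; adjust per node.
server = tf.train.Server(cluster, job_name="worker", task_index=0)
server.join()   # block and serve the graph for replicated training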

How to force tensorflow to use all available GPUs?

旧城冷巷雨未停 submitted on 2019-12-07 18:56:55
Question: I have an 8-GPU cluster, and when I run a piece of TensorFlow code (pasted below), it only utilizes a single GPU instead of all 8. I confirmed this using nvidia-smi.

# Set some parameters
IMG_WIDTH = 256
IMG_HEIGHT = 256
IMG_CHANNELS = 3
TRAIN_IM = './train_im/'
TRAIN_MASK = './train_mask/'
TEST_PATH = './test/'
warnings.filterwarnings('ignore', category=UserWarning, module='skimage')
num_training = len(os.listdir(TRAIN_IM))
num_test = len(os.listdir(TEST_PATH))
# Get and resize train images
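The code excerpt is truncated, but the underlying issue is that TensorFlow does not spread a single model across GPUs on its own; without explicit device placement or a distribution strategy, compute lands on one GPU. A minimal sketch of data-parallel training over all visible GPUs with tf.distribute.MirroredStrategy (the toy model below is illustrative, not the poster's network):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()           # picks up every visible GPU
print("Number of devices:", strategy.num_replicas_in_sync)

with strategy.scope():                                # model and variables built here are replicated
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(256, 256, 3)),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

# model.fit(...) then shards each batch across the replicas.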

Is the warmup code necessary when measuring CUDA kernel running time?

你。 submitted on 2019-12-07 18:46:21
Question: On page 85 of Professional CUDA C Programming:

int main() {
    ......
    // run a warmup kernel to remove overhead
    size_t iStart, iElaps;
    cudaDeviceSynchronize();
    iStart = seconds();
    warmingup<<<grid, block>>>(d_C);
    cudaDeviceSynchronize();
    iElaps = seconds() - iStart;
    printf("warmup <<< %4d %4d >>> elapsed %d sec \n", grid.x, block.x, iElaps);

    // run kernel 1
    iStart = seconds();
    mathKernel1<<<grid, block>>>(d_C);
    cudaDeviceSynchronize();
    iElaps = seconds() - iStart;
    printf("mathKernel1 <<< %4d %4d >>
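The excerpt is cut off, but the point of the warmup launch is that the very first kernel launch carries one-time costs (context setup, module loading) that would otherwise be billed to mathKernel1. The book's code is CUDA C; as a runnable illustration of the same effect in Python with Numba (my own sketch, where the first launch also pays the JIT compile):

import time
import numpy as np
from numba import cuda

@cuda.jit
def math_kernel(c):
    i = cuda.grid(1)
    if i < c.shape[0]:
        c[i] = c[i] * 2.0 + 1.0

d_c = cuda.to_device(np.zeros(1 << 20, dtype=np.float32))

for label in ("cold (includes one-time overhead)", "warm"):
    cuda.synchronize()
    t0 = time.perf_counter()
    math_kernel[4096, 256](d_c)
    cuda.synchronize()                 # wait for the kernel before stopping the clock
    print(label, time.perf_counter() - t0, "s")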

How many registers per thread does an OpenCL kernel use on an Nvidia GPU?

怎甘沉沦 submitted on 2019-12-07 18:18:35
Question: My first question is how to get register-usage information for OpenCL kernel code on an Nvidia GPU, the way the nvcc compiler reports it via the --ptxas-options=-v flag for CUDA kernel code. I also got this information for an OpenCL kernel on an AMD GPU, from the .isa file generated while running the program after exporting GPU_DUMP_DEVICE_KERNEL=3. I tried the same thing on an Nvidia GPU, but it did not produce a .isa file. My second question is: why does the Nvidia GPU not generate a .isa file? After googling I
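The question is truncated, but for the first part there is a way to get ptxas-style register information without any .isa file: NVIDIA's OpenCL driver accepts the build option -cl-nv-verbose, which puts the ptxas statistics (including registers per thread) into the program build log. A sketch using pyopencl (the kernel is a placeholder; only the build option and the build-log query matter):

import pyopencl as cl

src = """
__kernel void add(__global const float *a, __global const float *b, __global float *c) {
    int i = get_global_id(0);
    c[i] = a[i] + b[i];
}
"""

ctx = cl.create_some_context()
device = ctx.devices[0]
# -cl-nv-verbose is NVIDIA-specific; on other platforms the option is ignored or rejected.
program = cl.Program(ctx, src).build(options=["-cl-nv-verbose"])
print(program.get_build_info(device, cl.program_build_info.LOG))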