gpu

tensorflow-gpu on Windows: No module named '_pywrap_tensorflow_internal'

最后都变了 posted on 2019-12-11 10:56:47
问题 (Question): I am trying to install TensorFlow with GPU support on Windows 10 according to the following guide: https://nitishmutha.github.io/tensorflow/2017/01/22/TensorFlow-with-gpu-for-windows.html However, I get the following error when I import tensorflow in Conda Python 3.5.2. How do I fix this DLL-not-found error? 'pip install tensorflow-gpu' did not give any errors. File "", line 666, in _load_unlocked File "", line 577, in module_from_spec File "", line 906, in create_module File "", line 222, in

keras multiple_gpu_model causes “Can't pickle module object” error

为君一笑 posted on 2019-12-11 10:28:16
问题 (Question): This is a follow-up to this question. I am trying to use 8 GPUs for training with Keras's multi_gpu_model. I specified a batch size of 128, which is split among the 8 GPUs, giving 16 per GPU. With this configuration, I get the following error: Train on 6120 samples, validate on 323 samples Epoch 1/100 6120/6120 [==============================] - 42s 7ms/step - loss: 0.0996 - mean_iou: 0.6919 - val_loss: 0.0969 - val_mean_iou: 0.7198 Epoch 00001: val_loss

gpu::morphologyEx is slower than morphologyEx on the CPU?

我怕爱的太早我们不能终老 posted on 2019-12-11 10:22:56
问题 (Question): I am writing C++ code to compare the performance of OpenCV's morphologyEx method using the CPU and GPU versions. Here is my code: #include <opencv2/opencv.hpp> #include <opencv2/gpu/gpu.hpp> #include <sys/time.h> #include <ctime> using namespace cv; using namespace std; double start_timer() { double start_time = (double) getTickCount(); return start_time; } double end_timer(double start_time, int num_tests) { double time = (1000 * ((double) getTickCount() - start_time) / getTickFrequency(
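The snippet is cut off above. A minimal sketch of this kind of CPU-vs-GPU comparison, assuming the OpenCV 2.x gpu module and a single warm-up call so that CUDA context creation and the host-to-device upload are not charged to the GPU timings (the file name, kernel size, and iteration count are placeholders):

```cpp
// Minimal CPU-vs-GPU timing sketch for morphologyEx with the OpenCV 2.x gpu module.
// The GPU path is warmed up once so CUDA context creation is not counted in the loop.
#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <cstdio>

using namespace cv;

int main()
{
    Mat src = imread("input.png", 0);                       // 8-bit single-channel image
    Mat kernel = getStructuringElement(MORPH_RECT, Size(5, 5));
    const int num_tests = 100;

    // CPU timing
    Mat dst_cpu;
    double t0 = (double)getTickCount();
    for (int i = 0; i < num_tests; ++i)
        morphologyEx(src, dst_cpu, MORPH_OPEN, kernel);
    double cpu_ms = 1000.0 * ((double)getTickCount() - t0) / getTickFrequency() / num_tests;

    // GPU timing: upload once, warm up once, then time only the repeated calls
    gpu::GpuMat d_src(src), d_dst;
    gpu::morphologyEx(d_src, d_dst, MORPH_OPEN, kernel);    // warm-up / context init
    t0 = (double)getTickCount();
    for (int i = 0; i < num_tests; ++i)
        gpu::morphologyEx(d_src, d_dst, MORPH_OPEN, kernel);
    double gpu_ms = 1000.0 * ((double)getTickCount() - t0) / getTickFrequency() / num_tests;

    printf("CPU: %.3f ms/iter, GPU: %.3f ms/iter\n", cpu_ms, gpu_ms);
    return 0;
}
```

If the first gpu:: call is included in the measurement, context creation and upload usually dominate, which is a common reason the GPU path looks slower than the CPU path for a single small image.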

Copying a multi-branch tree to GPU memory

守給你的承諾、 posted on 2019-12-11 10:06:43
问题 (Question): I have a tree of nodes and I am trying to copy it to GPU memory. The Node looks like this: struct Node { char *Key; int ChildCount; Node *Children; }; And my copy function looks like this: void CopyTreeToDevice(Node* node_s, Node* node_d) { //allocate node on device and copy host node cudaMalloc( (void**)&node_d, sizeof(Node)); cudaMemcpy(node_d, node_s, sizeof(Node), cudaMemcpyHostToDevice); //test printf("ChildCount of node_s looks to be : %d\n", node_s->ChildCount); printf("Key of node_s
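The copy shown above is shallow: cudaMemcpy of the struct leaves Key and Children pointing at host memory, and node_d is passed by value, so the caller never sees the device allocation. A hedged sketch of a recursive deep copy, assuming Children is an array of ChildCount nodes and Key is a NUL-terminated string (function names here are illustrative, not from the question):

```cpp
// Deep-copy sketch: copy the pointed-to data first, patch the pointers in a
// host-side staging copy, and only then copy the struct itself to the device.
#include <cuda_runtime.h>
#include <cstring>

struct Node {
    char *Key;
    int   ChildCount;
    Node *Children;
};

// Fill 'out' (a host-side Node) with device pointers for the subtree rooted at 'h'.
static void DeepCopyFields(const Node* h, Node* out)
{
    out->ChildCount = h->ChildCount;

    // Copy the key string to the device.
    size_t keyLen = strlen(h->Key) + 1;
    cudaMalloc((void**)&out->Key, keyLen);
    cudaMemcpy(out->Key, h->Key, keyLen, cudaMemcpyHostToDevice);

    out->Children = NULL;
    if (h->ChildCount > 0) {
        // Build a host array of child structs whose pointer fields already hold
        // device addresses, then copy the whole array to the device in one go.
        Node* staging = new Node[h->ChildCount];
        for (int i = 0; i < h->ChildCount; ++i)
            DeepCopyFields(&h->Children[i], &staging[i]);
        cudaMalloc((void**)&out->Children, h->ChildCount * sizeof(Node));
        cudaMemcpy(out->Children, staging, h->ChildCount * sizeof(Node),
                   cudaMemcpyHostToDevice);
        delete[] staging;
    }
}

// Returns a device pointer to a deep copy of the tree rooted at 'root'.
Node* CopyTreeToDevice(const Node* root)
{
    Node staging;
    DeepCopyFields(root, &staging);

    Node* d_root = NULL;
    cudaMalloc((void**)&d_root, sizeof(Node));
    cudaMemcpy(d_root, &staging, sizeof(Node), cudaMemcpyHostToDevice);
    return d_root;
}
```

The original signature void CopyTreeToDevice(Node*, Node*) also passes node_d by value, so the cudaMalloc result never reaches the caller; returning the device pointer (or taking a Node**) avoids that.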

nvEncodeApp builds successfully, but running it gives: NVENC error at CNVEncoder.cpp:1282 code=15 (invalid struct version was used) "nvStatus"

[亡魂溺海] posted on 2019-12-11 09:28:00
问题 (Question): I built nvEncodeApp successfully, but when I run it my output looks like this: ./nvEncoder -infile=HeavyHandIdiot.3sec.yuv -outfile=outh.264 -width=1080 -height=1080 > NVEncode configuration parameters for Encoder[0] > GPU Device ID = 0 > Input File = HeavyHandIdiot.3sec.yuv > Output File = outh.264 > Frames [000--01] = 0 frames > Multi-View Codec = No > Width,Height = [1080,1080] > Video Output Codec = 4 - H.264 Codec > Average Bitrate = 0 (bps/sec) > Peak Bitrate = 0 (bps/sec) > BufferSize = 0 >
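The status reported here is an invalid-struct-version error: every parameter struct passed to the NVENC API must have its version field stamped with the matching *_VER macro, and those macros must come from an nvEncodeAPI.h that matches the installed driver. A hedged sketch of the pattern (struct and macro names are from the NVENC headers; the surrounding session and function-list setup are assumed):

```cpp
// Hedged sketch: zero-initialize each NVENC parameter struct and stamp its
// 'version' field with the matching *_VER macro before passing it to the API.
#include "nvEncodeAPI.h"
#include <cstring>

NVENCSTATUS InitEncoderSketch(NV_ENCODE_API_FUNCTION_LIST* api, void* encoder,
                              int width, int height)
{
    NV_ENC_CONFIG encodeConfig;
    memset(&encodeConfig, 0, sizeof(encodeConfig));
    encodeConfig.version = NV_ENC_CONFIG_VER;            // version stamp on every struct

    NV_ENC_INITIALIZE_PARAMS initParams;
    memset(&initParams, 0, sizeof(initParams));
    initParams.version      = NV_ENC_INITIALIZE_PARAMS_VER;
    initParams.encodeGUID   = NV_ENC_CODEC_H264_GUID;
    initParams.encodeWidth  = width;
    initParams.encodeHeight = height;
    initParams.encodeConfig = &encodeConfig;

    return api->nvEncInitializeEncoder(encoder, &initParams);
}
```

When the unmodified sample fails this way, the usual suspect is a mismatch between the nvEncodeAPI.h it was built against and the NVENC version supported by the installed driver, so rebuilding against headers that match the driver (or updating the driver) is worth trying.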

assembling a matrix from diagonal slices with mclapply or %dopar%, like Matrix::bandSparse

╄→尐↘猪︶ㄣ posted on 2019-12-11 08:37:12
问题 (Question): Right now I'm working with some huge matrices in R and I need to be able to reassemble them using diagonal bands. For performance reasons (to avoid doing n*n operations for a matrix of size n, i.e. millions of calculations), I wanted to do just 2n calculations (thousands of calculations), and thus chose to run my function on the diagonal bands of the matrix. Now I have the results, but I need to take these matrix slices and assemble them in a way that lets me use multiple processors.

Different Image Block Sizes Using the GPU

拈花ヽ惹草 posted on 2019-12-11 07:53:28
问题 (Question): I wish to apply a motion filter for a certain number of iterations on different images; each image will be divided into different block sizes. For example, if the image size is 1024x870, how do I divide this image into different block sizes (8x8, 16x16, 64x64, etc.) using MATLAB? 回答1 (Answer 1): It's not perfect, but I would do: A=rand(128); Apatch=im2col(A,[64 64],'distinct'); Apatch=gpuArray(Apatch); Otherwise you can try (I am not sure it speeds things up): A=rand(128); A=gpuArray(A); Apatch=im2col(A,[64 64],

ERROR (theano.gpuarray): Could not initialize pygpu, support disabled

瘦欲@ posted on 2019-12-11 07:38:05
问题 (Question): I am trying to configure Theano 0.9 to use the GPU, but I get the following error. I use Windows 10 with an NVIDIA GeForce 940M and CUDA 8. Previously my system worked fine with Theano 0.8 for GPU computation; I just updated Theano. ERROR (theano.gpuarray): Could not initialize pygpu, support disabled Traceback (most recent call last): File "C:\Users\YL\Anaconda2\lib\site-packages\theano\gpuarray\__init__.py", line 175, in <module> use(config.device) File "C:\Users\YL\Anaconda2\lib\site-packages\theano

gputools: error in installation

霸气de小男生 posted on 2019-12-11 07:30:26
问题 (Question): I am setting up a new Dell Precision workstation with an NVIDIA Tesla 2050 GPU card, and I would like to install R's gputools package. My OS is openSUSE 11.3 with KDE 4.4. I downloaded NVIDIA's CUDA Toolkit 3.2 and installed it in /usr/local/cuda; I also downloaded the latest version of the CULA Tools set (version R10) and installed it in /usr/local/cula. When trying to install gputools from within R using install.packages("gputools"), I get the following error message: classification.cu(735):

How to take CPU memory (UCHAR buffer) into GPU memory (ID3D11Texture2D resource)

心已入冬 posted on 2019-12-11 07:09:22
问题 (Question): The code here runs on the GPU and captures the Windows screen, giving us an ID3D11Texture2D resource. Using ID3D11DeviceContext::Map, I copy the GPU resource into a BYTE buffer, and from that buffer into CPU memory (g_iMageBuffer, a UCHAR buffer). Now I want to go the other way: take the g_iMageBuffer buffer (CPU memory) into an ID3D11Texture2D (GPU memory). Could someone help me do this reverse step? I am new to the graphics side. //Variable Declaration IDXGIOutputDuplication* IDeskDupl;
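A hedged sketch of the reverse path: upload a CPU pixel buffer into a new ID3D11Texture2D. It assumes the buffer holds BGRA8 pixels (the format desktop duplication normally produces) with known width, height, and row pitch; the ID3D11Device comes from the existing capture code, and the function and parameter names here are illustrative, not from the question:

```cpp
// Create a GPU texture whose initial contents come from a CPU pixel buffer.
#include <windows.h>
#include <d3d11.h>

ID3D11Texture2D* CreateTextureFromBuffer(ID3D11Device* device,
                                         const UCHAR* pixels,
                                         UINT width, UINT height, UINT rowPitch)
{
    D3D11_TEXTURE2D_DESC desc = {};
    desc.Width            = width;
    desc.Height           = height;
    desc.MipLevels        = 1;
    desc.ArraySize        = 1;
    desc.Format           = DXGI_FORMAT_B8G8R8A8_UNORM;   // same layout as the captured frame
    desc.SampleDesc.Count = 1;
    desc.Usage            = D3D11_USAGE_DEFAULT;
    desc.BindFlags        = D3D11_BIND_SHADER_RESOURCE;

    // Supply the CPU buffer as the texture's initial contents.
    D3D11_SUBRESOURCE_DATA initData = {};
    initData.pSysMem     = pixels;
    initData.SysMemPitch = rowPitch;

    ID3D11Texture2D* texture = NULL;
    if (FAILED(device->CreateTexture2D(&desc, &initData, &texture)))
        return NULL;
    return texture;
}
```

If the texture has to be refreshed every frame rather than created once, ID3D11DeviceContext::UpdateSubresource on a DEFAULT-usage texture, or Map plus a row-by-row memcpy on a DYNAMIC texture created with D3D11_CPU_ACCESS_WRITE, covers the repeated CPU-to-GPU upload.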