
GPU - System memory mapping

谁说胖子不能爱 submitted on 2020-04-07 15:44:09
Question: How is system memory (RAM) mapped for GPU access? I am clear about how virtual memory works for the CPU, but I am not sure how it works when the GPU accesses GPU-mapped system (host) memory. Basically, I want to understand how data is copied back and forth between system memory and device memory. Can you provide explanations backed by reference articles, please?

Answer 1: I found the following slide set quite useful: http://developer.amd.com/afds/assets/presentations/1004_final.pdf ("Memory System on Fusion APUs")
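The linked slides cover AMD Fusion APUs, where CPU and GPU share the same physical memory. On discrete GPUs, the usual way host (system) RAM is made directly GPU-accessible is through page-locked, mapped ("zero-copy") allocations. The minimal CUDA sketch below is illustrative (not taken from the answer) and only shows that mechanism: the host buffer is pinned, mapped into the device address space, and read by the kernel over the bus without an explicit cudaMemcpy.

```cuda
// Zero-copy mapped host memory: the kernel dereferences system RAM directly.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;                    // each access goes out to host memory
}

int main()
{
    const int n = 1024;
    float *h_data = nullptr, *d_alias = nullptr;

    cudaSetDeviceFlags(cudaDeviceMapHost);         // allow mapping host allocations

    // Page-locked host memory, mapped into the GPU's address space.
    cudaHostAlloc((void **)&h_data, n * sizeof(float), cudaHostAllocMapped);
    for (int i = 0; i < n; ++i) h_data[i] = 1.0f;

    // Device-side pointer that aliases the same physical host pages.
    cudaHostGetDevicePointer((void **)&d_alias, h_data, 0);

    scale<<<(n + 255) / 256, 256>>>(d_alias, n);
    cudaDeviceSynchronize();

    printf("h_data[0] = %f\n", h_data[0]);         // expect 2.0, no cudaMemcpy involved
    cudaFreeHost(h_data);
    return 0;
}
```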

Different timing indicated by two kinds of timers

和自甴很熟 submitted on 2020-03-25 16:37:34
Question: I'm trying to use two kinds of timers to measure the run time of a GPU kernel. As the code below indicates, I have cudaEventRecord measuring the overall kernel, and inside the kernel I call clock(). However, the output shows that the two timers give different measurements:

gpu freq = 1530000 khz
Hello from block 0, thread 0
kernel runtime: 0.0002453 seconds
kernel cycle: 68194

According to these results the kernel took 68194 clock cycles, so the corresponding time should be 68194
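The asker's code is not included in this excerpt. A minimal sketch of the setup being described (hypothetical, not the original code) is below; note that cudaEventRecord brackets the whole launch, including launch and scheduling overhead, while clock()/clock64() inside the kernel counts cycles only on the SM executing that particular thread, so the two numbers need not agree once converted through the GPU clock frequency.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void kernel(long long *cycles)
{
    long long start = clock64();                   // per-SM cycle counter
    printf("Hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
    long long stop = clock64();
    if (blockIdx.x == 0 && threadIdx.x == 0)
        *cycles = stop - start;                    // cycles spent by this one thread
}

int main()
{
    long long *d_cycles = nullptr, h_cycles = 0;
    cudaMalloc((void **)&d_cycles, sizeof(long long));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    kernel<<<1, 1>>>(d_cycles);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);        // wall-clock time, incl. launch overhead
    cudaMemcpy(&h_cycles, d_cycles, sizeof(long long), cudaMemcpyDeviceToHost);

    printf("kernel runtime: %f seconds\n", ms / 1000.0f);
    printf("kernel cycle: %lld\n", h_cycles);
    cudaFree(d_cycles);
    return 0;
}
```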

CLSID_D2D1ChromaKey issues

久未见 submitted on 2020-03-24 14:14:21
Question: I am trying to use the DirectX ChromaKey effect (CLSID_D2D1ChromaKey), but my function gets stuck at one of the steps. What I do (see the sketch below):

1. Create an ID2D1Factory1
2. Create an ID3D11Device and ID3D11DeviceContext
3. Obtain an IDXGIResource from the received texture
4. Obtain a shared handle from the IDXGIResource
5. Open the shared resource as a new ID3D11Texture2D using the ID3D11Device
6. Obtain the D3D11_TEXTURE2D_DESC of that ID3D11Texture2D
7. Create another ID3D11Texture2D using that D3D11_TEXTURE2D_DESC
8. Copy the resource from the opened ID3D11Texture2D into the newly created ID3D11Texture2D
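The excerpt cuts off before the failing step is identified. A hedged sketch of steps 3-8 (error handling trimmed, names illustrative, not the asker's code) would look roughly like this; the Direct2D chroma-key effect itself would then be created from the copied texture on an ID2D1DeviceContext.

```cpp
// Steps 3-8: open a texture shared from another device and copy it locally.
// Assumes `device`, `context` and `receivedTexture` already exist, and that the
// incoming texture was created with D3D11_RESOURCE_MISC_SHARED.
#include <d3d11.h>
#include <dxgi.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

HRESULT CopySharedTexture(ID3D11Device *device, ID3D11DeviceContext *context,
                          ID3D11Texture2D *receivedTexture,
                          ComPtr<ID3D11Texture2D> &localCopy)
{
    // Step 3: query the DXGI resource interface of the incoming texture.
    ComPtr<IDXGIResource> dxgiResource;
    HRESULT hr = receivedTexture->QueryInterface(IID_PPV_ARGS(&dxgiResource));
    if (FAILED(hr)) return hr;

    // Step 4: obtain its shared handle.
    HANDLE sharedHandle = nullptr;
    hr = dxgiResource->GetSharedHandle(&sharedHandle);
    if (FAILED(hr)) return hr;

    // Step 5: open the shared resource on our own device.
    ComPtr<ID3D11Texture2D> sharedTexture;
    hr = device->OpenSharedResource(sharedHandle, IID_PPV_ARGS(&sharedTexture));
    if (FAILED(hr)) return hr;

    // Step 6: read its description.
    D3D11_TEXTURE2D_DESC desc = {};
    sharedTexture->GetDesc(&desc);

    // Step 7: create a local texture with the same description (shared flag dropped).
    desc.MiscFlags = 0;
    hr = device->CreateTexture2D(&desc, nullptr, &localCopy);
    if (FAILED(hr)) return hr;

    // Step 8: copy the shared texture into the local one.
    context->CopyResource(localCopy.Get(), sharedTexture.Get());
    return S_OK;
}
```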

Cooling the ThinkPad T60P GPU under Ubuntu

*爱你&永不变心* submitted on 2020-03-01 10:27:27
The T60P is a 15-inch, high-resolution (1600x1200) model with a discrete professional GPU, the ATI FireGL V5200. Under Ubuntu this GPU runs extremely hot: the exhaust vent gets hot enough to cook on, and using the laptop on your lap genuinely risks a burn. I wanted to downclock the GPU, and after some googling found the following approach.

First, switch to root (all of the operations below require root privileges):

sudo su

Then run:

echo mid > /sys/class/drm/card0/device/power_profile

I originally wanted to use the "low" profile to run at an even lower clock, but that corrupted the display, so in the end "mid" was what worked.

To check the result:

mount none /sys/kernel/debug/ -t debugfs
cat /sys/kernel/debug/dri/0/radeon_pm_info

which shows the following (the GPU is running at half speed):

default engine clock: 400000 kHz
current engine clock: 209250 kHz
default memory clock: 330000 kHz
current memory clock: 135000 kHz
PCIE lanes: 1

After a few minutes the temperature around the GPU dropped noticeably, and the exhaust air was much cooler as well.

To apply this automatically at boot, edit /etc/rc.local:

vi /etc/rc.local

and add the following line:

echo mid > /sys/class/drm/card0/device/power_profile

Get statistics for a list of numbers using GPU

心不动则不痛 submitted on 2020-02-04 09:25:06
Question: I have several lists of numbers in a file. For example:

.333, .324, .123, .543, .00054
.2243, .333, .53343, .4434

Now I want to get the number of times each number occurs, using the GPU. I believe this will be faster on the GPU than on the CPU because each thread can process one list. What data structure should I use on the GPU to easily get these counts? For example, for the lists above the answer would look as follows:

.333 = 2 times in entire file
.324 = 1 time

etc. I am looking for a
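The question is cut off at this point. One common GPU approach to per-value counts (a sketch under my own assumptions, not from the original thread: all numbers are first flattened into a single array) is to sort the values and then reduce runs of equal neighbours with Thrust:

```cuda
// Count occurrences of each distinct value: sort, then reduce_by_key.
#include <cstdio>
#include <vector>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/iterator/constant_iterator.h>

int main()
{
    // All numbers from the file, flattened into one array (file parsing omitted).
    std::vector<float> host = {0.333f, 0.324f, 0.123f, 0.543f, 0.00054f,
                               0.2243f, 0.333f, 0.53343f, 0.4434f};
    thrust::device_vector<float> values(host.begin(), host.end());

    // Sorting brings equal values next to each other.
    thrust::sort(values.begin(), values.end());

    // reduce_by_key sums a 1 for every element in each run of equal values.
    thrust::device_vector<float> unique_values(values.size());
    thrust::device_vector<int>   counts(values.size());
    auto ends = thrust::reduce_by_key(values.begin(), values.end(),
                                      thrust::constant_iterator<int>(1),
                                      unique_values.begin(), counts.begin());

    size_t n = ends.first - unique_values.begin();
    for (size_t i = 0; i < n; ++i)
        printf("%g = %d times\n", (float)unique_values[i], (int)counts[i]);
    return 0;
}
```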

How can warps in the same block diverge

自古美人都是妖i submitted on 2020-02-02 12:55:47
Question: I am a bit confused about how it is possible that warps diverge and need to be synchronized via the __syncthreads() function. All threads in a block execute the same code in SIMT fashion. How can they not be in sync? Is it related to the scheduler? Do different warps get different amounts of computing time? And why is there an overhead when using __syncthreads()? Let's say we have 12 different warps in a block and 3 of them have finished their work. They are now idling, and the other warps get
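The excerpt ends mid-sentence. A typical case where warps of the same block must be explicitly synchronized is when one warp writes shared memory that another warp later reads: warps are scheduled independently, so without __syncthreads() nothing guarantees the write happens before the read. A minimal illustrative sketch (not from the thread):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// 64 threads = 2 warps. Warp 0 fills shared memory, warp 1 reads it.
__global__ void producerConsumer(int *out)
{
    __shared__ int buffer[32];

    if (threadIdx.x < 32)                          // warp 0: produce
        buffer[threadIdx.x] = threadIdx.x * threadIdx.x;

    // Without this barrier warp 1 may read buffer[] before warp 0 has written it,
    // because the two warps execute independently and can be scheduled in any order.
    __syncthreads();

    if (threadIdx.x >= 32)                         // warp 1: consume
        out[threadIdx.x - 32] = buffer[threadIdx.x - 32];
}

int main()
{
    int *d_out = nullptr, h_out[32];
    cudaMalloc((void **)&d_out, sizeof(h_out));
    producerConsumer<<<1, 64>>>(d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    printf("out[5] = %d\n", h_out[5]);             // expect 25
    cudaFree(d_out);
    return 0;
}
```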

Get the CUDA and cuDNN version on Windows with Anaconda installed

本秂侑毒 submitted on 2020-02-01 05:46:52
Question: There is a tensorflow-gpu version installed on Windows using Anaconda; how can I check which CUDA and cuDNN versions it uses? Thanks.

Answer 1: You could also run conda list from the Anaconda command line:

conda list cudnn
# packages in environment at C:\Anaconda2:
#
# Name          Version     Build   Channel
cudnn           6.0         0

Answer 2: Although not a publicly documented API, you can currently access it like this:

from tensorflow.python.platform import build_info as tf_build_info
print(tf_build_info.cuda_version_number)
# 9.0 in v1

OpenCV on iOS - GPU usage?

…衆ロ難τιáo~ submitted on 2020-01-31 03:47:05
Question: I am trying to develop an iOS app that performs real-time effects on video from the camera, much like Photo Booth on the iPad. I am familiar with the OpenCV API, but I am worried about performance on iOS if most of the processing happens on the CPU rather than the GPU. Libraries like GPUImage would most likely do the trick, but I would rather stay with something I am familiar with. So, does anyone know whether OpenCV compiled for iOS uses the GPU?

Answer 1: OpenCV uses CUDA for its GPU support, which is only