gpu

Formula to determine if an infinite line and a line segment intersect?

魔方 西西 submitted on 2019-12-25 09:32:07
Question: Given a point on a line and that line's slope, how would one determine whether the line, extending infinitely in each direction, intersects a line segment (x1,y1), (x2,y2) and, if so, the point at which the intersection occurs? I found this, but I'm unsure if it's helpful here. If someone wants to help me understand "rays", that's alright with me. http://www.realtimerendering.com/intersections.html I'm sorry that I'm an idiot. Answer 1: An arbitrary point on the first line has the parametric equation dx
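A minimal sketch of the parametric approach the answer begins to describe, assuming the infinite line is given by a point and a slope (a slope form cannot represent vertical lines) and the segment by its two endpoints; the struct and function names are illustrative, not taken from the original answer.

#include <cmath>
#include <cstdio>
#include <optional>

struct Point { double x, y; };

// Intersect the infinite line through `p` with slope `m` against the segment
// from `a` to `b`.  Returns the intersection point, or std::nullopt when the
// segment is parallel to the line or the crossing falls outside the segment.
std::optional<Point> lineSegmentIntersect(Point p, double m, Point a, Point b) {
    const double dx = b.x - a.x;
    const double dy = b.y - a.y;

    // Segment points are a + t*(b - a).  Requiring such a point to lie on the
    // line means its offset from p has zero 2D cross product with the line
    // direction (1, m); solving that equation for t gives the expressions below.
    const double denom = m * dx - dy;
    if (std::fabs(denom) < 1e-12) return std::nullopt;   // parallel (or collinear)

    const double t = ((a.y - p.y) - m * (a.x - p.x)) / denom;
    if (t < 0.0 || t > 1.0) return std::nullopt;         // crossing lies outside the segment

    return Point{a.x + t * dx, a.y + t * dy};
}

int main() {
    // Line y = x through the origin vs. the segment (0,2)-(2,0): meets at (1,1).
    if (auto hit = lineSegmentIntersect({0, 0}, 1.0, {0, 2}, {2, 0}))
        std::printf("intersection at (%f, %f)\n", hit->x, hit->y);
}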

Changing the GPU clock rate on a linux like system (Nvidia Jetson TX1)

爱⌒轻易说出口 submitted on 2019-12-25 08:17:06
Question: I have an NVIDIA Jetson TX1 board and want to change the GPU rate by writing to the following files: sudo echo 691200000 > /sys/kernel/debug/clock/override.gbus/rate sudo echo 1 > /sys/kernel/debug/clock/override.gbus/state However, I am greeted with a Permission denied error. I know the commands from the following script, https://github.com/dusty-nv/jetson-scripts/blob/master/jetson_max_l4t.sh, which was proposed by an NVIDIA employee. However, I do not want to max out the GPU clock frequency, I want to
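The Permission denied typically comes from the shell rather than from echo: in sudo echo 691200000 > /sys/..., only echo runs as root, while the output redirection is performed by the calling, unprivileged shell. The usual shell-level fix is to run the write itself as root, for example echo 691200000 | sudo tee /sys/kernel/debug/clock/override.gbus/rate. As an alternative, a minimal sketch of doing the same writes from a small program that is itself run via sudo; the paths and the 691200000 value are taken from the question, everything else is illustrative.

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

// Write `value` to a sysfs/debugfs node.  Returns false if the node could not
// be opened or written (for example when not running as root).
static bool writeNode(const std::string& path, const std::string& value) {
    std::ofstream node(path);
    if (!node) return false;
    node << value << std::endl;
    return static_cast<bool>(node);
}

int main() {
    // The whole program must run as root (e.g. via sudo) to have write
    // permission on these debugfs nodes.
    const bool ok =
        writeNode("/sys/kernel/debug/clock/override.gbus/rate", "691200000") &&
        writeNode("/sys/kernel/debug/clock/override.gbus/state", "1");

    std::cerr << (ok ? "GPU clock override applied\n"
                     : "failed to write clock override (run as root?)\n");
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}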

Cuda program does not give the correct output when using a CUDA compatible GPU

浪尽此生 submitted on 2019-12-25 03:47:12
Question: I found the following program at http://llpanorama.wordpress.com/2008/05/21/my-first-cuda-program/ Unfortunately I can't copy-paste it here because the code becomes messy. It takes a vector of numbers as input and outputs the vector multiplied by itself element-wise. I ran it on the emulator installed on my computer and it gives the following output: 0 0.000000 1 1.000000 2 4.000000 3 9.000000 4 16.000000 5 25.000000 6 36.000000 7 49.000000 8 64.000000 9 81.000000 however if I
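Since the blog's code is not reproduced here, below is a self-contained sketch in the same spirit (square each element of a vector on the GPU) with explicit error checking added; when a program is correct in emulation but prints unmodified or zeroed values on real hardware, a kernel launch or memory copy has usually failed silently, so checking every CUDA call is the first debugging step. Kernel and variable names are illustrative, not necessarily the blog's.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort on any CUDA runtime error instead of silently printing stale host data.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err = (call);                                          \
        if (err != cudaSuccess) {                                          \
            std::fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                         cudaGetErrorString(err), __FILE__, __LINE__);     \
            std::exit(1);                                                  \
        }                                                                  \
    } while (0)

// Square each element of the array in place.
__global__ void square_array(float* a, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) a[idx] = a[idx] * a[idx];
}

int main() {
    const int N = 10;
    float h_a[N];
    for (int i = 0; i < N; ++i) h_a[i] = static_cast<float>(i);

    float* d_a = nullptr;
    CUDA_CHECK(cudaMalloc(&d_a, N * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_a, h_a, N * sizeof(float), cudaMemcpyHostToDevice));

    square_array<<<(N + 255) / 256, 256>>>(d_a, N);
    CUDA_CHECK(cudaGetLastError());        // catches launch/configuration failures
    CUDA_CHECK(cudaDeviceSynchronize());   // catches errors during execution

    CUDA_CHECK(cudaMemcpy(h_a, d_a, N * sizeof(float), cudaMemcpyDeviceToHost));
    for (int i = 0; i < N; ++i) std::printf("%d %f\n", i, h_a[i]);
    CUDA_CHECK(cudaFree(d_a));
}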

Dot Product in CUDA using atomic operations - getting wrong results

人走茶凉 submitted on 2019-12-25 02:56:14
Question: I am trying to implement the dot product in CUDA and compare the result with what MATLAB returns. My CUDA code (based on this tutorial) is the following: #include <stdio.h> #define N (2048 * 8) #define THREADS_PER_BLOCK 512 #define num_t float // The kernel - DOT PRODUCT __global__ void dot(num_t *a, num_t *b, num_t *c) { __shared__ num_t temp[THREADS_PER_BLOCK]; int index = threadIdx.x + blockIdx.x * blockDim.x; temp[threadIdx.x] = a[index] * b[index]; __syncthreads(); //Synchronize! *c = 0
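One common way to make such a kernel race-free is a per-block shared-memory reduction followed by a single atomicAdd per block, with the accumulator zeroed from the host rather than inside the kernel (writing *c = 0 in the kernel races with blocks that have already added their partial sums). A hedged sketch along those lines, keeping the question's sizes; the host-side test values are illustrative.

#include <cstdio>
#include <cuda_runtime.h>

#define N (2048 * 8)
#define THREADS_PER_BLOCK 512
typedef float num_t;

// Per-block shared-memory reduction, then one atomicAdd per block.
__global__ void dot(const num_t* a, const num_t* b, num_t* c) {
    __shared__ num_t temp[THREADS_PER_BLOCK];
    int index = threadIdx.x + blockIdx.x * blockDim.x;
    temp[threadIdx.x] = a[index] * b[index];
    __syncthreads();

    // Tree reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            temp[threadIdx.x] += temp[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        atomicAdd(c, temp[0]);   // float atomicAdd needs compute capability >= 2.0
}

int main() {
    size_t bytes = N * sizeof(num_t);
    num_t* a = (num_t*)malloc(bytes);
    num_t* b = (num_t*)malloc(bytes);
    num_t result = 0;
    for (int i = 0; i < N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }   // expected dot product: 2*N

    num_t *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, sizeof(num_t));
    cudaMemcpy(da, a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, b, bytes, cudaMemcpyHostToDevice);
    cudaMemset(dc, 0, sizeof(num_t));                           // zero the accumulator once, on the device

    dot<<<N / THREADS_PER_BLOCK, THREADS_PER_BLOCK>>>(da, db, dc);
    cudaMemcpy(&result, dc, sizeof(num_t), cudaMemcpyDeviceToHost);
    printf("dot = %f (expected %f)\n", result, 2.0f * N);

    free(a); free(b); cudaFree(da); cudaFree(db); cudaFree(dc);
}

Even a correct single-precision GPU sum will differ slightly from MATLAB's double-precision dot product because of rounding and accumulation order.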

Parallel execution is not performed OpenCL MQL5

偶尔善良 submitted on 2019-12-25 01:06:28
Question: I have created an OpenCL kernel in MQL5. Here is the code: const string cl_src = //" int weightsum; \r\n" " #pragma OPENCL EXTENSION cl_khr_fp64 : enable \r\n" "__kernel void CalculateSimpleMA( \r\n" "int rates_total, \r\n" "int prev_calculated, \r\n" "int begin, \r\n" "int InpMAPeriod, \r\n" "__global double *price, \r\n" "__global double *ExtLineBuffer \r\n" ") \r\n" "{ \r\n" "int i,limit; \r\n" "int len_price = get_global_id(4); \r\n" //"int len_Ext = get_global_id(5); \r\n" " if

Writing a script to output Frame rate/dropped frame statistics on Android

 ̄綄美尐妖づ submitted on 2019-12-24 22:00:56
Question: I would like to analyze the maximum frame rate when I run a few applications such as games and videos. I'm pretty new to Android, so I'm not sure where I should start. Answer 1: Maybe this tool will help you: http://developer.android.com/tools/help/gltracer.html You need a device with API level 16 or higher. Source: https://stackoverflow.com/questions/22998238/writing-a-script-to-output-frame-rate-dropped-frame-statistics-on-android

Halide::Buffer on GPU

Deadly submitted on 2019-12-24 21:48:37
Question: I already have an application that takes input images, copies them to the GPU, and then applies some CUDA filters to that image. So, when I want to implement a new filter, I only write the filter itself (i.e., the kernel), since the CPU-GPU copying logic is already there. Now I want to try out Halide for writing image filters for CUDA, and I have encountered a problem: Halide::Buffer, which represents the input image, is allocated on the CPU, so I would have to change my existing copying logic. Is there any

tensorflow on GPU doesn't work

情到浓时终转凉″ submitted on 2019-12-24 19:00:30
Question: I have the following code: import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data/', one_hot = True) def init_weights(shape): init_random_dist = tf.truncated_normal(shape, stddev = 0.1) return tf.Variable(init_random_dist) def init_bias(shape): init_bias_vals = tf.constant(0.1, shape = shape) return tf.Variable(init_bias_vals) def conv2d(x, W): return tf.nn.conv2d(x, W, strides = [1, 1, 1, 1], padding = 'SAME') def max
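Independently of the TensorFlow script above, a quick sanity check when TensorFlow will not use the GPU is whether the CUDA runtime can see a device at all. A minimal sketch of such a check (not part of the question):

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("CUDA runtime error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("%d CUDA device(s) found\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("  device %d: %s, compute capability %d.%d\n",
                    i, prop.name, prop.major, prop.minor);
    }
    return 0;
}

If this reports no devices or an error, the problem lies at the driver or CUDA installation level rather than in the TensorFlow script.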

CUDA: Passing parameters to host compiler during Nsight session

被刻印的时光 ゝ submitted on 2019-12-24 15:34:28
Question: I have a CUDA (v4.2) program running under Visual Studio 2010, to which I pass various command-line parameters. I want the host compiler to see the same parameters when I run through Nsight (v2.2). I assume I have to do this via (right-click project) -> nsight user settings -> command line arguments, but I haven't yet managed to find a syntax that doesn't crash nvcc. I'm assuming it's arranged around "--run-args " somehow? ** Clarification, after the comment below: Sure, when you debug straight

OpenACC parallel kernels not getting generated

亡梦爱人 submitted on 2019-12-24 15:09:20
Question: I am developing code with PGC++ and want to accelerate it on the GPU. I am using OpenBabel, which has an Eigen dependency. I have tried using #pragma acc kernel and I have tried using #pragma acc routine. My compilation command is: "pgc++ -acc -ta=tesla -Minfo=all -I/home/pranav/new_installed/include/openbabel-2.0/ -I/home/pranav/new_installed/include/eigen3/ -L/home/pranav/new_installed/lib/openbabel/ main.cpp /home/pranav/new_installed/lib/libopenbabel.so" I am getting the following error: PGCC-S-0155
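For reference, a minimal standalone sketch of an OpenACC region that pgc++ -acc -ta=tesla -Minfo=accel will report as generated. Note that the directive is spelled kernels or parallel loop; the singular #pragma acc kernel quoted above is not a recognized OpenACC directive, and any function called inside such a region, including OpenBabel or Eigen code, typically needs #pragma acc routine information to be compiled for the device. The code below is illustrative and is not the question's own code.

#include <cstdio>
#include <vector>

int main() {
    const int n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
    float* pa = a.data();
    float* pb = b.data();
    float* pc = c.data();

    // Simple vector add offloaded to the GPU; -Minfo=accel should report
    // this loop as parallelized and a kernel as generated.
    #pragma acc parallel loop copyin(pa[0:n], pb[0:n]) copyout(pc[0:n])
    for (int i = 0; i < n; ++i)
        pc[i] = pa[i] + pb[i];

    std::printf("c[0] = %f\n", pc[0]);   // expected 3.0
    return 0;
}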