NVIDIA

How to do stereoscopic 3D with OpenGL on GTX 560 and later?

不羁的心 posted on 2019-12-05 19:22:30
I am using the open-source haptics and 3D graphics library Chai3D, running on Windows 7. I have rewritten the library to do stereoscopic 3D with NVIDIA 3D Vision. I am using OpenGL with GLUT, and I call glutInitDisplayMode(GLUT_RGB | GLUT_DEPTH | GLUT_DOUBLE | GLUT_STEREO) to initialize the display mode. It works great on Quadro cards, but on GTX 560M and GTX 580 cards it says the pixel format is unsupported. I know the monitors are capable of displaying the 3D, and I know the cards are capable of rendering it. I have tried adjusting the resolution of the screen and everything else I can think of,
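For reference, here is a minimal, self-contained sketch of the quad-buffered setup the question describes, assuming a plain GLUT window; the display callback, eye offset, and drawScene() helper are illustrative placeholders, not taken from Chai3D. glutGet(GLUT_DISPLAY_MODE_POSSIBLE) is queried first, since a driver that does not expose a stereo pixel format (as the GeForce cards here reportedly do not) would otherwise fail at window creation.

#include <GL/glut.h>

static void drawScene(float eyeOffset) {
    glLoadIdentity();
    glTranslatef(-eyeOffset, 0.0f, 0.0f);          /* simple per-eye camera shift */
    glutWireTeapot(0.5);                           /* placeholder geometry        */
}

static void display(void) {
    glDrawBuffer(GL_BACK_LEFT);                    /* render the left-eye image   */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    drawScene(-0.03f);

    glDrawBuffer(GL_BACK_RIGHT);                   /* render the right-eye image  */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    drawScene(+0.03f);

    glutSwapBuffers();                             /* swaps both back buffers     */
}

int main(int argc, char **argv) {
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_RGB | GLUT_DEPTH | GLUT_DOUBLE | GLUT_STEREO);
    /* Ask GLUT whether the requested (stereo) display mode is actually available
       before creating the window; an unsupported pixel format shows up here.   */
    if (!glutGet(GLUT_DISPLAY_MODE_POSSIBLE))
        return 1;
    glutCreateWindow("stereo test");
    glutDisplayFunc(display);
    glutMainLoop();
    return 0;
}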

NVIDIA Optimus card not switching under OpenGL

爱⌒轻易说出口 posted on 2019-12-05 18:45:34
When I use glGetString(GL_VERSION) and glGetString(GL_SHADING_LANGUAGE_VERSION) to check the OpenGL version on my computer, I get the following: GL_VERSION: 3.1.0 - Build 8.15.10.2538 and GL_SHADING_LANGUAGE_VERSION: 1.40 - Intel Build 8.15.10.2538. When I run "Geeks3D GPU Caps Viewer", it shows the OpenGL version of my graphics card (NVS 4200M) as GL_VERSION: 4.3.0 and GLSL version: 4.30 NVIDIA via Cg compiler. Does that mean my graphics card only supports some OpenGL 4.3.0 functions, and that I cannot create a 4.3 context? Your graphics card is an NVIDIA Optimus card. This means
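On Optimus machines like this, one widely used hint, documented by NVIDIA for Optimus systems, is to export the NvOptimusEnablement symbol from the executable so that the driver prefers the discrete GPU; whether it resolves this particular setup is an assumption. A minimal Windows/MSVC sketch:

// Exporting NvOptimusEnablement asks the Optimus driver to run this process on
// the discrete NVIDIA GPU instead of the integrated Intel GPU (Windows only).
#include <windows.h>

extern "C" {
    __declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
}

int main()
{
    // Create the OpenGL context as usual here; with the export above the driver
    // should report the NVIDIA 4.x GL_VERSION rather than the Intel 3.1 one.
    return 0;
}

Alternatively, the NVIDIA Control Panel lets you set the preferred graphics processor per application, which achieves the same effect without code changes.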

Rationalizing what is going on in my simple OpenCL kernel in regards to global memory

泄露秘密 posted on 2019-12-05 14:52:19
const char programSource[] =
    "__kernel void vecAdd(__global int *a, __global int *b, __global int *c)"
    "{"
    "    int gid = get_global_id(0);"
    "    for(int i=0; i<10; i++){"
    "        a[gid] = b[gid] + c[gid];"
    "    }"
    "}";

The kernel above is a vector addition done ten times per loop. I have used the programming guide and Stack Overflow to figure out how global memory works, but I still can't figure out, by looking at my code, whether I am accessing global memory in a good way. I am accessing it in a contiguous fashion and, I am guessing, in an aligned way. Does the card load 128kb chunks of global memory for arrays a, b,
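Reading the kernel, each work-item's global ID maps directly to an array index, which is what contiguous, aligned access means in practice: adjacent work-items read adjacent ints, so their loads fall into one contiguous block of global memory and can be combined into a few wide transactions. As a point of reference, here is a complete host-side launch of that kernel (OpenCL 1.x C API; error handling and cleanup are omitted for brevity, and the programSource string from the question is assumed to be in scope):

#include <CL/cl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    enum { N = 1024 * 1024 };
    int *a = malloc(N * sizeof(int));
    int *b = malloc(N * sizeof(int));
    int *c = malloc(N * sizeof(int));
    for (int i = 0; i < N; i++) { b[i] = i; c[i] = 2 * i; }

    cl_platform_id plat;  cl_device_id dev;  cl_int err;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, &err);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, &err);

    const char *src = programSource;               /* kernel string from above */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "vecAdd", &err);

    cl_mem dA = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, N * sizeof(int), NULL, &err);
    cl_mem dB = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, N * sizeof(int), b, &err);
    cl_mem dC = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, N * sizeof(int), c, &err);
    clSetKernelArg(k, 0, sizeof(cl_mem), &dA);
    clSetKernelArg(k, 1, sizeof(cl_mem), &dB);
    clSetKernelArg(k, 2, sizeof(cl_mem), &dC);

    /* One work-item per element: work-item gid writes a[gid] from b[gid]+c[gid],
       so each warp/wavefront touches one contiguous, aligned run of ints.      */
    size_t globalSize = N;
    clEnqueueNDRangeKernel(q, k, 1, NULL, &globalSize, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, dA, CL_TRUE, 0, N * sizeof(int), a, 0, NULL, NULL);
    printf("a[0]=%d a[N-1]=%d\n", a[0], a[N - 1]);  /* expected: 0 and 3*(N-1)  */
    return 0;
}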

wglCreateContextAttribsARB fails on NVIDIA Hardware

让人想犯罪 __ posted on 2019-12-05 13:58:37
ContextWin32::ContextWin32(WindowHandle parent, NLOpenGLSettings settings)
    : IPlatformContext(parent, settings)
{
    int pf = 0;
    PIXELFORMATDESCRIPTOR pfd = {0};
    OSVERSIONINFO osvi = {0};
    osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFO);

    // Obtain HDC for this window.
    if (!(m_hdc = GetDC((HWND)parent)))
    {
        NLError("[ContextWin32] GetDC() failed.");
        throw NLException("GetDC() failed.", true);
    }

    // Create and set a pixel format for the window.
    pfd.nSize = sizeof(pfd);
    pfd.nVersion = 1;
    pfd.dwFlags = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
    pfd.iPixelType = PFD_TYPE_RGBA;
    pfd
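For comparison, the usual flow for wglCreateContextAttribsARB is to create and activate a throw-away legacy context first, because the extension's function pointer can only be resolved while some context is current; skipping that step, or passing unsupported attributes, is a frequent cause of failures that only show up on certain drivers. The sketch below assumes a pixel format has already been set on the HDC, and the 3.3 core profile request is only an example:

// Sketch of the common wglCreateContextAttribsARB flow (not the asker's class).
#include <windows.h>
#include <GL/gl.h>
#include <GL/wglext.h>   // PFNWGLCREATECONTEXTATTRIBSARBPROC and WGL_CONTEXT_* tokens

HGLRC createModernContext(HDC hdc)
{
    // 1. Create and activate a legacy context so WGL extensions can be queried.
    HGLRC legacy = wglCreateContext(hdc);
    wglMakeCurrent(hdc, legacy);

    // 2. Resolve the entry point; it may be NULL on drivers without the extension.
    PFNWGLCREATECONTEXTATTRIBSARBPROC wglCreateContextAttribsARB =
        (PFNWGLCREATECONTEXTATTRIBSARBPROC)wglGetProcAddress("wglCreateContextAttribsARB");
    if (!wglCreateContextAttribsARB)
        return legacy;                       // fall back to the legacy context

    // 3. Ask for the desired version/profile.
    const int attribs[] = {
        WGL_CONTEXT_MAJOR_VERSION_ARB, 3,
        WGL_CONTEXT_MINOR_VERSION_ARB, 3,
        WGL_CONTEXT_PROFILE_MASK_ARB,  WGL_CONTEXT_CORE_PROFILE_BIT_ARB,
        0
    };
    HGLRC modern = wglCreateContextAttribsARB(hdc, 0, attribs);

    // 4. Switch to the new context and delete the helper context.
    wglMakeCurrent(hdc, modern ? modern : legacy);
    if (modern)
        wglDeleteContext(legacy);
    return modern ? modern : legacy;
}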

What instruction set does the Nvidia GeForce 6xx Series use?

时间秒杀一切 posted on 2019-12-05 11:19:23
Do the GeForce 6xx series GPUs use RISC, CISC, or VLIW-style instructions? In one source, at http://www.motherboardpoint.com/risc-cisc-t241234.html , someone said "GPUs are probably closer to VLIW than to RISC or CISC". In another source, at http://en.wikipedia.org/wiki/Very_long_instruction_word#implementations , it says "both Nvidia and AMD have since moved to RISC architectures in order to improve performance on non-graphics workload". AFAIK, Nvidia does not publicly document its hardware instruction sets. The best you can see officially is PTX ISA, which is the instruction set of a virtual

Dearth of CUDA 5 Dynamic Parallelism Examples

北慕城南 posted on 2019-12-05 10:54:09
I've been googling around and have only been able to find a trivial example of the new dynamic parallelism in Compute Capability 3.0 in one of their Tech Briefs linked from here. I'm aware that the HPC-specific cards probably won't be available until this time next year (after the national labs get theirs). And yes, I realize that the simple example they gave is enough to get you going, but the more the merrier. Are there other examples I've missed? To save you the trouble, here is the entire example given in the tech brief:

__global__ ChildKernel(void* data){
    //Operate on data
}

__global__
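Since the excerpt above is cut short, here is a self-contained reconstruction of the parent/child launch pattern that tech brief illustrates; the kernel bodies and launch configuration are my own placeholders, not the brief's listing. Note that dynamic parallelism requires a Compute Capability 3.5 device and relocatable device code, e.g. nvcc -arch=sm_35 -rdc=true example.cu -lcudadevrt.

// Sketch of device-side (nested) kernel launches, CUDA 5-era API.
#include <cstdio>

__global__ void ChildKernel(int *data)
{
    data[threadIdx.x] += 1;                    // operate on data
}

__global__ void ParentKernel(int *data)
{
    // Each parent thread can launch further grids directly from the device.
    ChildKernel<<<1, 32>>>(data);
    cudaDeviceSynchronize();                   // device-side wait for the child grid
}

int main()
{
    int *d_data;
    cudaMalloc(&d_data, 32 * sizeof(int));
    cudaMemset(d_data, 0, 32 * sizeof(int));

    ParentKernel<<<1, 1>>>(d_data);
    cudaDeviceSynchronize();

    int h_data[32];
    cudaMemcpy(h_data, d_data, sizeof(h_data), cudaMemcpyDeviceToHost);
    printf("h_data[0] = %d\n", h_data[0]);     // expected: 1
    cudaFree(d_data);
    return 0;
}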

nVidia SLI Tricks [closed]

為{幸葍}努か posted on 2019-12-05 10:31:46
Closed. This question is opinion-based and is not currently accepting answers. I'm optimizing a DirectX graphics application to take advantage of nVidia's SLI technology. I'm currently investigating some of the techniques mentioned in their 'Best Practices' web page, but wanted to know what advice/experience any of you have had with this? Thanks! This is not really an answer to your question, more of a comment on SLI. My understanding is that SLI is

OpenCL compile on linux

徘徊边缘 posted on 2019-12-05 09:34:55
I'm a newbie in OpenCL. Since yesterday, I've been trying to use OpenCL for parallel programming instead of CUDA, which I'm more familiar with and have used before. Now I have an NVIDIA GTX 580 GPU, Ubuntu Linux 12.04, and CUDA SDK 4.1 (already installed because of the CUDA programming). Some OpenCL header files and libraries are already included in the CUDA SDK folder, so I just downloaded the OpenCL examples from NVIDIA's Developer Zone. (Here is the link: https://developer.nvidia.com/opencl ) I tried to compile some examples by myself, but I couldn't. I wrote a Makefile; using -I, I added the path of
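Before wrestling with the SDK's own Makefiles, it can help to check that the compiler and linker can see the OpenCL headers and library at all. Below is a minimal host program for that, with a typical compile line for the default CUDA toolkit paths on Linux; the paths and the hello_cl.c file name are assumptions, so adjust them to the actual install location.

/* Minimal OpenCL host program to verify that headers and libOpenCL are found.

   Compile with something like:
       gcc hello_cl.c -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lOpenCL -o hello_cl
*/
#include <CL/cl.h>
#include <stdio.h>

int main(void)
{
    cl_uint numPlatforms = 0;
    if (clGetPlatformIDs(0, NULL, &numPlatforms) != CL_SUCCESS) {
        printf("clGetPlatformIDs failed -- is the OpenCL ICD installed?\n");
        return 1;
    }
    printf("Found %u OpenCL platform(s)\n", numPlatforms);
    return 0;
}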

Misaligned address in CUDA

给你一囗甜甜゛ posted on 2019-12-05 08:10:46
Question: Can anyone tell me what's wrong with the following code inside a CUDA kernel:

__constant__ unsigned char MT[256] = { 0xde, 0x6f, 0x6f, 0xb1, 0xde, 0x6f, 0x6f, 0xb1, 0x91, 0xc5, 0xc5, 0x54, 0x91, 0xc5, 0xc5, 0x54, .... };

typedef unsigned int U32;

__global__ void Kernel (unsigned int *PT, unsigned int *CT, unsigned int *rk)
{
    long int i;
    __shared__ unsigned char sh_MT[256];
    for (i = 0; i < 64; i += 4)
        ((U32*)sh_MT)[threadIdx.x + i] = ((U32*)MT)[threadIdx.x + i];
    __shared__ unsigned int sh_rkey[4]
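One thing that stands out when reading the code, offered as an assumption rather than a confirmed diagnosis, is that the casts (U32*)MT and (U32*)sh_MT only work if both byte arrays happen to be 4-byte aligned; a misaligned-address fault is exactly what such a cast can produce when they are not. One way to sidestep the casts entirely is to declare the tables as 32-bit words from the start. In the sketch below, the first two words pack the bytes 0xde, 0x6f, 0x6f, 0xb1 little-endian, the remaining entries are omitted, and the indexing assumes blockDim.x == 4, as the original loop implies.

typedef unsigned int U32;

/* Same table, but stored as words so no pointer casting is needed. */
__constant__ U32 MT32[64] = { 0xb16f6fde, 0xb16f6fde, 0x54c5c591, 0x54c5c591 /* remaining words omitted */ };

__global__ void Kernel(unsigned int *PT, unsigned int *CT, unsigned int *rk)
{
    __shared__ U32 sh_MT32[64];

    /* Each of the 4 threads copies every 4th word: all accesses are naturally aligned. */
    for (int i = 0; i < 64; i += 4)
        sh_MT32[threadIdx.x + i] = MT32[threadIdx.x + i];
    __syncthreads();

    /* ... rest of the kernel reads sh_MT32 instead of casting sh_MT ... */
}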

Dynamic Allocation of Constant memory in CUDA

情到浓时终转凉″ posted on 2019-12-05 05:54:19
I'm trying to take advantage of constant memory, but I'm having a hard time figuring out how to nest arrays. What I have is an array of data that has counts for internal data, but those counts are different for each entry. So, based on the following simplified code, I have two problems. First, I don't know how to allocate the data pointed to by the members of my data structure. Second, since I can't use cudaGetSymbolAddress for constant memory, I'm not sure if I can just pass the global pointer (which you cannot do with plain __device__ memory).

struct __align__(16) data {
    int nFiles;
    int nNames;
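Although the excerpt ends before the question does, a common pattern for variable-count, nested data in __constant__ memory is to flatten everything into fixed-capacity arrays plus per-entry offsets and copy the whole structure with cudaMemcpyToSymbol. The sketch below uses made-up names and capacities, not the asker's real layout, and must stay within the 64 KB constant-memory limit.

// Sketch: flatten per-entry data into one fixed-capacity pool plus offsets.
#include <cstdio>

#define MAX_ENTRIES 32
#define MAX_POOL    4096

struct ConstData {
    int nEntries;
    int offset[MAX_ENTRIES];   // where each entry's values start in pool[]
    int count[MAX_ENTRIES];    // how many values each entry owns
    int pool[MAX_POOL];        // all values, packed back to back
};

__constant__ ConstData c_data;

__global__ void sumPerEntry(int *out)
{
    int e = blockIdx.x;
    if (e >= c_data.nEntries) return;
    int s = 0;
    for (int i = 0; i < c_data.count[e]; ++i)
        s += c_data.pool[c_data.offset[e] + i];
    if (threadIdx.x == 0) out[e] = s;
}

int main()
{
    ConstData h = {};
    h.nEntries = 2;
    h.offset[0] = 0; h.count[0] = 3; h.pool[0] = 1;  h.pool[1] = 2;  h.pool[2] = 3;
    h.offset[1] = 3; h.count[1] = 2; h.pool[3] = 10; h.pool[4] = 20;

    cudaMemcpyToSymbol(c_data, &h, sizeof(h));   // host -> constant memory

    int *d_out;  int h_out[2];
    cudaMalloc(&d_out, 2 * sizeof(int));
    sumPerEntry<<<2, 32>>>(d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    printf("%d %d\n", h_out[0], h_out[1]);       // expected: 6 30
    cudaFree(d_out);
    return 0;
}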