cpu | 易学教程

List of OpenCL compliant CPU/GPU

Question: How can I know which CPUs can be programmed with OpenCL? For example, the Pentium E5200. Is there a way to know without running code and querying it? Answer 1: OpenCL compatibility can generally be determined by checking the vendors' sites. AMD's APP SDK requires CPUs to support at least SSE2. They also have a list of currently supported ATI/AMD video cards. The most official source is probably the Khronos conformance list: http://www.khronos.org/conformance/adopters/conformant-products#opencl For

What is the difference between CUDA core and CPU core?

Question: I have worked a bit with CUDA and a lot with the CPU, and I'm trying to understand the difference between the two. My i5 processor has 4 cores and cost $200, and my NVIDIA 660 has 960 cores and cost about the same. I would be really happy if someone could explain the key differences between the two processing units' architectures in terms of abilities, pros, and cons. For example, does a CUDA core have branch prediction? Answer 1: It is a computer architecture question which entails a long

Unable to start Genymotion virtual device, incompatible CPU

Question: The first time I ran a Genymotion virtual device, it worked. But when I tried running it today, I got this error message. What may have changed since the last time I used it? I would be grateful for any solutions. Thanks. My device info: Dell XPS L502X, Sandy Bridge motherboard, Intel Core i5-2410M @ 2.30 GHz, Windows 7 Professional 64-bit. Answer 1: You need to turn virtualization on. Reboot the notebook. Immediately press F10 to enter the BIOS settings (or F2, depending on your PC). Check the

CPU instructions not compiled with TensorFlow

Question: MacBook Air, OS X El Capitan. When I run TensorFlow code in the terminal ( python 3 tfpractice.py ), I get a longer-than-normal wait before the output appears, followed by these error messages: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2

How to determine CPU and memory consumption from inside a process?

Question: I once had the task of determining the following performance parameters from inside a running application: total virtual memory available; virtual memory currently used; virtual memory currently used by my process; total RAM available; RAM currently used; RAM currently used by my process; % CPU currently used; % CPU currently used by my process. The code had to run on Windows and Linux. Even though this seems to be a standard task, finding the necessary information in the manuals (WIN32 API, GNU docs
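As a rough illustration of one item on that list, "% CPU currently used by my process", the sketch below compares the CPU time the process consumed with the wall-clock time that elapsed. It is not the approach from the original answer, which is cut off above; it assumes Python's standard library only, and the helper name `process_cpu_percent` is hypothetical.

```python
import time


def process_cpu_percent(work, *args):
    """Run work(*args) and return (result, cpu_percent).

    cpu_percent is the CPU time consumed by this process (user + system,
    summed over all of its threads) divided by the elapsed wall-clock time.
    It can exceed 100 when several threads run in parallel.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()

    result = work(*args)

    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start

    # Guard against a zero-length interval on very coarse clocks.
    percent = 100.0 * cpu_used / wall_used if wall_used > 0 else 0.0
    return result, percent
```

Calling `process_cpu_percent(sum, range(1_000_000))` returns the sum together with the share of CPU time used while computing it. The memory figures on the list need platform-specific sources and are not covered by this sketch.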

Why does CPU access memory on a word boundary?

Question: I have often heard that data should be properly aligned in memory for better access efficiency, because the CPU accesses memory on word boundaries. So in the following scenario, the CPU has to make 2 memory accesses to get a single word. Supposing: 1 word = 4 bytes ("|" stands for word boundary. "o" stands for byte boundary) |----o----o----o----|----o----o----o----| (The word boundary in CPU's eye) ----o----o----o---- (What I want to read from memory) Why should this happen? What's the root cause of the CPU can
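The "2 memory accesses" in the question can be checked with a little arithmetic. The sketch below is only an illustrative model: it assumes a memory interface that always fetches whole, naturally aligned words (4 bytes, as in the question), and the function name `aligned_word_accesses` is hypothetical. Real CPUs with caches and wider buses differ in detail.

```python
def aligned_word_accesses(address, size, word_size=4):
    """Count the aligned words touched by a read of `size` bytes at `address`.

    Assumption: memory is only ever fetched in whole words that start at
    multiples of `word_size`.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    first_word = address // word_size               # word holding the first byte
    last_word = (address + size - 1) // word_size   # word holding the last byte
    return last_word - first_word + 1
```

Under this model, `aligned_word_accesses(0, 4)` is 1 because the read sits exactly on a word, while `aligned_word_accesses(2, 4)` is 2 because the same 4 bytes straddle a word boundary.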

The difference between Call Gate, Interrupt Gate, Trap Gate?

Question: I am studying Intel protected mode. I found that the call gate, interrupt gate, and trap gate are almost the same. In fact, apart from the call gate having a parameter count field and the three gates having different type fields, they are identical in all other fields. As for their function, they are all used to transfer control to a procedure within some code segment. I am wondering, since these three gates all contain the information needed for a call across privilege boundaries. Why

What is the “relationship” between addi and subi?

Question: I'm supposed to answer this question. After some research, I found that add and sub have the same opcode and differ only in the function field. Is this the answer, or something else? Update: It's available in the Nios II CPU manual: subi (subtract immediate). Operation: rB ← rA – σ(IMMED). Assembler Syntax: subi rB, rA, IMMED. Example: subi r8, r8, 4. Description: Sign-extends the immediate value IMMED to 32 bits, subtracts it from the value of rA, and then stores the result in rB. Usage: The
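One way to read the quoted operation is that subtracting a sign-extended immediate gives the same 32-bit result as adding its negation. The sketch below models that arithmetic in Python. It assumes that subi can be expressed as addi with the negated immediate, which the truncated manual text above does not confirm, and the helper names are hypothetical rather than part of any Nios II toolchain.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addi(ra, imm16):
    """Model of add-immediate: rB <- rA + sext(IMM16), wrapped to 32 bits."""
    return (ra + sign_extend16(imm16)) & MASK32


def subi(ra, imm):
    """Model of subtract-immediate, expressed as addi with -imm (assumption).

    The negated value must fit in addi's signed 16-bit field, so imm is
    limited to the range -32767..32768.
    """
    if not -32767 <= imm <= 32768:
        raise ValueError("immediate out of range for this model")
    return addi(ra, -imm)
```

With the manual's example `subi r8, r8, 4` and r8 holding 8, `subi(8, 4)` gives 4, the same value as `addi(8, -4)`.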

How does Linux perf calculate the cache-references and cache-misses events

Question: I am confused by the perf events cache-misses and L1-icache-load-misses, L1-dcache-load-misses, LLC-load-misses. When I tried to perf stat all of them, the results don't seem consistent: %$: sudo perf stat -B -e cache-references,cache-misses,cycles,instructions,branches,faults,migrations,L1-dcache-load-misses,L1-dcache-loads,L1-dcache-stores,L1-icache-load-misses,LLC-loads,LLC-load-misses,LLC-stores,LLC-store-misses,LLC-prefetches ./my_app 523,288,816 cache-references (22.89%) 205,331,370