CPU

Cache or Registers - which is faster?

Submitted by 旧巷老猫 on 2019-11-28 17:12:31
Question: I'm sorry if this is the wrong place to ask, but I've searched and always found different answers. My question is: which is faster, cache or CPU registers? As I understand it, the registers are what directly hold the data being operated on, while the cache is just a storage area close to or inside the CPU. Here are the sources I found that confuse me: 2 for cache | 1 for registers http://in.answers.yahoo.com/question/index?qid=20110503030537AAzmDGp Cache is faster. http://wiki.answers.com/Q/Is

Which is faster: x<<1 or x<<10?

Submitted by 前提是你 on 2019-11-28 17:08:06
I don't want to optimize anything, I swear; I just want to ask this question out of curiosity. I know that on most hardware there is a bit-shift assembly instruction (e.g. shl, shr), which is a single instruction. But does it matter (nanosecond-wise, or CPU-cycle-wise) how many bits you shift? In other words, is either of the following faster on any CPU? x << 1; and x << 10; And please don't hate me for this question. :) It potentially depends on the CPU. However, all modern CPUs (x86, ARM) use a "barrel shifter" -- a hardware module specifically designed to perform arbitrary shifts in constant time.

Limiting certain processes to CPU % - Linux

Submitted by  ̄綄美尐妖づ on 2019-11-28 16:51:40
I have the following problem: some processes, generated dynamically, have a tendency to eat 100% of the CPU. I would like to limit all processes matching some criterion (e.g. process name) to a certain CPU percentage. The specific problem I'm trying to solve is harnessing folding@home worker processes. The best solution I could come up with is a perl script that's executed periodically and uses the cpulimit utility to limit the processes (if you're interested in more details, check this blog post). It works, but it's a hack :/ Any ideas? I would like to leave the handling of processes to

Is integer multiplication really done at the same speed as addition on a modern CPU?

Submitted by 不打扰是莪最后的温柔 on 2019-11-28 16:42:21
Question: I hear this statement quite often: multiplication on modern hardware is so optimized that it actually runs at the same speed as addition. Is that true? I can never get any authoritative confirmation. My own research only adds questions. The speed tests usually show data that confuses me. Here is an example:
#include <stdio.h>
#include <sys/time.h>
unsigned int time1000() {
    timeval val;
    gettimeofday(&val, 0);
    val.tv_sec &= 0xffff;
    return val.tv_sec * 1000 + val.tv_usec / 1000;
}
int main()

Can a C# program measure its own CPU usage somehow?

Submitted by ≡放荡痞女 on 2019-11-28 16:05:37
I am working on a background program that will be running for a long time, and I have an external logging program (SmartInspect) that I want to feed with some values periodically, to monitor it in realtime when debugging. I know I can simply fire up multiple programs, like the Task Manager or IARSN TaskInfo, but I'd like to keep everything in my own program for this, as I also want to add some simple rules, like flagging it in the log if the program uses more than X% CPU. I have a background thread that periodically feeds some statistics to SmartInspect, like memory consumption, working set,

How is CPU usage calculated?

Submitted by 烈酒焚心 on 2019-11-28 16:03:10
On my desktop, I have a little widget that tells me my current CPU usage. It also shows the usage for each of my two cores. I always wondered: how does the CPU calculate how much of its processing power is being used? Also, if the CPU is tied up doing some intense calculations, how can it (or whatever handles this activity) examine the usage without getting tied up as well? In silico: The CPU doesn't do the usage calculations by itself. It may have hardware features to make that task easier, but it's mostly the job of the operating system. So obviously the details of implementations will vary

What's the difference between conflict miss and capacity miss

Submitted by 时光怂恿深爱的人放手 on 2019-11-28 16:00:39
A capacity miss occurs when blocks are discarded from the cache because the cache cannot hold all the blocks needed for program execution (the program's working set is much larger than the cache capacity). A conflict miss occurs with set-associative or direct-mapped block placement strategies, when several blocks map to the same set or block frame; these are also called collision misses or interference misses. Are they actually very closely related? For example, if all the cache lines are filled and we have a read request for memory B, for which we have to evict memory A. So

What is a cache hit and a cache miss? Why would context-switching cause cache miss?

Submitted by 喜你入骨 on 2019-11-28 15:54:57
Question: From the 11th chapter (Performance and Scalability), in the section named Context Switching, of the JCIP book: When a new thread is switched in, the data it needs is unlikely to be in the local processor cache, so a context switch causes a flurry of cache misses, and thus threads run a little more slowly when they are first scheduled. Can someone explain, in an easy-to-understand way, the concept of a cache miss and its probable opposite (a cache hit)? Why would context switching cause a lot of

Which Java thread is hogging the CPU?

Submitted by 你说的曾经没有我的故事 on 2019-11-28 15:11:32
Let's say your Java program is taking 100% CPU. It has 50 threads. You need to find which thread is guilty. I have not found a tool that can help. Currently I use the following very time-consuming routine: Run jstack <pid>, where pid is the process id of the Java process. The easy way to find it is to run another utility included in the JDK, jps. It is better to redirect jstack's output to a file. Search for "runnable" threads. Skip those that wait on a socket (for some reason they are still marked runnable). Repeat steps 1 and 2 a couple of times and see if you can locate a pattern.

Good resources on how to program PEBS (Precise event based sampling) counters?

Submitted by 你。 on 2019-11-28 12:53:34
I have been trying to log all memory accesses of a program, which from what I've read seems to be impossible. I have been trying to see to what extent I can log at least a major portion of the memory accesses, if not all. So I was looking to program the PEBS counters in such a way that I could see changes in the number of memory access samples collected. I wanted to know if I can do this by modifying the counter-reset value of the PEBS counters. (Usually this goes to zero, but I want to set it to a higher value.) So I was looking to program these PEBS counters on my own. Has anybody had experience