cpu | 易学教程

CPU measures (Cache misses/hits) which do not make sense

Question: I use Intel PCM for fine-grained CPU measurements. In my code, I am trying to measure cache efficiency. Basically, I first bring a small array into the L1 cache (by traversing it many times), then I start the timer, traverse the array one more time (which hopefully hits the cache), and then stop the timer. PCM shows a rather high L2 and L3 miss ratio. I also checked with rdtscp, and the cost per array operation is 15 cycles (which is much higher than 4-5 cycles for

Latency access times for L1 cache

Question: The page at http://www.7-cpu.com/cpu/IvyBridge.html gives the latency for Ivy Bridge L1 cache access as: L1 Data Cache Latency = 4 cycles for simple access via pointer; L1 Data Cache Latency = 5 cycles for access with complex address calculation (size_t n, *p; n = p[n]). By "simple", did they mean that the pointer size is the same as the word size? So if the pointer is 32-bit and it's a 32-bit OS then this would be "simple", otherwise it would cost the "complex" latency? I just

How to know the values of CR registers from linux user and kernel modes

Question: I would like to know the CR0-CR4 register values on x86. Can I write inline assembly to read them out? Are there any other methods (e.g., does the OS keep any file structures that record these values)? Answer 1: The Linux kernel has functions to read and write control registers: the read_crX and write_crX functions for the standard CRs, and xgetbv and xsetbv for the extended CR. User-mode applications need an LKM (loadable kernel module) to use these functions indirectly. In theory you just need to create an LKM with one

x86 and x64 share instruction set?

Question: I don't understand how a 32-bit application can run on a 64-bit OS. My understanding is that 32-bit/64-bit refers to register size, so the instruction sets should differ because the registers have different sizes. But I know there is the x86-64 instruction set, which is the 64-bit version of the x86 instruction set. Is x86-64 the reason we can run 32-bit applications on a 64-bit OS? If so, why are 32-bit applications sometimes incompatible with 64-bit Windows? Why do we need WOW64? (Sometimes we are asked

How to check CPU name, model, speed on Windows/Linux C?

Question: I would like to get some hardware information from C: how many CPUs I have; how many cores each of them has; how many logical cores each core has; the CPU name and model; the CPU speed and frequency; the CPU architecture (x86, x64). I know that on Linux-like OSes I can parse /proc/cpuinfo, but since it's not an ordinary file, I think it's unsafe. I saw this answer on SO, but it doesn't give me all the information I need. Should I call cat /proc/cpuinfo > file.txt and then parse file.txt? I know about cpuid.h (I'm
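As a rough illustration of the /proc/cpuinfo approach mentioned in the question, the sketch below (in Python rather than C, for brevity) counts packages, cores and logical processors from cpuinfo-style text. The function name `summarize_cpuinfo` and the sample text are hypothetical; the field names (`processor`, `physical id`, `core id`, `model name`) are those typically seen on x86 Linux and may be absent on other architectures.

```python
def summarize_cpuinfo(text):
    """Count packages, cores and logical CPUs in /proc/cpuinfo-style text."""
    logical = 0
    packages = set()
    cores = set()
    model = None
    # Each logical processor is described by one blank-line-separated block.
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if "processor" not in entry:
            continue
        logical += 1
        pkg = entry.get("physical id", "0")
        packages.add(pkg)
        # A core is identified by (package, core id); fall back to the
        # logical processor number if "core id" is not reported.
        cores.add((pkg, entry.get("core id", entry["processor"])))
        model = model or entry.get("model name")
    return {
        "logical": logical,
        "packages": len(packages),
        "cores": len(cores),
        "model": model,
    }


# Made-up sample: one package, two cores, two hardware threads per core.
SAMPLE = "\n\n".join(
    "processor : {n}\nmodel name : Example CPU @ 2.00GHz\n"
    "physical id : 0\ncore id : {c}".format(n=n, c=n % 2)
    for n in range(4)
)

print(summarize_cpuinfo(SAMPLE))
```

On a real Linux machine the same function could be fed `open("/proc/cpuinfo").read()`; the file can be read directly like any other text file, so an intermediate `cat ... > file.txt` step should not be needed.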

get apache total cpu usage in (linux)

Question: I want to write a script (in bash or Perl on Linux) that monitors Apache and restarts it if it exceeds X% CPU. I understand that I need the total CPU usage of Apache, since it spawns child processes. How can I get the total CPU usage of Apache? Answer 1: Try the following, but make sure to replace the Apache process name with your actual one (mine is httpd): ps u -C httpd | awk '{sum += $3} END {print sum}' This will get a list of all apache processes running and sum the %CPU
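The awk one-liner above sums the third column of `ps u` output. A minimal Python sketch of the same idea follows; `total_cpu_percent` is a hypothetical helper and the sample `ps` output is made up for illustration.

```python
def total_cpu_percent(ps_output):
    """Sum the %CPU column (third field) of `ps u`-style output."""
    total = 0.0
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            total += float(fields[2])
        except ValueError:
            # The header row has "%CPU" in this column; skip it.
            continue
    return total


# Made-up sample in the layout printed by `ps u -C httpd`.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      1201  0.5  0.3 250000 12000 ?        Ss   10:00   0:01 /usr/sbin/httpd
apache    1210  2.0  0.4 252000 14000 ?        S    10:00   0:05 /usr/sbin/httpd
apache    1211  1.5  0.4 252000 14000 ?        S    10:00   0:04 /usr/sbin/httpd
"""

print(total_cpu_percent(SAMPLE))
```

In a monitoring script, the text would presumably come from running `ps u -C httpd` (for example via `subprocess.run`), and the result would be compared against the chosen threshold before restarting Apache.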

Bind threads to specific CPU cores using OpenMP

Question: I know that GOMP_CPU_AFFINITY binds threads to specific cores. The example given here is GOMP_CPU_AFFINITY="0 3 2 1", which means: thread0 is attached to cpu0, thread1 is attached to cpu3, thread2 is attached to cpu2, thread3 is attached to cpu1. This is clear. But how can I bind thread0 to core0 and core2 at the same time? What value should the environment variable GOMP_CPU_AFFINITY have for that? Answer 1: This GOMP reference may help you. To answer your
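The thread-to-CPU mapping described in the question can be sketched as follows. This only illustrates the positional mapping for a plain space-separated list; the helper name `parse_affinity_list` is hypothetical, and the range and stride forms that libgomp also accepts are deliberately not handled.

```python
from typing import Dict


def parse_affinity_list(value: str) -> Dict[int, int]:
    """Map thread index -> CPU number for a space-separated CPU list.

    Simplification: only plain numbers are handled, not ranges or strides.
    """
    cpus = [int(token) for token in value.split()]
    return {thread: cpu for thread, cpu in enumerate(cpus)}


mapping = parse_affinity_list("0 3 2 1")
for thread, cpu in mapping.items():
    print("thread%d -> cpu%d" % (thread, cpu))
```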

How do I simulate 100% CPU usage?

Question: To simulate 100% CPU usage, I placed an infinite while loop in my code (while (true) { }). This seemed to spike the CPU usage up to 30% (ordinarily it is 2% for the same program run without the while loop). Why does it not go above 30%? This is a dual-core Intel i7 processor. The app is a simple console app running C# code on .NET 4.0: private static void SimulateCPUSpike() { while(true) { } } Answer 1: CPU usage is a percentage of all CPU cores. If your code is only running a single
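The arithmetic behind the answer can be sketched as below, assuming the reported figure is averaged over all logical processors. The function name is hypothetical, and the count of 4 logical processors (a dual-core CPU with Hyper-Threading) is an assumption, not something stated in the question.

```python
def max_single_thread_share(logical_cpus):
    """Upper bound (in percent) on total CPU usage from one busy thread."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return 100.0 / logical_cpus


# Assumed: 2 cores x 2 hardware threads = 4 logical processors.
print(max_single_thread_share(4))
```

Under that assumption, one spinning thread accounts for at most 25% of the total, and the remainder of the observed ~30% would come from other activity on the machine. Reaching 100% would take roughly one busy thread per logical processor.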

Are Intel x86_64 processors not only pipelined architecture, but also superscalar?

Question: Are Intel x86_64 processors not only pipelined, but also superscalar? Pipelining: these two sequences execute in parallel (different stages of the same pipeline unit in the same clock, for example ADD with 4 stages): stage1 -> stage2 -> stage3 -> stage4 -> nothing nothing -> stage1 -> stage2 -> stage3 -> stage4 Superscalar: these two sequences execute in parallel (two instructions can be issued to different pipeline units in the same clock, for example ADD and MUL): ADD
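The two timelines in the question can be reproduced with a toy model. This is an idealized sketch: every instruction has the same number of stages, there are no dependencies or stalls, and `width` (a hypothetical parameter) is the number of instructions that may be issued per clock. Real processors are considerably more complicated.

```python
def total_cycles(n_instructions, n_stages, width=1):
    """Clocks needed to finish all instructions in the idealized model."""
    if n_instructions == 0:
        return 0
    last_issue = (n_instructions - 1) // width
    return last_issue + n_stages


def timeline(n_instructions, n_stages, width=1):
    """One text row per instruction, one column per clock cycle."""
    total = total_cycles(n_instructions, n_stages, width)
    rows = []
    for i in range(n_instructions):
        start = i // width  # clock in which instruction i is issued
        row = ["nothing"] * total
        for s in range(n_stages):
            row[start + s] = "stage%d" % (s + 1)
        rows.append(" -> ".join(row))
    return rows


print("pipelined, width 1:")
for row in timeline(2, 4, width=1):
    print(row)

print("superscalar, width 2:")
for row in timeline(2, 4, width=2):
    print(row)
```

With `width=1` the second instruction starts one clock after the first, as in the pipelining diagram; with `width=2` both start in the same clock, which is the superscalar case.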

How GetCurrentProcessorNumber() works? CPU core of a thread at runtime?

Question: Is there a simple way to find out, at runtime, the ID of the CPU core that a thread is running on in multithreaded code? I tried GetCurrentProcessorNumber(), but it does not seem to return the ID of the core on which each individual thread is running. The code I have been using is: using System; using System.Threading; using System.Threading.Tasks; using System.Diagnostics; using System.Runtime.InteropServices; class S { [DllImport("kernel32.dll")] static extern int GetCurrentProcessorNumber();