CPU

CPU measures (Cache misses/hits) which do not make sense

Submitted by 流过昼夜 on 2019-12-08 09:06:36

Question: I use Intel PCM for fine-grained CPU measurements. In my code, I am trying to measure cache efficiency. Basically, I first put a small array into the L1 cache (by traversing it many times), then I start the timer, go over the array one more time (which hopefully uses the cache), and then turn the timer off. PCM shows me that I have a rather high L2 and L3 miss ratio. I also checked with rdtscp, and the cycles per array operation is 15 (which is much higher than the 4-5 cycles for…

Latency access times for L1 cache

Submitted by 旧街凉风 on 2019-12-08 07:21:06

Question: At this web link: http://www.7-cpu.com/cpu/IvyBridge.html it says the latency for Ivy Bridge L1 cache access is: L1 Data Cache Latency = 4 cycles for simple access via pointer; L1 Data Cache Latency = 5 cycles for access with complex address calculation (size_t n, *p; n = p[n]). Instead of "simple", did they mean if the pointer size is the same as the word size? So if the pointer is 32-bit and it's a 32-bit OS, then this would be "simple"; otherwise it would cost the "complex" latency? I just…

How to know the values of CR registers from linux user and kernel modes

Submitted by 假装没事ソ on 2019-12-08 07:02:35

Question: I would like to know the CR0-CR4 register values on x86. Can I write inline assembly to read them out? Are there any other methods? (e.g., does the OS keep any file structures that record these values?) Answer 1: The Linux kernel has functions to read and write the control registers: the read_crX and write_crX functions for the standard CRs, and xgetbv/xsetbv for the extended CRs. User-mode applications need an LKM to use these functions indirectly. In theory you just need to create an LKM with one…

Do x86 and x64 share an instruction set?

Submitted by 不打扰是莪最后的温柔 on 2019-12-08 06:49:35

Question: I don't understand how a 32-bit application can run on a 64-bit OS. My understanding is that 32-bit/64-bit refers to register size, so the instruction sets should be different, since the registers have different sizes. But I know there is the x86-64 instruction set, which is the 64-bit version of the x86 instruction set. Is the reason we can run 32-bit applications on a 64-bit OS because of x86-64? If so, why are 32-bit applications sometimes not compatible with 64-bit Windows? Why do we need WOW64? (Sometimes we are asked…

How to check CPU name, model, speed on Windows/Linux C?

Submitted by 六眼飞鱼酱① on 2019-12-08 06:40:32

Question: I would like to get some information about the hardware in C: how many CPUs I have; how many cores each of them has; how many logical cores each core has; the CPU name + model; the CPU speed + frequency; the CPU architecture (x86, x64). I know that on Linux-like OSes I can parse /proc/cpuinfo, but since it's not an ordinary file, I think that's unsafe. I saw this answer on SO, but it doesn't give me every piece of information I need. Should I call cat /proc/cpuinfo > file.txt and then parse file.txt? I know about cpuid.h (I'm…

Get Apache total CPU usage (Linux)

Submitted by 回眸只為那壹抹淺笑 on 2019-12-08 06:24:00

Question: I want to write a script (in bash or Perl on Linux) that monitors Apache and restarts it if it exceeds X% CPU. I understand that I need the total CPU usage of Apache, since it spawns child processes. How can I get the total CPU usage of Apache? Answer 1: Try the following, but make sure to replace the Apache process name with your actual one (mine is httpd): ps u -C httpd | awk '{sum += $3} END {print sum}' This will get a list of all running Apache processes and sum their %CPU…

Bind threads to specific CPU cores using OpenMP

Submitted by 隐身守侯 on 2019-12-08 06:06:59

Question: I know that GOMP_CPU_AFFINITY binds threads to specific cores. In the example given here, GOMP_CPU_AFFINITY="0 3 2 1" means: thread0 gets attached to cpu0, thread1 to cpu3, thread2 to cpu2, and thread3 to cpu1. This is clear. But how can I bind thread0 to core0 and core2 at the same time? What would the value of the environment variable GOMP_CPU_AFFINITY be for that? Answer 1: This GOMP reference may help you. To answer your…

How do I simulate 100% CPU usage?

Submitted by ≡放荡痞女 on 2019-12-08 02:36:15

Question: To simulate 100% CPU usage, I placed an infinite while loop in my code (while (true) { }). This seemed to spike the CPU usage up to 30% (ordinarily it is 2% for the same program when run without the loop). Why does it not go above 30%? This is a dual-core Intel i7 processor. The app is a simple console app running C# code on .NET 4.0: private static void SimulateCPUSpike() { while(true) { } } Answer 1: CPU usage is a percentage across all CPU cores. If your code is only running a single…

Are Intel x86_64 processors not only pipelined architecture, but also superscalar?

Submitted by 烂漫一生 on 2019-12-07 22:58:32

Question: Are Intel x86_64 processors not only pipelined, but also superscalar? Pipelining: these two sequences execute in parallel (different stages of the same pipeline unit in the same clock cycle, for example an ADD with 4 stages): stage1 -> stage2 -> stage3 -> stage4 -> nothing; nothing -> stage1 -> stage2 -> stage3 -> stage4. Superscalar: these two sequences execute in parallel (two instructions can be issued to different pipeline units in the same clock cycle, for example an ADD and a MUL): ADD…

How GetCurrentProcessorNumber() works? CPU core of a thread at runtime?

Submitted by 坚强是说给别人听的谎言 on 2019-12-07 22:26:13

Question: Is there a smooth way to find out the CPU core id of a thread running in multithreaded code at runtime? I tried to use GetCurrentProcessorNumber(), but it does not seem to give the CPU core id where the individual threads are running. The code I have been using is: using System; using System.Threading; using System.Threading.Tasks; using System.Diagnostics; using System.Runtime.InteropServices; class S { [DllImport("kernel32.dll")] static extern int GetCurrentProcessorNumber();