cpu

Cache bandwidth per tick for modern CPUs

馋奶兔 posted on 2019-11-28 23:25:22
How fast is cache access on modern CPUs? How many bytes can be read from or written to memory per processor clock tick on Intel P4, Core2, Core i7, and AMD CPUs? Please answer with both theoretical numbers (width of the load/store unit and its throughput in uops/tick) and practical numbers (even memcpy speed tests or the STREAM benchmark), if any. PS: this question concerns the maximum rate of load/store instructions in assembly. There may be a theoretical load rate (every instruction issued per tick is the widest possible load), but the processor may sustain only part of it, which is the practical load limit. osgx For nehalem: rolfed

memcpy performance differences between 32 and 64 bit processes

烂漫一生 posted on 2019-11-28 21:43:13
We have Core2 machines (Dell T5400) running XP64. When running 32-bit processes, memcpy performance is on the order of 1.2 GByte/s; in a 64-bit process, however, memcpy achieves about 2.2 GByte/s (or 2.4 GByte/s with the Intel compiler CRT's memcpy). While the initial reaction might be to explain this away as due to the wider registers available in 64-bit code, we observe that our own memcpy-like SSE assembly code (which should use 128-bit-wide loads and stores regardless of whether the process is 32-bit or 64-bit) shows similar upper limits on the copy bandwidth it achieves.
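
A minimal C++ sketch of how copy bandwidth of this kind might be measured. The function name `memcpy_bandwidth_gbps`, the buffer size, and the repeat count are assumptions made for illustration, not the benchmark the poster used, and the result depends heavily on whether the buffers fit in cache.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `bytes` bytes `repeats` times and returns the
// best observed bandwidth in GByte/s (1 GByte = 1e9 bytes here).
// Returns 0.0 if nothing could be measured.
double memcpy_bandwidth_gbps(std::size_t bytes, int repeats) {
    if (bytes == 0 || repeats <= 0) {
        return 0.0;
    }
    std::vector<char> src(bytes, 1);
    std::vector<char> dst(bytes, 0);
    double best = 0.0;
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        std::memcpy(dst.data(), src.data(), bytes);
        const auto stop = std::chrono::steady_clock::now();

        // Read one copied byte through a volatile so the copy is observable.
        volatile char sink = dst[bytes / 2];
        (void)sink;

        const double seconds =
            std::chrono::duration<double>(stop - start).count();
        if (seconds > 0.0) {
            const double gbps = static_cast<double>(bytes) / seconds / 1e9;
            if (gbps > best) {
                best = gbps;
            }
        }
    }
    return best;
}
```

Calling `memcpy_bandwidth_gbps(256u << 20, 10)` copies 256 MiB ten times, which should be large enough to exceed the caches on a machine like the one described; building the same source as 32-bit and 64-bit would allow the kind of comparison the poster reports.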

How to get CPU info in C on Linux, such as number of cores? [duplicate]

旧巷老猫 posted on 2019-11-28 21:42:47
This question already has an answer here: How to get the number of CPUs in Linux using C? (7 answers) Is it possible to get such info through an API or function, rather than by parsing /proc/cpuinfo? Kimvais From man 5 proc: /proc/cpuinfo This is a collection of CPU and system architecture dependent items, for each supported architecture a different list. Two common entries are processor which gives CPU number and bogomips; a system constant that is calculated during kernel initialization. SMP machines have information for each CPU. Here is sample code that reads and prints the info to console,
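
As an alternative to parsing /proc/cpuinfo, the sketch below queries the core count through `sysconf`. It is not the sample code the answer refers to; the function names are assumptions, and `_SC_NPROCESSORS_ONLN` / `_SC_NPROCESSORS_CONF` are widely available extensions (glibc among others) rather than part of the POSIX standard.

```cpp
#include <unistd.h>

// Hypothetical helper: number of processors currently online,
// or -1 if the system cannot report it.
long online_cpu_count() {
    return sysconf(_SC_NPROCESSORS_ONLN);
}

// Hypothetical helper: number of processors configured in the system,
// which may be larger than the online count if some CPUs are offline.
long configured_cpu_count() {
    return sysconf(_SC_NPROCESSORS_CONF);
}
```

In C++11 and later, `std::thread::hardware_concurrency()` from `<thread>` is a portable way to get a similar hint, although it may return 0 when the value is not computable.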

What is this “denormal data” about ? - C++

孤街醉人 posted on 2019-11-28 21:22:17
I would like a broad view of what "denormal data" is about. The only thing I think I have right is that, from a programmer's viewpoint, it relates to floating-point values, and from the CPU's standpoint, it relates to general computing. Can someone decipher these two words for me? EDIT: please remember that I am interested in C++ applications and only the C++ language. You ask about C++, but the specifics of floating-point values and encodings are determined by a floating-point specification, notably IEEE 754, and not by C++. IEEE 754 is
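
To make the term concrete from the C++ side, here is a small sketch that classifies `float` values with the standard library. It assumes an IEEE 754 `float` and default floating-point settings (no flush-to-zero mode, no `-ffast-math`); the helper names are illustrative only.

```cpp
#include <cmath>
#include <limits>

// True if x is a denormal (also called subnormal) value: nonzero, but smaller
// in magnitude than the smallest normal float.
bool is_subnormal(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL;
}

// Smallest positive normal float.
float smallest_normal() {
    return std::numeric_limits<float>::min();
}

// Smallest positive denormal float.
float smallest_subnormal() {
    return std::numeric_limits<float>::denorm_min();
}
```

Halving `smallest_normal()` gives a value that is still nonzero but falls into the denormal range, which is the gradual underflow that denormals exist to provide.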

What does 'bank'ing a register mean?

我怕爱的太早我们不能终老 posted on 2019-11-28 21:09:10
While reading 'ARM Architecture' on Wikipedia, I found the following statement: Registers R0-R7 are the same across all CPU modes; they are never banked. R13 and R14 are banked across all privileged CPU modes except system mode. What does banking a register mean? enjoylife Register banking refers to providing multiple copies of a register at the same address. Taken from section 1.4.6 of the ARM docs: The term refers to a solution for the problem that not all registers can be seen at once. There is a different register bank for each processor mode. The banked registers give rapid context
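
The idea of "multiple copies of a register at the same address" can be illustrated with a toy C++ model. This is a deliberately simplified sketch, not a description of real hardware: the class and enum names are invented, and it banks only R13 and R14 per mode, omitting the additional R8-R12 bank that FIQ mode has.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy set of ARM processor modes (names are illustrative).
enum class Mode { User, System, Supervisor, Abort, Undefined, Irq, Fiq };

// Toy register file: R13 and R14 resolve to a per-mode copy, every other
// register index resolves to one shared copy.
class BankedRegisterFile {
public:
    std::uint32_t& reg(Mode mode, std::size_t index) {
        assert(index < 16);
        if (index == 13 || index == 14) {
            return banked_[bank_of(mode)][index - 13];
        }
        return shared_[index];
    }

private:
    // System mode shares the User bank; each other privileged mode has its own.
    static std::size_t bank_of(Mode mode) {
        switch (mode) {
            case Mode::User:
            case Mode::System:     return 0;
            case Mode::Supervisor: return 1;
            case Mode::Abort:      return 2;
            case Mode::Undefined:  return 3;
            case Mode::Irq:        return 4;
            case Mode::Fiq:        return 5;
        }
        return 0;
    }

    std::array<std::uint32_t, 16> shared_{};
    std::array<std::array<std::uint32_t, 2>, 6> banked_{};
};
```

In this model, writing R13 in IRQ mode leaves the User-mode R13 untouched, so an interrupt handler gets its own stack pointer without saving and restoring the interrupted one.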

Programmable USB dongles [closed]

£可爱£侵袭症+ posted on 2019-11-28 19:16:02
Where can I buy a programmable USB dongle that supports C as a development language? Senselock rockey Aladdin We use Senselock in our application. It is a smart dongle into which you can download your custom code. Instead of just checking for the presence of a dongle, your application expects a correct output from your own code running inside the dongle. One place to start is Hexwax. Try http://www.hexwax.com/Products/expandIO%2DUSB/ which will give you an idea of what you can do and where to start. These are firmwares for the PIC18 series of microcontrollers

CPU Cycle count based profiling in C/C++ Linux x86_64

牧云@^-^@ posted on 2019-11-28 19:01:41
I am using the following code to profile my operations and optimize the CPU cycles taken in my functions. static __inline__ unsigned long GetCC(void) { unsigned a, d; asm volatile("rdtsc" : "=a" (a), "=d" (d)); return ((unsigned long)a) | (((unsigned long)d) << 32); } I don't think it is the best approach, since even two consecutive calls give me a difference of "33". Any suggestions? I personally think the rdtsc instruction is great and usable for a variety of tasks. I do not think that using cpuid is necessary to prepare for rdtsc. Here is how I reason around rdtsc: Since I use the Watcom compiler I
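
One way to deal with the nonzero gap between consecutive reads is to measure that gap and subtract it. The sketch below is an assumption about how this could be done, not the approach from the truncated answer; it uses the `__rdtsc` intrinsic from `<x86intrin.h>` (GCC and Clang on x86 only), and the helper names are hypothetical. It does not address out-of-order execution, since `rdtsc` is not a serializing instruction.

```cpp
#include <cstdint>
#include <limits>
#include <x86intrin.h>  // __rdtsc (GCC and Clang, x86 only)

// Reads the time-stamp counter.
inline std::uint64_t read_tsc() {
    return __rdtsc();
}

// Hypothetical helper: smallest observed gap, in TSC ticks, between two
// back-to-back reads. Taking the minimum filters out interrupts and
// other one-off disturbances.
std::uint64_t tsc_read_overhead(int samples) {
    std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
    for (int i = 0; i < samples; ++i) {
        const std::uint64_t t0 = read_tsc();
        const std::uint64_t t1 = read_tsc();
        const std::uint64_t delta = t1 - t0;
        if (delta < best) {
            best = delta;
        }
    }
    return best;
}

// Hypothetical helper: TSC ticks spent in `fn`, with the read overhead
// subtracted and the result clamped at zero.
template <typename Fn>
std::uint64_t measure_ticks(Fn&& fn, std::uint64_t overhead) {
    const std::uint64_t t0 = read_tsc();
    fn();
    const std::uint64_t t1 = read_tsc();
    const std::uint64_t elapsed = t1 - t0;
    return elapsed > overhead ? elapsed - overhead : 0;
}
```

A typical use would compute `tsc_read_overhead(1000)` once at startup and pass it to every `measure_ticks` call, repeating each measurement and keeping the minimum.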

How to enable support of CPU virtualization on Macbook Pro?

佐手、 posted on 2019-11-28 18:37:00
I have VirtualBox installed on my MacBook Pro, and I want to install a Linux VM in it. When I launch the new VM, it reports "Your CPU does not support long mode. Use a 32bit distribution." After searching for this problem, I found that this VM requires CPU virtualization support. I then checked my MacBook: its CPU is an i7, which supports virtualization. So I guess the problem is related to the OS or EFI version? OS version: 10.6.8 / EFI version: latest (checked on apple.com) Does anyone know what the problem with my MacBook is? How can I enable the support of CPU

What are the common causes for high CPU usage?

空扰寡人 posted on 2019-11-28 18:24:36
Background: In my application written in C++, I have created 3 threads: AnalysisThread (or Producer): it reads an input file, parses it, generates patterns, and enqueues them into std::queue 1. PatternIdRequestThread (or Consumer): it dequeues patterns from the queue and sends them, one by one, to the database through a client (written in C++), which returns a pattern uid that is then assigned to the corresponding pattern. ResultPersistenceThread: it does a few more things, talks to the database, and works as expected as far as CPU usage is concerned. First two threads take 60-80% of CPU

How to programmatically get the CPU cache page size in C++?

↘锁芯ラ posted on 2019-11-28 17:54:55
I'd like my program to read the cache line size of the CPU it's running on in C++. I know that this can't be done portably, so I will need one solution for Linux and another for Windows (solutions for other systems could be useful to others, so post them if you know them). For Linux I could read the content of /proc/cpuinfo and parse the line beginning with cache_alignment. Maybe there is a better way involving a call to an API. For Windows I simply have no idea. On Windows #include <Windows.h> #include <iostream> using std::cout; using std::endl; int main() { SYSTEM_INFO systemInfo;
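
For the Linux half of the question, a possible sketch that avoids parsing /proc/cpuinfo is shown below. It first tries the glibc-specific `sysconf(_SC_LEVEL1_DCACHE_LINESIZE)` and then falls back to a sysfs file; the function name and the choice of cpu0/index0 are assumptions, and either source may be missing or report 0 on some systems (for example in containers or on non-glibc C libraries).

```cpp
#include <fstream>
#include <unistd.h>

// Hypothetical helper: L1 data cache line size in bytes on Linux,
// or 0 if it cannot be determined.
long cache_line_size() {
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    // glibc extension; may return 0 or -1 when the value is unknown.
    const long from_sysconf = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (from_sysconf > 0) {
        return from_sysconf;
    }
#endif
    // Fallback: sysfs exposes the coherency line size of each cache level.
    std::ifstream file(
        "/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size");
    long from_sysfs = 0;
    if (file >> from_sysfs && from_sysfs > 0) {
        return from_sysfs;
    }
    return 0;
}
```

Callers should treat a return value of 0 as "unknown" and choose their own default, rather than assuming a particular line size.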