intel

Can constant non-invariant tsc change frequency across cpu states?

不问归期 submitted on 2020-08-20 03:44:31
Question: I used to benchmark Linux system calls with rdtsc to get the counter difference before and after the system call. I interpreted the result as wall-clock time, since the TSC increments at a constant rate and does not stop when entering the halt state. The invariant TSC concept is described as: "The invariant TSC will run at a constant rate in all ACPI P-, C-, and T-states." Can a constant non-invariant TSC change frequency when changing state from C0 (operating) to C1 (halted)? My current view is that it

Do FP and integer division compete for the same throughput resources on x86 CPUs?

那年仲夏 submitted on 2020-08-04 05:43:21
Question: We know that Intel CPUs do integer division and FP div / sqrt on a not-fully-pipelined divide execution unit on port 0. We know this from IACA output, other published material, and experimental testing (e.g. https://agner.org/optimize/). But are there independent dividers for FP and integer (competing only for dispatch via port 0), or does interleaving two div-throughput-bound workloads make their costs add nearly linearly, if one is integer and the other is FP? This is complicated by Intel CPUs

Is there any situation where using MOVDQU and MOVUPD is better than MOVUPS?

眉间皱痕 submitted on 2020-07-29 12:08:44
Question: I was trying to understand the different MOV instructions for SSE on Intel x86-64. According to this, you should use the aligned instructions (MOVAPS, MOVAPD and MOVDQA) when moving data between two registers, using the correct one for the type you're operating on, and use MOVUPS/MOVAPS when moving a register to memory and vice versa, since type does not impact performance when moving to/from memory. So is there ever any reason to use MOVDQU and MOVUPD? Is the explanation I got from the link wrong?

Why does intel use a virtual index physical tagged cache and not VIVT or PIPT?

纵然是瞬间 submitted on 2020-07-18 06:07:25
Question: I am not sure, but if I remember right, Intel uses a VIPT cache. I would like to know the reason for this choice: why is it better than VIVT or PIPT, what advantages does it provide, and what disadvantages might it have? Thank you. Answer 1: The exact design decisions are probably not published, but in general the benefits of VIPT are: Virtual indexing means you can start reading the set from the cache before (or in parallel with) looking up the translation in the TLB. This means that the common case
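The point about indexing in parallel with the TLB lookup can be illustrated with a small calculation. This is only a sketch: the function name is hypothetical, and the 32 KiB, 8-way, 64-byte-line cache with 4 KiB pages is an assumed example configuration, not something stated in the excerpt. It checks whether the set-index bits plus line-offset bits fit inside the page offset, in which case the index bits are identical in the virtual and physical address.

```python
def index_fits_in_page_offset(cache_bytes: int, ways: int,
                              line_bytes: int = 64,
                              page_bytes: int = 4096) -> bool:
    """Return True if the set index can be taken from untranslated address bits.

    Assumes all sizes are powers of two (an assumption of this sketch).
    """
    sets = cache_bytes // (ways * line_bytes)
    # Bits needed for the line offset plus the set index.
    index_and_offset_bits = (sets * line_bytes).bit_length() - 1
    # Bits of the address that are the same before and after translation.
    page_offset_bits = page_bytes.bit_length() - 1
    return index_and_offset_bits <= page_offset_bits


# Assumed example configurations, for illustration only.
print(index_fits_in_page_offset(32 * 1024, ways=8))   # 64 sets: 6 + 6 = 12 bits
print(index_fits_in_page_offset(64 * 1024, ways=8))   # 128 sets: 7 + 6 = 13 bits
```

When the check passes, the cache can select the set from the page-offset bits while the TLB translates the upper bits, and the physical tag comparison happens afterwards.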

Using perf to monitor raw event counters

走远了吗. submitted on 2020-07-17 06:48:54
Question: I am trying to measure certain hardware events on an (Intel Xeon) machine with multiple (physical) processors. Specifically, I wish to know how many requests are issued for reading 'offcore' data. I found the OFFCORE_REQUESTS hardware event in Intel's documentation; it gives the event descriptor 0xB0 and, for data demands, the additional mask 0x01. Would it then be correct to tell perf to record the event 0xB1 (i.e. 0xB0 | 0x01) and to call it as: perf record -e r0B1 ./mytestapp someargs Or
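As a hedged illustration of how the two numbers are usually combined: on x86, perf's raw event code places the event select in the low byte and the unit mask in the next byte, rather than OR-ing both into a single byte. The helper below is a hypothetical name introduced for this sketch; it only formats the string and does not call perf.

```python
def perf_raw_event(event_select: int, umask: int) -> str:
    """Format an x86 raw event as r<umask><event>, e.g. for `perf stat -e`."""
    if not (0 <= event_select <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event select and umask must each fit in one byte")
    # Event select occupies bits 0-7, the unit mask bits 8-15.
    config = (umask << 8) | event_select
    return f"r{config:04x}"


# Values quoted in the question: event 0xB0 with mask 0x01.
print(perf_raw_event(0xB0, 0x01))  # r01b0
```

Under that layout, the values from the question would be passed as `r01b0` rather than `r0B1`.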

Confused about Intel Optane DC SSD usage as extra RAM with IMDT? [closed]

旧时模样 submitted on 2020-07-15 09:46:06
Question: I'm a little confused about Intel Optane DC. I want my Optane DC to be able to serve as both DRAM and storage. On the one hand, I understood that only the "Intel Optane DC Persistent Memory DIMM" is able to serve as DRAM. That is because it has 2 modes

x86-64 canonical address?

放肆的年华 submitted on 2020-07-15 07:08:08
Question: While reading an Intel manual I came across the following: "On processors that support Intel 64 architecture, the IA32_SYSENTER_ESP field and the IA32_SYSENTER_EIP field must each contain a canonical address." What is a 'canonical address'? Answer 1: I suggest that you download the full software developer's manual. The documentation is available in separate volumes, but that link gives you all seven volumes in a single massive PDF, which makes it easier to search for things. The answer is
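The excerpt cuts off before the definition, so the following is a sketch based on the usual x86-64 rule rather than on the answer's own wording: with 48-bit virtual addresses, an address is canonical when bits 63 through 47 are all equal (all zeros or all ones). The function name and the `va_bits` parameter are hypothetical.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Check whether a 64-bit address is canonical for the given VA width.

    va_bits=48 is an assumption of this sketch; processors with
    5-level paging use 57 instead.
    """
    if not 0 <= addr < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    # Bits 63 down to (va_bits - 1) must all be copies of each other.
    upper = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones


print(is_canonical(0x00007FFFFFFFFFFF))  # top of the lower half
print(is_canonical(0xFFFF800000000000))  # bottom of the upper half
print(is_canonical(0x0000800000000000))  # inside the non-canonical gap
```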

Reading Binary on different Architectures - Fortran runtime error: I/O past end of record on unformatted file

折月煮酒 submitted on 2020-07-10 10:27:19
Question: I am having some issues reading a binary (unformatted restart file, ~2 GB) written by a Fortran program with the calls below: open(unit=1,file=opfile,status="unknown",form="unformatted") ! write(1) t ! write(1) Rho, Rho_ut, Rho_ur, Rho_uz, Rho_Ya ! close(1) The program was compiled with ifort on an Intel Xeon Phi 7250 CPU (KNL) architecture. When the same code, now compiled with gfortran on an IBM POWER9 AC922 architecture, reads this file with the call below: open
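One thing that may be worth checking in a situation like this is how the reading machine interprets the record-length marker that precedes each sequential unformatted record. The sketch below assumes 4-byte markers (the usual default for both compilers, but an assumption here) and decodes the first marker in both byte orders; the function name and the sample 8-byte record length are hypothetical. It inspects bytes only and does not diagnose the actual cause of the error in the question.

```python
import struct


def first_record_length(header: bytes) -> dict:
    """Decode the leading 4-byte record marker in both byte orders."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes from the start of the file")
    return {
        "little": struct.unpack("<i", header[:4])[0],
        "big": struct.unpack(">i", header[:4])[0],
    }


# Simulated start of a file whose first record holds one 8-byte value,
# written on a little-endian machine (illustrative data, not the real file).
sample = struct.pack("<i", 8) + struct.pack("<d", 0.5) + struct.pack("<i", 8)
print(first_record_length(sample))
```

For a real file, the first four bytes could be obtained with `open(path, "rb").read(4)`. If only one of the two interpretations gives a plausible length, that suggests which byte order the file was written in.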