micro-architecture

how are barriers/fences and acquire, release semantics implemented microarchitecturally?

半腔热情 posted on 2021-02-16 12:57:07
Question: Many questions on SO, and articles and books such as https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2018.12.08a.pdf, Preshing's articles such as https://preshing.com/20120710/memory-barriers-are-like-source-control-operations/ and the rest of his series, talk about memory ordering abstractly, in terms of the ordering and visibility guarantees provided by different barrier types. My question is how are these barriers and memory ordering semantics
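For readers who want the abstract guarantee the question refers to in concrete form, here is a minimal C++ sketch of release/acquire message passing. It only shows the language-level contract that the hardware has to uphold; it says nothing about how a given core implements it. The names `payload`, `ready` and `message_passing_demo` are illustrative and not taken from the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Illustrative only: a producer publishes plain data through a release store,
// and a consumer picks it up after an acquire load observes the flag.
int message_passing_demo() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // flag used to publish the data
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                  // (1) write the data
        ready.store(true, std::memory_order_release);  // (2) release: (1) may not be reordered after this store
    });

    std::thread consumer([&] {
        // (3) acquire: reads after this load may not be reordered before it
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;                            // (4) must see the value written in (1)
    });

    producer.join();
    consumer.join();
    assert(observed == 42);
    return observed;
}
```

On x86-64, mainstream compilers emit plain `mov` instructions for both the release store and the acquire load, because the hardware memory model is already strong enough. On weaker architectures they have to emit special load/store forms or explicit barriers.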

Way prediction in modern cache

泄露秘密 posted on 2021-02-09 09:17:46
Question: We know that direct-mapped caches are better than set-associative caches in terms of hit time, as there is no search involved for a particular tag. On the other hand, set-associative caches usually show a better hit rate than direct-mapped caches. I read that modern processors try to combine the benefits of both by using a technique called way prediction: they predict the line of the given set where the hit is most likely to happen and search only in that line. If the
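The idea described in the excerpt can be sketched as a toy model. The code below assumes a 4-way set-associative cache that predicts the most recently used way of each set, checks that way first, and probes the remaining ways only on a misprediction. The geometry, the MRU predictor, the round-robin replacement and all names are assumptions made for illustration, not a description of any real processor.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model (not a real CPU design): set-associative cache with an MRU way predictor.
struct WayPredictedCache {
    static constexpr int kSets = 64, kWays = 4, kLineBytes = 64;
    struct Line { bool valid = false; std::uint64_t tag = 0; };

    std::array<std::array<Line, kWays>, kSets> lines{};
    std::array<int, kSets> predicted_way{};  // most recently used way of each set
    std::array<int, kSets> next_victim{};    // round-robin replacement pointer
    int fast_hits = 0, slow_hits = 0, misses = 0;

    void access(std::uint64_t addr) {
        const std::uint64_t block = addr / kLineBytes;
        const int set = static_cast<int>(block % kSets);
        const std::uint64_t tag = block / kSets;

        // Fast path: compare the tag of the predicted way only.
        const int p = predicted_way[set];
        if (lines[set][p].valid && lines[set][p].tag == tag) { ++fast_hits; return; }

        // Misprediction: fall back to probing the remaining ways.
        for (int w = 0; w < kWays; ++w) {
            if (w != p && lines[set][w].valid && lines[set][w].tag == tag) {
                ++slow_hits;
                predicted_way[set] = w;  // retrain the predictor
                return;
            }
        }

        // Miss: fill the next victim way and predict it for the next access.
        const int v = next_victim[set];
        next_victim[set] = (v + 1) % kWays;
        lines[set][v].valid = true;
        lines[set][v].tag = tag;
        predicted_way[set] = v;
        ++misses;
    }
};

// Usage: two lines that map to the same set take turns being accessed.
inline void way_prediction_demo() {
    WayPredictedCache c;
    c.access(0x0000);  // miss, fills way 0 of set 0
    c.access(0x0000);  // fast hit in the predicted way
    c.access(0x1000);  // same set, different tag: miss, fills way 1
    c.access(0x0000);  // predictor points at way 1, so this is a slow hit
    c.access(0x0000);  // predictor retrained: fast hit again
    assert(c.misses == 2 && c.fast_hits == 2 && c.slow_hits == 1);
}
```

In this model a correct prediction costs one tag comparison, like a direct-mapped cache, while a misprediction costs a second probe but still avoids a miss.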

What are my available march/mtune options?

时光总嘲笑我的痴心妄想 posted on 2021-02-08 12:22:47
Question: Is there a way to get gcc to output the available -march=arch options? I'm getting build errors (I tried -march=x86_64) and I don't know what my options are. The compiler I'm using is a proprietary wrapper around gcc that doesn't seem to like -march=skylake. The flags should be the same, so I assume whatever options I'd send to gcc to dump architectures would be the same for this wrapper. I managed to cause gcc to error with a bogus parameter and it dumped a list, but I'm not seeing that now

What's 'new' in a 'new' processor when viewed from programmer's point

好久不见. posted on 2021-01-28 09:30:52
Question: I have recently become interested in understanding low-level computing. I understand that today's widely used computers follow the x86/x86-64 architecture. To my understanding, an architecture, or more specifically an Instruction Set Architecture (ISA), is the set of instructions that the programmer is able to issue to the CPU. My first question: does the ISA keep evolving, or does it remain the same? I think that it keeps evolving (meaning new instructions keep getting added/previous instructions modified?) but

What's 'new' in a 'new' processor when viewed from programmer's point

只愿长相守 posted on 2021-01-28 09:27:34
Question: I have recently become interested in understanding low-level computing. I understand that today's widely used computers follow the x86/x86-64 architecture. To my understanding, an architecture, or more specifically an Instruction Set Architecture (ISA), is the set of instructions that the programmer is able to issue to the CPU. My first question: does the ISA keep evolving, or does it remain the same? I think that it keeps evolving (meaning new instructions keep getting added/previous instructions modified?) but

How does Intel x86 implement a total order over stores

微笑、不失礼 posted on 2021-01-05 09:16:06
Question: x86 guarantees a total order over all stores due to its TSO memory model. My question is whether anyone has an idea how this is actually implemented. I have a good idea of how all 4 fences are implemented, so I can explain how local order is preserved. But the 4 fences will just give PO; they won't give you TSO (I know TSO allows older stores to jump in front of newer loads, so only 3 out of 4 fences are needed). A total order over all memory actions on a single address is the responsibility of
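As background for the question, the programmer-visible behaviour of TSO is often explained with an abstract machine in which every core has a FIFO store buffer in front of a single shared memory. The sketch below is such a toy model for two cores. It is a teaching device with hypothetical names (`TsoMachine`, `drain_one`, `fence`), and it does not claim to show how Intel hardware actually enforces the order, which is what the question asks.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <utility>

// Toy x86-TSO-style abstract machine: two cores, one FIFO store buffer each.
struct TsoMachine {
    std::map<char, int> memory;                  // shared memory, absent address reads as 0
    std::deque<std::pair<char, int>> buffer[2];  // per-core store buffers

    // A store is first queued in the issuing core's own buffer.
    void store(int core, char addr, int value) {
        buffer[core].push_back({addr, value});
    }

    // A load first checks the core's own buffer (newest entry wins), then memory.
    int load(int core, char addr) const {
        for (auto it = buffer[core].rbegin(); it != buffer[core].rend(); ++it) {
            if (it->first == addr) return it->second;
        }
        auto m = memory.find(addr);
        return m == memory.end() ? 0 : m->second;
    }

    // Commit the oldest buffered store of one core to shared memory.
    // The sequence of these commits is the single total order of stores.
    void drain_one(int core) {
        if (buffer[core].empty()) return;
        memory[buffer[core].front().first] = buffer[core].front().second;
        buffer[core].pop_front();
    }

    // Models a full fence: wait until the core's buffer is empty.
    void fence(int core) {
        while (!buffer[core].empty()) drain_one(core);
    }
};
```

Running the classic store-buffering test on this model (core 0 stores to `x` then loads `y`, core 1 stores to `y` then loads `x`) lets both loads return 0 when neither buffer has drained yet. That is the one reordering TSO permits, while every core still observes the committed stores in the same order.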

Intel JCC Erratum - should JCC really be treated separately?

喜欢而已 posted on 2020-06-27 17:21:07
Question: Intel pushed a microcode update to fix an erratum called the "Jump Conditional Code (JCC) Erratum". The updated microcode makes some operations inefficient because it disables putting code into the ICache under certain conditions. The published document, titled Mitigations for Jump Conditional Code Erratum, lists not only JCC; it lists unconditional jumps, conditional jumps, macro-fused conditional jumps, calls, and returns. The documentation for the MSVC switch /QIntel-jcc-erratum mentions: Under /QIntel-jcc-erratum, the
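For context, the condition described in Intel's mitigation document is that an affected jump instruction crosses a 32-byte boundary or ends on one. A minimal sketch of that check follows. The helper name `touches_32b_boundary` is hypothetical, and treating a macro-fused compare-and-jump pair as one combined byte range is left to the caller here as an assumption.

```cpp
#include <cstdint>

// Returns true if an instruction occupying bytes [start, start + length)
// crosses a 32-byte boundary, or ends exactly on one (its last byte is
// byte 31 of a 32-byte block). Hypothetical helper for illustration.
constexpr bool touches_32b_boundary(std::uint64_t start, unsigned length) {
    return length != 0 && (start / 32) != ((start + length) / 32);
}

// A 2-byte jump at offset 28 stays inside its block.
static_assert(!touches_32b_boundary(28, 2), "inside one 32-byte block");
// A 2-byte jump at offset 30 ends on the boundary.
static_assert(touches_32b_boundary(30, 2), "ends on a 32-byte boundary");
// A 6-byte jump at offset 30 crosses the boundary.
static_assert(touches_32b_boundary(30, 6), "crosses a 32-byte boundary");
```

Mitigations in assemblers and compilers pad or realign code so that this predicate is false for every affected instruction.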