assembly

Does cmpxchg write destination cache line on failure? If not, is it better than xchg for spinlock?

二次信任 posted on 2020-08-08 06:19:28
Question: For the purposes of this question, I assume a simple spinlock that does not fall back to waiting in the OS. I see that simple spinlocks are often implemented using lock xchg or lock bts instead of lock cmpxchg. But doesn't cmpxchg avoid writing the value if the expectation does not match? So aren't failed attempts cheaper with cmpxchg? Or does cmpxchg write data and invalidate the cache line on other cores even on failure? This question is similar to What specifically marks an x86 cache line as dirty - any write,
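
For readers who want to see the two acquisition styles side by side, here is a minimal sketch in C11 atomics. It is an illustration only, not the asker's code: the names `demo_spinlock`, `try_lock_xchg`, `try_lock_cmpxchg`, `lock_ttas`, and `spin_unlock` are hypothetical, and the assumption is that an x86 compiler lowers `atomic_exchange` to `xchg` and `atomic_compare_exchange_strong` to `lock cmpxchg`, which is typical but not guaranteed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: 0 = free, 1 = held. */
typedef struct {
    atomic_int state;
} demo_spinlock;

/* One acquisition attempt with an unconditional exchange
 * (typically compiled to xchg on x86). Always stores 1. */
bool try_lock_xchg(demo_spinlock *l) {
    return atomic_exchange_explicit(&l->state, 1,
                                    memory_order_acquire) == 0;
}

/* One acquisition attempt with compare-and-swap
 * (typically compiled to lock cmpxchg on x86).
 * Stores 1 only if the current value is 0. */
bool try_lock_cmpxchg(demo_spinlock *l) {
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->state, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Test-and-test-and-set: after a failed attempt, spin on a plain
 * load and retry the atomic read-modify-write only once the lock
 * looks free. */
void lock_ttas(demo_spinlock *l) {
    for (;;) {
        if (try_lock_xchg(l)) {
            return;
        }
        while (atomic_load_explicit(&l->state,
                                    memory_order_relaxed) != 0) {
            /* read-only spin */
        }
    }
}

/* Release the lock with a plain release store. */
void spin_unlock(demo_spinlock *l) {
    atomic_store_explicit(&l->state, 0, memory_order_release);
}
```

Swapping `try_lock_xchg` for `try_lock_cmpxchg` inside `lock_ttas` gives the variant the question asks about. Which one is cheaper on a failed attempt depends on the hardware behavior in question and would need to be measured.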

Are Intel TSX prefixes executed (safely) on AMD as NOP?

微笑、不失礼 posted on 2020-08-07 17:44:07
Question: I have MASM synchronization code for an application that runs on both Intel and AMD x86 machines. I'd like to enhance it using the Intel TSX prefixes, specifically XACQUIRE and XRELEASE. If I modify my code correctly for Intel, what will happen when I attempt to run it on AMD machines? Intel says that these were designed to be backwards compatible, presumably meaning they do nothing on Intel CPUs without TSX. I know that AMD has not implemented TSX. But are these prefixes safe to run on AMD
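
Independently of how a given CPU treats the prefix bytes, code can check at run time whether the processor reports TSX support. The sketch below is a hedged illustration, assuming GCC or Clang and their `<cpuid.h>` header; `tsx_features` and `detect_tsx` are hypothetical names. It reads CPUID leaf 7, subleaf 0, where EBX bit 4 reports HLE (the feature behind XACQUIRE/XRELEASE) and EBX bit 11 reports RTM.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed toolchain) */
#endif

/* Hypothetical result type for illustration. */
typedef struct {
    bool hle;  /* CPUID.(EAX=7,ECX=0):EBX bit 4  */
    bool rtm;  /* CPUID.(EAX=7,ECX=0):EBX bit 11 */
} tsx_features;

/* Query CPUID for the TSX feature bits. On non-x86 targets, or when
 * leaf 7 is unavailable, both flags are reported as false. */
tsx_features detect_tsx(void) {
    tsx_features f = { false, false };
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_max(0, 0) >= 7) {
        __cpuid_count(7, 0, eax, ebx, ecx, edx);
        f.hle = (ebx >> 4) & 1u;
        f.rtm = (ebx >> 11) & 1u;
        (void)eax; (void)ecx; (void)edx;  /* unused outputs */
    }
#endif
    return f;
}
```

A caller could use `detect_tsx().hle` to choose between a prefixed and an unprefixed lock path at startup. This does not answer whether the prefixes are harmless on AMD; it only reports what the CPU advertises.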

X86: What does `movsxd rdx,edx` instruction mean?

烂漫一生 posted on 2020-08-07 09:23:48
Question: I have been playing with Intel MPX and found that it adds certain instructions that I could not understand. For example (in Intel syntax): movsxd rdx,edx. I found this, which talks about a similar instruction, MOVSX. From that question, my interpretation of this instruction is that it takes a double byte (that's why there is a d in movsxd), copies it into the rdx register (in the two least significant bytes), and fills the rest with the sign of that double byte. Is my interpretation correct (I
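
As a point of reference for the excerpt above: in 64-bit mode, `movsxd` with a 64-bit destination sign-extends a 32-bit doubleword source, so the `d` refers to a 4-byte doubleword. A minimal C sketch of the same operation follows; the function names are hypothetical, and the zero-extending variant is included only for contrast.

```c
#include <stdint.h>

/* Widen a signed 32-bit value to 64 bits. The upper 32 bits of the
 * result are copies of bit 31 of the input, which is the effect of
 * movsxd rdx, edx on the register contents. */
int64_t sign_extend_32_to_64(int32_t value) {
    return (int64_t)value;
}

/* Contrast: zero extension, where the upper 32 bits are cleared. */
uint64_t zero_extend_32_to_64(uint32_t value) {
    return (uint64_t)value;
}
```

For example, the 32-bit pattern 0x80000000 becomes 0xFFFFFFFF80000000 under sign extension and 0x0000000080000000 under zero extension.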

How to write a custom bootloader for mac systems?

六眼飞鱼酱① posted on 2020-08-06 05:18:58
Question: I wrote a little bootloader in assembly that uses BIOS interrupts, and it works great on my PC. My question is whether there is any way to make it work on Mac / Apple systems. I know that Apple doesn't use BIOS in that sense and that they lock a lot of things down. However, it is possible to use a live Ubuntu stick on a Mac, so might it be possible to run an assembly program from startup on a Mac? If yes, do you know any starting points or references to what has been done before? Thanks a

Will C++ first gets converted to assembly [duplicate]

吃可爱长大的小学妹 posted on 2020-08-05 10:18:10
Question: This question already has answers here: Does the C++ code compile to assembly codes? (3 answers) Does a compiler always produce an assembly code? (4 answers) Closed 7 years ago. I am confused. I am a C++ developer and have heard many times that my source code first gets converted to assembly, and then the assembly gets converted to machine code. But in a video tutorial on assembly language, the instructor clearly said that C/C++ code gets converted directly to machine code. (Of course there

Do FP and integer division compete for the same throughput resources on x86 CPUs?

那年仲夏 posted on 2020-08-04 05:43:21
Question: We know that Intel CPUs do integer division and FP div / sqrt on a not-fully-pipelined divide execution unit on port 0. We know this from IACA output, other published material, and experimental testing (e.g. https://agner.org/optimize/). But are there independent dividers for FP and integer (competing only for dispatch via port 0), or does interleaving two div-throughput-bound workloads make their cost add nearly linearly, if one is integer and the other is FP? This is complicated by Intel CPUs