I've been told and have read from Intel's manuals that it is possible to write instructions to memory, but the instruction prefetch queue has already fetched the stale ins
Sandybridge-family (at least Skylake) still has the same behaviour, apparently snooping on physical address.
Your test is somewhat overcomplicated, though. I don't see the point of the far jump, and if you assemble (and link if necessary) the SMC function into a flat binary you can just open + mmap it twice. Make a1 and a2 function pointers, then main can return a1(a2) after mapping.
Here's a simple test harness, in case anyone wants to try on their own machine: (The open/assert/mmap block was copied from the question, thanks for the starting point.)
(Downside: you have to rebuild the SMC flat binary every time, because mapping it with MAP_SHARED actually modifies it. IDK how to get two mappings of the same physical page that won't modify the underlying file; writing to a MAP_PRIVATE mapping would COW it to a different physical page. So writing the machine code to a file and then mapping it makes sense, now that I realize this. But my asm is still a lot simpler.)
// smc-stale.c
#include <sys/mman.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
typedef int (*intfunc_t)(void *); // __attribute__((sysv_abi)) // in case you're on Windows.
int main() {
int fd = open("smc-func", O_RDWR);
assert(fd>=0);
intfunc_t a1 = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_FILE | MAP_SHARED, fd, 0);
intfunc_t a2 = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_FILE | MAP_SHARED, fd, 0);
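assert((void *)a1 != MAP_FAILED && (void *)a2 != MAP_FAILED); // mmap reports failure as MAP_FAILED, not NULL
assert(a1 != a2);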
return a1(a2);
}
NASM source for the test function:
(See "How to generate plain binaries like nasm -f bin with the GNU GAS assembler?" for an as+ld alternative to nasm -f bin)
;; build with: nasm smc-func.asm   (-f bin, flat binary, is the default output format)
bits 64
entry: ; rdi = another mapping of the same page that's executing
mov byte [rdi+dummy-entry], 0xcc ; trigger any copy-on-write page fault now
mov r8, rbx ; CPUID steps on call-preserved RBX
cpuid ; serialize for good measure
mov rbx, r8
; mfence
; lfence
mov dword [rdi + retmov+1 - entry], 0 ; return 0 for snooping
retmov:
mov eax, 1 ; opcode + imm32 ; return 1 for stale
ret
dummy: dd 0xcccccccc
On an i7-6700k running Linux 4.20.3-arch1-1-ARCH, we do not observe stale code fetch. The mov that overwrote the immediate 1 with a 0 did modify that instruction before it ran.
peter@volta:~/src/experiments$ gcc -Og -g smc-stale.c
peter@volta:~/src/experiments$ nasm smc-func.asm && ./a.out; echo $?
0
# remember to rebuild smc-func every time, because MAP_SHARED modifies it
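One possible way around the rebuild-every-time downside, sketched here as an assumption rather than something tested in this answer: copy the machine code into an unlinked scratch file and map that copy twice with MAP_SHARED. Both mappings should still alias the same page-cache page, but stores only dirty the scratch copy, not smc-func itself. The helper name map_scratch_twice and the /tmp template are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, not part of the original harness.
 * Copies len bytes of code into an unlinked temporary file and maps that
 * file twice with MAP_SHARED, so both mappings share one physical page and
 * writes never reach the original smc-func binary.
 * Returns 0 on success, -1 on failure. */
int map_scratch_twice(const void *code, size_t len, int prot,
                      void **m1, void **m2)
{
    const size_t maplen = 0x1000;            /* same length the harness maps */
    char path[] = "/tmp/smc-scratch-XXXXXX"; /* assumed scratch location */

    if (len > maplen)
        return -1;
    int fd = mkstemp(path);                  /* opened read/write */
    if (fd < 0)
        return -1;
    unlink(path);   /* data stays reachable through fd and the mappings */

    if (ftruncate(fd, (off_t)maplen) != 0 ||
        write(fd, code, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    void *p1 = mmap(NULL, maplen, prot, MAP_SHARED, fd, 0);
    void *p2 = mmap(NULL, maplen, prot, MAP_SHARED, fd, 0);
    close(fd);      /* existing mappings keep the file contents alive */
    if (p1 == MAP_FAILED || p2 == MAP_FAILED) {
        if (p1 != MAP_FAILED) munmap(p1, maplen);
        if (p2 != MAP_FAILED) munmap(p2, maplen);
        return -1;
    }
    *m1 = p1;
    *m2 = p2;
    return 0;
}
```

To use this in the harness you would read smc-func into a buffer, pass PROT_READ | PROT_WRITE | PROT_EXEC as prot, and call the first mapping with the second as its argument. That assumes the scratch filesystem is not mounted noexec; if it is, the mmap calls should fail and the helper returns -1.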