Say I have the de facto standard x86 CPU with three levels of cache: L1 and L2 private to each core, and L3 shared among cores. Is there a way to allocate shared memory whose data will not be cac
Intel has recently announced a new instruction, CLDEMOTE, that seems relevant to this question. It hints to the hardware that a cache line should be moved from the caches closest to the core to a level farther away (probably from L1 or L2 to L3, although the spec isn't precise on the details). In the spec's words: "This may accelerate subsequent accesses to the line by other cores ...."
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
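As a rough illustration of how CLDEMOTE might be used from C, here is a minimal sketch. It assumes a compiler that exposes the `_cldemote` intrinsic through `<immintrin.h>` and accepts the `cldemote` target attribute (GCC 9 or later, or a recent Clang), and it assumes a 64-byte cache line. The helper name `publish_and_demote` and the macros are made up for this example. The instruction is only a hint, and on x86 CPUs that do not implement it, it is documented to execute as a NOP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>   /* provides _cldemote() on compilers that support it */
#define DEMOTE_TARGET __attribute__((target("cldemote")))
#define DEMOTE_LINE(p) _cldemote(p)
#else
#define DEMOTE_TARGET
#define DEMOTE_LINE(p) ((void)(p))   /* non-x86 build: do nothing */
#endif

#define CACHE_LINE 64   /* assumed line size; check your CPU */

/* Hypothetical helper: copy data into a shared buffer, then hint that every
 * cache line touched by the copy be demoted toward a more distant cache. */
DEMOTE_TARGET
static void publish_and_demote(unsigned char *dst, const unsigned char *src,
                               size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);

    /* Start at the line containing dst so unaligned buffers are covered. */
    uintptr_t line = (uintptr_t)dst & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end  = (uintptr_t)dst + len;
    for (; line < end; line += CACHE_LINE)
        DEMOTE_LINE((void *)line);
}
```

A producer thread could call `publish_and_demote(shared_buf, local_buf, n)` after filling a buffer that another core will read. Note that this does not keep the data out of the private caches; it only asks the hardware to push the lines outward after the write, and the hardware is free to ignore the request.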