Are there any modern CPUs where a cached byte store is actually slower than a word store?
It's a common claim that a byte store into cache may result in an internal read-modify-write cycle, or otherwise hurt throughput or latency vs. storing a full register. But I've never seen any examples. No x86 CPUs are like this, and I think all high-performance CPUs can directly modify any byte in a cache line, too. Are some microcontrollers or low-end CPUs different, if they have cache at all? (I'm not counting word-addressable machines, or Alpha, which is byte-addressable but lacks byte