Why doesn't GCC use partial registers?

[愿得一人] 2020-11-22 01:01

Disassembling write(1,"hi",3) on Linux, built with gcc -s -nostdlib -nostartfiles -O3, results in:

ba03000000     mov edx, 3 ; tha         


        
3 Answers
  •  不思量自难忘°
    2020-11-22 01:28

    On something like the original IBM PC, if AH was known to contain 0 and AX had to be loaded with a value like 0x34, using "MOV AL,34h" (two bytes) would generally take 8 cycles rather than the 12 required for "MOV AX,0034h" (three bytes), a pretty big speed improvement. Either instruction could execute in 2 cycles if pre-fetched, but in practice the 8088 spends most of its time waiting for instructions to be fetched, at a cost of four cycles per byte. On the processors used in today's general-purpose computers, however, the time required to fetch code is generally not a significant factor in overall execution speed, and code size is normally not a particular concern.
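The 8-versus-12 figure can be reproduced with a few lines of arithmetic. The sketch below is only a rough model: it takes the answer's figure of four cycles per fetched byte at face value and ignores execution time and prefetch-queue effects. The names `CYCLES_PER_BYTE`, `ENCODINGS`, and `fetch_cycles` are made up for illustration; the byte sequences are the standard 16-bit x86 encodings of the two instructions.

```python
# Fetch-bound cost model for the 8088, using the answer's figure of
# four cycles per instruction byte fetched. Illustrative only.
CYCLES_PER_BYTE = 4

# Standard 16-bit x86 encodings of the two instructions being compared.
ENCODINGS = {
    "mov al, 34h":   bytes([0xB0, 0x34]),        # opcode B0 + 8-bit immediate
    "mov ax, 0034h": bytes([0xB8, 0x34, 0x00]),  # opcode B8 + 16-bit immediate
}

def fetch_cycles(encoding: bytes) -> int:
    """Cycles spent fetching one instruction when nothing is pre-fetched."""
    return len(encoding) * CYCLES_PER_BYTE

for text, enc in ENCODINGS.items():
    print(f"{text:<14} {enc.hex()}  {len(enc)} bytes  {fetch_cycles(enc)} cycles")
```

Under this model the one-byte difference in encoding length accounts for the whole gap, which is why shorter encodings mattered so much on a fetch-bound machine.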

    Further, processor vendors try to maximize the performance of the kinds of code people are likely to run, and 8-bit load instructions are likely to be used far less often nowadays than 32-bit load instructions. Processor cores often include logic to execute multiple 32-bit or 64-bit instructions simultaneously, but may not include logic to execute an 8-bit operation simultaneously with anything else. Consequently, while using 8-bit operations wherever possible was a useful optimization on the 8088, it can actually be a significant performance drain on newer processors.
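One commonly cited reason an 8-bit write is harder to run in parallel is architectural, and it goes slightly beyond what the answer above states: writing DL leaves the remaining bits of the register untouched, so the result depends on the register's previous contents, whereas in 64-bit mode writing EDX zero-extends into RDX and ignores the old value. The sketch below models only that register semantics, assuming x86-64; the function names are hypothetical, and it says nothing about how any particular core actually schedules these instructions.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def write_dl(old_rdx: int, imm8: int) -> int:
    """Model of `mov dl, imm8`: only bits 0-7 change, the rest of RDX is kept."""
    return (old_rdx & ~0xFF & MASK64) | (imm8 & 0xFF)

def write_edx(old_rdx: int, imm32: int) -> int:
    """Model of `mov edx, imm32` in 64-bit mode: zero-extended, old value ignored."""
    return imm32 & 0xFFFFFFFF

stale = 0x1122334455667788       # arbitrary leftover register contents
print(hex(write_dl(stale, 3)))   # 0x1122334455667703 (still carries old bits)
print(hex(write_edx(stale, 3)))  # 0x3 (independent of the old value)
```

In the model, `write_dl` cannot produce its result without `old_rdx`, while `write_edx` never reads it. That is consistent with the compiler preferring `mov edx, 3` for the length argument even though the 8-bit form has a shorter encoding.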
