Why are complicated memcpy/memset implementations superior?

名媛妹妹 2020-12-05 07:18

When debugging, I frequently stepped into the handwritten assembly implementation of memcpy and memset. These are usually implemented using streaming instructions if availab

6 answers
  •  既然无缘
    2020-12-05 07:40

    Once upon a time, rep movsb was the optimal solution.

    The original IBM PC had an 8088 processor with an 8-bit data bus and no caches. Back then, the fastest program was generally the one with the fewest instruction bytes, because every instruction byte had to be fetched over that narrow bus. Having special instructions helped.

    Nowadays, the fastest program is the one that can use as many CPU features as possible in parallel. Strange as it might seem at first, code built from many simple instructions can actually run faster than a single do-it-all instruction.
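    To make the "many simple instructions" idea concrete, here is a rough sketch in C rather than assembly. The names copy_bytes and copy_wide are made up for illustration and are not taken from any real memcpy. The wide version issues four independent 8-byte loads and stores per iteration, which is the kind of work a modern core may be able to overlap. Whether it actually wins depends on the CPU, the compiler, and the copy size, so treat it as an illustration, not a benchmark.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: one byte per iteration, the simplest possible copy. */
static void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: 32 bytes per iteration as four independent
 * 8-byte loads followed by four 8-byte stores. The fixed-size memcpy
 * calls are the portable way to express unaligned 8-byte accesses;
 * compilers typically turn each one into a single move instruction
 * (an assumption about the compiler, not a guarantee).
 * Like memcpy, this assumes the buffers do not overlap. */
static void copy_wide(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        uint64_t a, b, c, d;
        memcpy(&a, src + i,      8);
        memcpy(&b, src + i + 8,  8);
        memcpy(&c, src + i + 16, 8);
        memcpy(&d, src + i + 24, 8);
        memcpy(dst + i,      &a, 8);
        memcpy(dst + i + 8,  &b, 8);
        memcpy(dst + i + 16, &c, 8);
        memcpy(dst + i + 24, &d, 8);
    }
    /* Copy the remaining 0..31 bytes one at a time. */
    for (; i < n; i++)
        dst[i] = src[i];
}
```

    Both functions produce the same result for non-overlapping buffers. Production implementations go further in the same direction, with vector registers, alignment handling, and separate paths for small and large sizes, which is where the complexity you see in the debugger comes from.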

    Intel and AMD keep the old instructions around mainly for backward compatibility.
