mov & jmp to & jmp back vs call & ret

Posted by 二次信任 on 2019-12-13 00:03:18

Question


I was going over some Assembly code and I saw this:

        mov r12, _read_loopr
        jmp _bzero
    _read_loopr:
    ...
    _bzero:
        inc r8
        mov byte [r8+r15], 0x0
        cmp r8, 0xff
        jle _bzero
        jmp r12

I was wondering whether there is any particular advantage to doing this (mov the address of _read_loopr into a register, jmp to the function, and then jmp back through the register) rather than the usual call _bzero and ret?


Answer 1:


This just looks like braindead code, especially if the return-address label is always right after the jmp _bzero like you say in your comment.

Maybe the author thought that they couldn't use call "because function calls clobber registers". This is what you have to assume, based on the calling convention, if you're calling a function that isn't part of the same codebase. But you can call/ret to functions with custom calling conventions.
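
As a rough sketch of that last point, here is the question's loop reached with call/ret instead. The buffer, the starting value of r8, the _start entry point, and the Linux exit syscall are all assumptions added to make it a complete program; the "calling convention" is just a comment documenting what _bzero expects and what it modifies.

```nasm
; Sketch only. Assumes NASM syntax and x86-64 Linux (for the exit syscall).
; Private calling convention for _bzero (an assumption, not a standard ABI):
;   in:       r15 = buffer base, r8 = index just before the first byte to zero
;   clobbers: r8 and flags only
default rel

section .bss
buf:    resb 0x101              ; the loop stores to indices 0..0x100

section .text
global _start

_start:
    lea  r15, [buf]
    mov  r8, -1                 ; so the first store hits buf[0]
    call _bzero                 ; pushes the return address
_read_loopr:                    ; ret lands here
    mov  eax, 60                ; Linux exit(0)
    xor  edi, edi
    syscall

_bzero:
    inc  r8
    mov  byte [r8+r15], 0x0
    cmp  r8, 0xff
    jle  _bzero
    ret                         ; pops the address that call pushed
```

Because the caller knows exactly which registers _bzero touches, nothing has to be saved around the call, and r12 stays free.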

Of course, for code this small, it should have been inlined (i.e. make it a macro, not a function).
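
A minimal sketch of the macro approach, assuming NASM's multi-line macro syntax. BZERO_INLINE is a hypothetical name, and the buffer setup and Linux exit syscall are again assumptions added only so the example assembles and runs on its own.

```nasm
; Sketch only. Assumes NASM syntax and x86-64 Linux (for the exit syscall).
; BZERO_INLINE is a hypothetical macro name, not from the original code.
default rel

%macro BZERO_INLINE 0
%%next:                         ; %% gives each expansion its own label
    inc  r8
    mov  byte [r8+r15], 0x0
    cmp  r8, 0xff
    jle  %%next
%endmacro

section .bss
buf:    resb 0x101              ; the loop stores to indices 0..0x100

section .text
global _start

_start:
    lea  r15, [buf]
    mov  r8, -1                 ; so the first store hits buf[0]
    BZERO_INLINE                ; loop is pasted here: no call, ret, or jmp r12
    mov  eax, 60                ; Linux exit(0)
    xor  edi, edi
    syscall
```

Each use of the macro costs a few bytes of code size, but there is no control transfer at all besides the loop branch itself.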

More importantly, something smarter than storing one byte at a time is normally possible, and is probably worth a potential branch mispredict if there are more than a few bytes to zero. If at least 8 (or better, 16) bytes always need to be zeroed, you can do it with wide stores. Make the final store write the last byte of the buffer to be zeroed, potentially overlapping with the previous store. (This is much better than ending with branches to decide whether to do a final 4B store, 2B store, and 1B store.) See the x86 tag wiki for resources about writing efficient asm.
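
The overlapping-final-store idea could look something like the following with 16-byte SSE2 stores. This is a sketch under stated assumptions: zero_wide, its register interface, the 100-byte test buffer, and the Linux exit syscall are all made up for illustration, and the length must be at least 16.

```nasm
; Sketch only. Assumes NASM syntax, SSE2, and x86-64 Linux (for the exit syscall).
; zero_wide is a hypothetical helper with a private convention:
;   in:       rdi = buffer, rsi = length in bytes, must be >= 16
;   clobbers: rdi, rdx, xmm0, flags
default rel

section .data
buf:    times 100 db 0xff       ; non-zero so the stores are observable

section .text
global _start

_start:
    lea    rdi, [buf]
    mov    esi, 100
    call   zero_wide
    movzx  edi, byte [buf+99]   ; exit status = last byte of the buffer
    mov    eax, 60              ; Linux exit
    syscall

zero_wide:
    pxor   xmm0, xmm0
    lea    rdx, [rdi+rsi-16]    ; address of the final 16-byte store
.loop:
    movdqu [rdi], xmm0          ; unaligned 16-byte store of zeros
    add    rdi, 16
    cmp    rdi, rdx
    jb     .loop                ; keep going while a full store ends before rdx+16
    movdqu [rdx], xmm0          ; ends exactly at the last byte, may overlap
    ret
```

The only branch is the loop branch; the tail is handled by one unconditional store that ends on the last byte, so there is no chain of 8/4/2/1-byte cleanup cases.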


If the return address was somewhere other than right after the jmp _bzero, then the worst possible thing would probably be push _read_loopr / jmp _bzero, and ret in _bzero. That would break the return-address predictor stack, leading to a mispredict on the next ~15 rets up the call tree.

Best would be to inline the loop and put a direct jmp after it.

I'm not sure how passing an address for _bzero to jmp to would compare with a call/ret and then a jmp after the call.

call/ret are fairly cheap, but not single-uop instructions on Intel. A jmp _bzero / jmp _read_loopr would be better if there was only one caller.
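
For the single-caller case, a sketch of the direct jmp / jmp version might look like this. As before, the buffer, the starting r8, _start, and the Linux exit syscall are assumptions added for completeness; only the two jumps and the loop correspond to the code in the question.

```nasm
; Sketch only. Assumes NASM syntax and x86-64 Linux (for the exit syscall).
; Works only when _bzero has exactly one caller, since the way back is hard-coded.
default rel

section .bss
buf:    resb 0x101              ; the loop stores to indices 0..0x100

section .text
global _start

_start:
    lea  r15, [buf]
    mov  r8, -1                 ; so the first store hits buf[0]
    jmp  _bzero                 ; direct jump, no return address saved anywhere
_read_loopr:
    mov  eax, 60                ; Linux exit(0)
    xor  edi, edi
    syscall

_bzero:
    inc  r8
    mov  byte [r8+r15], 0x0
    cmp  r8, 0xff
    jle  _bzero
    jmp  _read_loopr            ; direct jump back, target fixed at assemble time
```

Both jumps are direct, so there is no indirect-branch prediction involved and r12 is not tied up holding a return address.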



Source: https://stackoverflow.com/questions/38542382/mov-jmp-to-jmp-back-vs-call-ret
