Is it possible to detect the CPU architecture from machine code?

為{幸葍}努か submitted on 2019-12-01 04:24:30

Question


Let's say that there are 2 possible architectures, ARM and x86. Is there a way to detect what system the code is running on, to achieve something like this from assembly/machine code?

if (isArm)
    jmp to arm machine code
if (isX86)
    jmp to x86 machine code

I know that ARM machine code differs significantly from x86 machine code. What I'm thinking about is a well-crafted sequence of assembly instructions that would assemble to the same binary machine code on both.


Answer 1:


Assuming you have already taken care of all other differences [1] and all that is left is to write a small polyglot trampoline, you can use these opcodes:

EB 02 00 EA

When placed at address 0, these bytes decode on ARM (little-endian, non-Thumb) as:

00000000: b 0xbb4
00000004: ...

But on x86 (real mode), the same bytes decode as:

0000:0000 jmp 04h
0000:0002 add dl, ch
0000:0004 ...

You can then put more elaborate x86 code at address 04h and ARM code at address 0bb4h.

Of course, when relocating the base address, make sure to relocate the jump targets too.
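
To check the two decodings by hand, here is a minimal Python sketch. It is not a disassembler: it only assumes the four bytes are read as a little-endian ARM (A32) `B` instruction on one side and as an x86 short `jmp` (opcode EBh) on the other, and it computes the branch target for each. The function names are made up for this illustration.

```python
import struct

# The four polyglot bytes from the answer above.
POLYGLOT = bytes.fromhex("EB0200EA")


def arm_branch_target(code, address=0):
    """Decode a little-endian A32 'B' instruction and return its target."""
    (word,) = struct.unpack("<I", code[:4])
    # Bits 27..24 equal to 1010 identify B (branch without link).
    if (word >> 24) & 0x0F != 0x0A:
        raise ValueError("not an ARM B instruction")
    imm24 = word & 0x00FFFFFF
    if imm24 & 0x800000:          # sign-extend the 24-bit offset
        imm24 -= 1 << 24
    # In ARM state the PC reads as the instruction address + 8,
    # and the offset is counted in 4-byte words.
    return (address + 8 + imm24 * 4) & 0xFFFFFFFF


def x86_short_jmp_target(code, address=0):
    """Decode an x86 'jmp short rel8' (opcode EB) and return its target."""
    if code[0] != 0xEB:
        raise ValueError("not an x86 short jmp")
    (rel8,) = struct.unpack("<b", code[1:2])
    # The displacement is relative to the end of the 2-byte instruction;
    # the result wraps at 16 bits, as in real mode.
    return (address + 2 + rel8) & 0xFFFF


print("ARM target:", hex(arm_branch_target(POLYGLOT)))
print("x86 target:", hex(x86_short_jmp_target(POLYGLOT)))
```

Both branches are PC-relative, so passing a different `address` shifts both targets by the same amount.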


[1] For example, ARM starts executing at address 0 while x86 starts at address 0fffffff0h, so you need specific hardware/firmware support to abstract the boot address.




Answer 2:


http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0363g/Beijdcef.html

https://electronics.stackexchange.com/a/232934

How to setup ARM interrupt vector table branches in C or inline assembly?

http://osnet.cs.nchu.edu.tw/powpoint/Embedded94_1/Chapter%207%20ARM%20Exceptions.pdf

ARM Undefined Instruction error

ARM assembly is not my area of expertise, but I have programmed a lot in x86 assembly. I remember having this same question as homework back in college. The solution I found was interrupt 06h (http://webpages.charter.net/danrollins/techhelp/0103.HTM , https://es.wikipedia.org/wiki/Llamada_de_interrupci%C3%B3n_del_BIOS#Tabla_de_interrupciones). This interrupt is fired every time the microprocessor tries to execute an unknown instruction ("invalid opcode").

The 8086 gets stuck when it encounters an invalid opcode: the IP (instruction pointer) returns to the same invalid instruction and tries to re-execute it, and this loop hangs the program.

Starting with the 80286, interrupt 06h is fired instead, so the programmer can handle invalid opcodes.

Interrupt 06h helps to detect the CPU architecture: simply try to execute an x64 opcode. If interrupt 06h is fired, the CPU did not recognize the opcode, so it is x86; otherwise it is x64.

This technique can also be used to detect the type of microprocessor:

  • Try to execute an 80286 instruction; if interrupt 06h is not fired, the CPU is at least an 80286.
  • Try to execute an 80386 instruction; if interrupt 06h is not fired, the CPU is at least an 80386.
  • And so on...
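
The probing ladder above can be sketched as a toy simulation in Python. Nothing here executes real machine code: the `InvalidOpcode` exception stands in for interrupt 06h, the simulated CPU is just a number (286, 386, 486), and the probe table is an assumption chosen for illustration (SMSW was introduced with the 80286, MOVZX with the 80386, BSWAP with the 80486). A real implementation would install an interrupt 06h handler and execute the probe instructions in assembly.

```python
class InvalidOpcode(Exception):
    """Stands in for interrupt 06h in this simulation."""


# Hypothetical probe table: one instruction per generation, in ascending order.
PROBES = [("smsw", 286), ("movzx", 386), ("bswap", 486)]


def execute(cpu, mnemonic):
    """Pretend to run `mnemonic` on a simulated CPU; trap if it is too new."""
    introduced = dict(PROBES)[mnemonic]
    if introduced > cpu:
        raise InvalidOpcode(mnemonic)


def detect_generation(cpu):
    """Return the newest generation whose probe ran without trapping."""
    detected = None  # stays None if even the first probe traps
    for mnemonic, generation in PROBES:
        try:
            execute(cpu, mnemonic)
        except InvalidOpcode:
            break  # first trap: the CPU is older than this generation
        detected = generation
    return detected


for simulated in (286, 386, 486):
    print(simulated, "detected as at least", detect_generation(simulated))
```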

http://mtech.dk/thomsen/program/ioe.php

https://software.intel.com/en-us/articles/introduction-to-x64-assembly




Answer 3:


It's not possible in assembly or machine code, because the machine code itself depends on the architecture. Your if statement must first be compiled into either ARM or x86 code. If it is compiled as ARM, it cannot run on x86 without an emulator, and if it is compiled as x86, it cannot run on ARM without an emulator.

If you do run the code in an emulator, then the code is effectively running on a virtual version of the CPU it was compiled for. Depending on the emulator, you may or may not be able to detect that you are running under emulation. And even if the emulator lets your code detect it, you may not be able to detect the underlying CPU and/or OS (for example, you may not be able to tell whether the x86 emulator is running on x86 or on ARM).

Now, if you are very lucky, you may find two CPU architectures where the conditional branch or goto instruction of one architecture either does something useful or does nothing on the other architecture, and vice versa. In that case you can construct a binary executable that runs on two different CPU architectures.


How multi-architecture binaries work in real life

In real life, a multi architecture binary is actually two complete programs with shared resources (icons, images etc.) and the program binary format includes a header or preamble to tell the OS what CPUs are supported and where to find the main() function for each CPU.

One of the best historical examples I can think of is Mac OS. The Mac changed CPUs twice: first from 68k to PowerPC, then from PowerPC to x86. At each stage Apple had to come up with a file format that contained the binary executables for two CPU architectures.
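
As an illustration of such a header, here is a minimal Python sketch that parses the Mach-O universal ("fat") layout used for the PowerPC to x86 transition: a big-endian magic number 0xCAFEBABE, a slice count, then one 20-byte record per architecture (CPU type, CPU subtype, file offset, size, alignment). The sample bytes are synthetic and built in the code, and the function and table names are made up for this example.

```python
import struct

FAT_MAGIC = 0xCAFEBABE

# A few Mach-O CPU type constants; anything else is reported numerically.
CPU_NAMES = {
    7: "x86",
    0x01000007: "x86_64",
    12: "arm",
    0x0100000C: "arm64",
    18: "ppc",
}


def parse_fat_header(data):
    """Return a list of (cpu_name, offset, size) for each slice in the header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal (fat) header")
    slices = []
    for i in range(count):
        # Each record: cputype, cpusubtype, offset, size, align (big-endian).
        cputype, _subtype, offset, size, _align = struct.unpack_from(
            ">5I", data, 8 + 20 * i
        )
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices


# Synthetic header describing a PowerPC slice and an x86 slice.
sample = struct.pack(">II", FAT_MAGIC, 2)
sample += struct.pack(">5I", 18, 0, 4096, 100, 12)
sample += struct.pack(">5I", 7, 3, 8192, 200, 12)

for name, offset, size in parse_fat_header(sample):
    print(f"{name}: {size} bytes at offset {offset}")
```

The loader picks the slice matching the running CPU and ignores the rest, so neither program ever has to detect the architecture itself.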


Note on real-world executables

Real-life programs are almost never raw binary executables. The binary code is almost always wrapped in a container format that also holds metadata and resources. Windows, for example, uses the PE format and Linux uses ELF. Some OSes support more than one type of executable container (though the actual binary machine code can be the same). For example, Linux traditionally supports ELF, COFF and ECOFF.
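
Both ELF and PE record the target CPU in their headers, which is how a loader knows the architecture without running any code. The Python sketch below reads that field: in ELF it is the 16-bit e_machine value at offset 18, and in PE it is the 16-bit Machine value right after the "PE\0\0" signature, whose position is stored at offset 3Ch. Only a handful of machine codes are listed, the samples are synthetic headers built in the code, and the function name `identify` is an assumption of this example.

```python
import struct

ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}
PE_MACHINES = {0x014C: "x86", 0x8664: "x86-64", 0x01C0: "ARM", 0xAA64: "AArch64"}


def identify(data):
    """Return (container, machine) for the start of an ELF or PE file."""
    if data[:4] == b"\x7fELF":
        # e_ident[5] gives the byte order: 1 = little-endian, 2 = big-endian.
        endian = "<" if data[5] == 1 else ">"
        (machine,) = struct.unpack_from(endian + "H", data, 18)
        return "ELF", ELF_MACHINES.get(machine, hex(machine))
    if data[:2] == b"MZ":
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\0\0":
            (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
            return "PE", PE_MACHINES.get(machine, hex(machine))
    raise ValueError("unrecognised container")


# Synthetic 32-bit little-endian ELF header start: e_type = 2, e_machine = 40.
elf_sample = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 40)

# Synthetic PE stub: "MZ", PE signature at offset 40h, Machine = 8664h.
pe_sample = bytearray(0x48)
pe_sample[0:2] = b"MZ"
struct.pack_into("<I", pe_sample, 0x3C, 0x40)
pe_sample[0x40:0x44] = b"PE\0\0"
struct.pack_into("<H", pe_sample, 0x44, 0x8664)

print(identify(elf_sample))
print(identify(bytes(pe_sample)))
```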



Source: https://stackoverflow.com/questions/38055792/is-it-possible-to-detect-the-cpu-architecture-from-machine-code
