Is it possible to create threads without system calls in Linux x86 GAS assembly?

轮回少年 2020-12-07 12:30

Whilst learning the "assembler language" (on Linux, on an x86 architecture, using the GNU as assembler), one of the aha moments was the possibility of using system calls. The

7 answers
  •  感动是毒
    2020-12-07 12:46

    "Doctor, doctor, it hurts when I do this". Doctor: "Don't do that".

    The short answer is that you can do multithreaded programming without calling expensive OS task-management primitives: simply ignore the OS for thread scheduling operations. This means you have to write your own thread scheduler and never pass control back to the OS. (And you somehow have to be cleverer about your thread overhead than the pretty smart OS guys.) We chose this approach precisely because Windows process/thread/fiber calls were all too expensive to support computation grains of a few hundred instructions.
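
    To make "write your own thread scheduler" concrete, here is a minimal sketch in C of a cooperative round-robin scheduler built on the POSIX ucontext calls. It is not how PARLANSE does it; the names (task_spawn, task_yield, scheduler_run, run_demo) and the stack size are made up for illustration. One caveat: as far as I know, glibc's swapcontext also saves and restores the signal mask, which costs a system call per switch, so a scheduler that truly avoids the kernel would replace it with a few hand-written assembly instructions.

```c
#include <assert.h>
#include <ucontext.h>

#define MAX_TASKS   4
#define STACK_BYTES (64 * 1024)   /* arbitrary size chosen for this sketch */

/* One user-level thread: saved context, private stack, finished flag. */
typedef struct {
    ucontext_t ctx;
    char stack[STACK_BYTES];
    int done;
} task_t;

static task_t tasks[MAX_TASKS];
static ucontext_t scheduler_ctx;  /* where the scheduler loop is parked */
static int task_count = 0;
static int current = -1;          /* index of the task now running */

static char trace[64];            /* records which task ran, in order */
static int trace_len = 0;

/* Hand the CPU back to the scheduler loop. */
static void task_yield(void) {
    swapcontext(&tasks[current].ctx, &scheduler_ctx);
}

/* Hypothetical task body: log its letter twice, yielding after each step. */
static void worker(int id) {
    for (int step = 0; step < 2; step++) {
        trace[trace_len++] = (char)('A' + id);
        task_yield();
    }
    tasks[current].done = 1;
    /* Returning resumes uc_link, i.e. the scheduler loop. */
}

static void task_spawn(void (*fn)(int), int arg) {
    assert(task_count < MAX_TASKS);
    task_t *t = &tasks[task_count];
    getcontext(&t->ctx);
    t->ctx.uc_stack.ss_sp = t->stack;
    t->ctx.uc_stack.ss_size = sizeof t->stack;
    t->ctx.uc_link = &scheduler_ctx;
    t->done = 0;
    makecontext(&t->ctx, (void (*)(void))fn, 1, arg);
    task_count++;
}

/* Round-robin over unfinished tasks until all of them have returned. */
static void scheduler_run(void) {
    int remaining = task_count;
    while (remaining > 0) {
        for (int i = 0; i < task_count; i++) {
            if (tasks[i].done)
                continue;
            current = i;
            swapcontext(&scheduler_ctx, &tasks[i].ctx);
            if (tasks[i].done)
                remaining--;
        }
    }
}

/* Spawn three workers, run them to completion, return the execution trace. */
const char *run_demo(void) {
    task_count = 0;
    trace_len = 0;
    task_spawn(worker, 0);
    task_spawn(worker, 1);
    task_spawn(worker, 2);
    scheduler_run();
    trace[trace_len] = '\0';
    return trace;
}
```

    Calling run_demo() returns the string "ABCABC": each task runs one step, yields, and gets its next turn only after the others have had theirs. The OS still sees a single thread throughout.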

    Our PARLANSE programming language is a parallel programming language; see http://www.semdesigns.com/Products/Parlanse/index.html

    PARLANSE runs under Windows, offers parallel "grains" as the abstract parallelism construct, and schedules such grains by a combination of a highly tuned hand-written scheduler and scheduling code generated by the PARLANSE compiler, which takes the context of the grain into account to minimize scheduling overhead. For instance, the compiler ensures that the registers of a grain contain no information at the point where scheduling (e.g., "wait") might be required, and thus the scheduler code only has to save the PC and SP. In fact, quite often the scheduler code doesn't get control at all; a forked grain simply stores the forking PC and SP, switches to a compiler-preallocated stack, and jumps to the grain code. Completion of the grain will restart the forker.
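
    The "fork without the scheduler" handoff can be approximated in portable C as follows. This is only a sketch under stated assumptions: it uses POSIX ucontext, which saves the whole register set (and on glibc the signal mask too) rather than just the PC and SP that the PARLANSE compiler arranges for, and the names grain_body, fork_and_wait, and GRAIN_STACK_BYTES are hypothetical.

```c
#include <assert.h>
#include <ucontext.h>

#define GRAIN_STACK_BYTES (64 * 1024)   /* arbitrary size for this sketch */

static ucontext_t forker_ctx;   /* saved state of the forking code */
static ucontext_t grain_ctx;    /* state of the forked grain */
static char grain_stack[GRAIN_STACK_BYTES];  /* stands in for the preallocated stack */
static int result;

/* Hypothetical grain: sum the integers 1..n. */
static void grain_body(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum += i;
    result = sum;
    /* Returning ends the grain; uc_link then resumes the forker. */
}

/* Fork a grain and continue once it has completed. No scheduler loop is
 * involved: control goes straight from the forker to the grain and back. */
int fork_and_wait(int n) {
    assert(n >= 0);
    getcontext(&grain_ctx);
    grain_ctx.uc_stack.ss_sp = grain_stack;
    grain_ctx.uc_stack.ss_size = sizeof grain_stack;
    grain_ctx.uc_link = &forker_ctx;          /* completion restarts the forker */
    makecontext(&grain_ctx, (void (*)(void))grain_body, 1, n);
    swapcontext(&forker_ctx, &grain_ctx);     /* save forker, jump to grain */
    return result;
}
```

    For example, fork_and_wait(10) returns 55. The point of the sketch is the control flow: save the forker's state, switch stacks, run the grain, and resume the forker when the grain returns.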

    Normally there's an interlock to synchronize grains, implemented by the compiler using native LOCK DEC instructions that implement what amounts to counting semaphores. Applications can fork logically millions of grains; the scheduler stops parent grains from generating more work once the work queues are long enough that more work wouldn't help. The scheduler implements work-stealing to allow work-starved CPUs to grab ready grains from neighboring CPU work queues. This has been implemented to handle up to 32 CPUs, but we're a bit worried that the x86 vendors may actually swamp us with more than that in the next few years!
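
    The LOCK DEC interlock amounts to a shared countdown: every finishing child decrements it atomically, and whichever one brings it to zero knows it was last and can restart the parent. Below is a rough C11 equivalent, assuming <stdatomic.h> is available; join_counter, join_child_done, and count_wakeups are illustrative names, not PARLANSE's. On x86, compilers typically lower atomic_fetch_sub to a LOCK-prefixed instruction. The demo drives the counter from a single thread just to show the logic; the atomic decrement is what would keep it correct with several CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Countdown shared by a parent grain and the children it forked. */
typedef struct {
    atomic_int pending;   /* children that have not finished yet */
} join_counter;

static void join_init(join_counter *j, int children) {
    assert(children > 0);
    atomic_init(&j->pending, children);
}

/* Called by each child when it completes. Returns true for exactly one
 * caller, the last child to finish, which should then resume the parent. */
static bool join_child_done(join_counter *j) {
    /* atomic_fetch_sub returns the value held before the decrement. */
    return atomic_fetch_sub(&j->pending, 1) == 1;
}

/* Simulate `children` completions; count how often the parent is woken. */
int count_wakeups(int children) {
    join_counter j;
    join_init(&j, children);
    int wakeups = 0;
    for (int i = 0; i < children; i++) {
        if (join_child_done(&j))
            wakeups++;
    }
    return wakeups;
}
```

    However many children there are, count_wakeups reports exactly one wakeup, which is the property the interlock has to guarantee.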

    PARLANSE is a mature language; we've been using it since 1997, and have implemented a several-million-line parallel application in it.
