hyperthreading

assign two MPI processes per core

有些话、适合烂在心里 posted on 2019-11-27 09:08:41
Question: How do I assign 2 MPI processes per core? For example, if I run mpirun -np 4 ./application, it should use 2 physical cores to run 4 MPI processes (2 processes per core). I am using Open MPI 1.6. I tried mpirun -np 4 -nc 2 ./application, but it would not run. It complains: mpirun was unable to launch the specified application as it could not find an executable: Answer (Hristo Iliev): orterun (the Open MPI SPMD/MPMD launcher; mpirun/mpiexec are just symlinks to it) has some support for process binding, but it is not flexible enough to allow you to bind two processes per core. You can try with -bycore

Programmatically detect number of physical processors/cores or if hyper-threading is active on Windows, Mac and Linux

假装没事ソ posted on 2019-11-26 15:20:51
I have a multithreaded C++ application that runs on Windows, Mac and a few Linux flavours. To make a long story short: in order for it to run at maximum efficiency, I have to be able to instantiate a single thread per physical processor/core. Creating more threads than there are physical processors/cores degrades the performance of my program considerably. I can already correctly detect the number of logical processors/cores on all three of these platforms. To be able to detect the number of physical processors/cores correctly, I'll have to detect if hyper-threading is supported AND

assign two MPI processes per core

自古美人都是妖i posted on 2019-11-26 14:31:48
Question: How do I assign 2 MPI processes per core? For example, if I run mpirun -np 4 ./application, it should use 2 physical cores to run 4 MPI processes (2 processes per core). I am using Open MPI 1.6. I tried mpirun -np 4 -nc 2 ./application, but it would not run. It complains: mpirun was unable to launch the specified application as it could not find an executable: Answer 1: orterun (the Open MPI SPMD/MPMD launcher; mpirun/mpiexec are just symlinks to it) has some support for process binding but it

What will be used for data exchange between threads are executing on one Core with HT?

£可爱£侵袭症+ posted on 2019-11-26 03:59:06
Question: Hyper-Threading Technology is a form of simultaneous multithreading technology introduced by Intel. These resources include the execution engine, caches, and system bus interface; the sharing of resources allows two logical processors to work with each other more efficiently, and allows a stalled logical processor to borrow resources from the other one. In an Intel CPU with Hyper-Threading, one CPU core (with several ALUs) can execute instructions from 2 threads in the same clock cycle. And both 2

What are the latency and throughput costs of producer-consumer sharing of a memory location between hyper-siblings versus non-hyper siblings?

我的梦境 posted on 2019-11-26 03:28:41
Question: Two different threads within a single process can share a common memory location by reading and/or writing to it. Usually, such (intentional) sharing is implemented with atomic operations that use the lock prefix on x86, which has fairly well-known costs both for the lock prefix itself (i.e., the uncontended cost) and for the additional coherence traffic when the cache line is actually shared (true or false sharing). Here I'm interested in producer-consumer costs where a single thread P writes to a
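One way to get a rough feel for such costs is a ping-pong exchange between a producer and a consumer thread over atomic variables. The sketch below is an assumption about how that could be set up, not the setup from the question: the names run_ping_pong and PingPongResult are hypothetical, the consumer side is inferred from the title, and thread pinning to hyper-siblings versus separate cores (which is platform-specific) is left out.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical result type for the sketch.
struct PingPongResult {
    std::uint64_t rounds_completed;
    double avg_round_trip_ns;
};

// Producer P publishes a sequence number in `ping`; consumer C waits for it
// and echoes it back in `pong`. Each round is one P -> C -> P round trip.
inline PingPongResult run_ping_pong(std::uint64_t rounds) {
    // Separate cache lines (64 bytes assumed) so the two flags do not
    // falsely share a line.
    alignas(64) std::atomic<std::uint64_t> ping{0};
    alignas(64) std::atomic<std::uint64_t> pong{0};

    std::thread consumer([&] {
        for (std::uint64_t i = 1; i <= rounds; ++i) {
            // yield() keeps the sketch well-behaved on machines with few
            // cores; a real measurement would busy-spin instead.
            while (ping.load(std::memory_order_acquire) != i) {
                std::this_thread::yield();
            }
            pong.store(i, std::memory_order_release);
        }
    });

    const auto start = std::chrono::steady_clock::now();
    std::uint64_t completed = 0;
    for (std::uint64_t i = 1; i <= rounds; ++i) {
        ping.store(i, std::memory_order_release);
        while (pong.load(std::memory_order_acquire) != i) {
            std::this_thread::yield();
        }
        ++completed;
    }
    const auto stop = std::chrono::steady_clock::now();
    consumer.join();

    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return {completed, rounds ? total_ns / static_cast<double>(rounds) : 0.0};
}
```

Calling run_ping_pong(1000000) and comparing avg_round_trip_ns with the two threads pinned to hyper-siblings versus to different physical cores would give a first-order latency comparison; throughput would need a different, non-blocking producer loop.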