cpu

Why can't a load bypass a value written by another thread on the same core from a write buffer?

不羁岁月 submitted on 2019-12-24 10:06:43
Question: If a CPU core uses a write buffer, a load can take its value from the most recent store to the same location directly out of the write buffer (bypassing), without waiting for that store to appear in the cache. But, as written in A Primer on Memory Consistency and Coherence, if the CPU honors the TSO memory model, then ... multithreading introduces a subtle write buffer issue for TSO. TSO write buffers are logically private to each thread context (virtual core). Thus, on a multithreaded core, one thread context should
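To make the "logically private" point concrete, here is a minimal software model, not taken from the original post or from the book. Every name in it (`write_buffer`, `wb_store`, `wb_load`, `wb_drain`, the buffer depth and memory size) is a hypothetical chosen for illustration. A load first searches its own context's buffer, youngest entry first, and only then reads shared memory; it never looks into another context's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WB_CAPACITY 8   /* assumed buffer depth, for illustration only */
#define MEM_WORDS   16  /* assumed tiny shared memory, for illustration only */

struct wb_entry {
    size_t   addr;
    uint32_t value;
};

/* One FIFO write buffer per thread context (hypothetical model). */
struct write_buffer {
    struct wb_entry entries[WB_CAPACITY];
    size_t count;
};

/* Buffer a store; returns 0 on success, -1 if the buffer is full. */
int wb_store(struct write_buffer *wb, size_t addr, uint32_t value)
{
    assert(wb != NULL && addr < MEM_WORDS);
    if (wb->count == WB_CAPACITY)
        return -1;
    wb->entries[wb->count].addr = addr;
    wb->entries[wb->count].value = value;
    wb->count++;
    return 0;
}

/* Load: the youngest matching store in this context's OWN buffer wins
 * (bypass); otherwise the value comes from shared memory. */
uint32_t wb_load(const struct write_buffer *wb, const uint32_t *memory,
                 size_t addr)
{
    assert(wb != NULL && memory != NULL && addr < MEM_WORDS);
    for (size_t i = wb->count; i > 0; i--) {
        if (wb->entries[i - 1].addr == addr)
            return wb->entries[i - 1].value;
    }
    return memory[addr];
}

/* Drain all buffered stores to shared memory in FIFO order. */
void wb_drain(struct write_buffer *wb, uint32_t *memory)
{
    assert(wb != NULL && memory != NULL);
    for (size_t i = 0; i < wb->count; i++)
        memory[wb->entries[i].addr] = wb->entries[i].value;
    wb->count = 0;
}
```

In this model, a store buffered by one context is visible to that context's own loads immediately, but another context (with its own `write_buffer`) sees it only after `wb_drain` has written it to the shared `memory` array.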

Am I extracting these fields correctly using bitwise shift? (tag, index, offset)

纵饮孤独 submitted on 2019-12-24 09:33:37
Question: I am building a CPU cache emulator in C. I was hoping you could tell me if I am extracting these fields correctly. The 32-bit address should be broken up as follows:

    +---------------------------------------------------+
    | tag (20 bits) | index (10 bits) | offset (2 bits) |
    +---------------------------------------------------+

Here is my code to obtain the values for each:

    void extract_fields(unsigned int address){
        unsigned int tag, index, offset;
        // Extract tag
        tag = address >> 12;
        // Extract
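For the 20/10/2 layout given in the question, one way to extract all three fields is sketched below. This is not the asker's code (their excerpt is cut off); the names `cache_fields` and `split_address` and the two width macros are hypothetical. It uses `uint32_t` rather than `unsigned int` so the 32-bit width is explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field widths taken from the layout in the question. */
#define OFFSET_BITS 2u
#define INDEX_BITS  10u

/* Hypothetical container for the three fields. */
struct cache_fields {
    uint32_t tag;
    uint32_t index;
    uint32_t offset;
};

/* Split a 32-bit address into tag / index / offset. */
void split_address(uint32_t address, struct cache_fields *out)
{
    assert(out != NULL);
    /* Low 2 bits: byte offset within the block. */
    out->offset = address & ((1u << OFFSET_BITS) - 1u);
    /* Next 10 bits: shift the offset away, then mask. */
    out->index = (address >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1u);
    /* Remaining high 20 bits: no mask needed on an unsigned 32-bit value. */
    out->tag = address >> (OFFSET_BITS + INDEX_BITS);
}
```

The tag needs only a shift by 12 (as in the question's `address >> 12`), because a right shift of an unsigned value fills with zeros. The index and offset each need a mask to discard the bits above them.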

CPU usage shooting up when I am running a for-in loop: OS-x app

坚强是说给别人听的谎言 submitted on 2019-12-24 08:12:56
Question: When my loop runs for about 15k iterations, too much CPU is used: it is always in the 90-100% range. Inside the loop I do a series of Core Data entity updates/creations. Why does that happen? I am running my process in the background and I have set the thread priority to the minimum value too!
Answer 1: Paul R is correct - the system will achieve the tasks in the loop as quickly as possible given the resources it has. In this case, it is manipulating Core Data objects in the

Performance of multi-threading exceeding cores

六眼飞鱼酱① submitted on 2019-12-24 05:48:21
Question: If I have a process that starts X threads, will there ever be a performance gain from having X higher than the number of CPU cores (assuming all the threads are working synchronously, without async calls to storage/network)? E.g., if I have a two-core CPU, will I just slow down the application by starting 3+ constantly working threads?
Answer 1: It really depends on what your code does; the question is too broad. Having more threads than cores might speed up the program for example if some of the threads

Linux kernel idle loop

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-24 02:53:10
Question: Inside the Linux kernel idle loop, for quite a few architectures (SH, ARM, x86, etc., as far as I know), are the following lines: if(cpuidle_idle_call()) pm_idle(); My question: at least for ARM, the default pm_idle function consists of a WFI (Wait For Interrupt) instruction, but the confusing part is that interrupts are disabled at that point and are only enabled after the WFI instruction executes. How does a CPU get back online from WFI when interrupts are disabled? I tried searching for my answers in various versions of

What kind of address instruction does the x86 cpu have?

送分小仙女□ submitted on 2019-12-24 00:54:37
Question: I learned about one-address, two-address, and three-address instructions, but now I'd like to know: what kind of address instructions does x86 use?
Answer 1: x86 is a register machine, where at most 1 operand for any instruction can be an explicit memory address instead of a register, using an addressing mode like [rdi + rax*4]. (There are instructions that can have 2 memory operands with one or both being implicit, though: What x86 instructions take two (or more) memory operands?) Typical x86
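An addressing mode such as [rdi + rax*4] names a base register, an index register, and a scale factor; x86-64 also allows an optional sign-extended displacement, and the scale must be 1, 2, 4, or 8. The small C model below shows the arithmetic only. The function name `effective_address` and its parameter names are hypothetical, chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: effective address = base + index*scale + disp.
 * For [rdi + rax*4]: base = rdi, index = rax, scale = 4, disp = 0. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int32_t disp)
{
    /* x86 encodes the scale in 2 bits, so only these values exist. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* The displacement is sign-extended to 64 bits; the sum wraps
     * modulo 2^64, as unsigned arithmetic does in C. */
    return base + index * (uint64_t)scale + (uint64_t)(int64_t)disp;
}
```

With a scale of 4, this is the same address computation as indexing an array of 4-byte elements: `effective_address(arr, i, 4, 0)` corresponds to the address of element `i`.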

What is the right command line to get cpu usage for certain process in java

拥有回忆 submitted on 2019-12-23 18:23:17
Question: Given a process ID, what is the right command to get the current CPU usage of that process, in Java? The command typeperf "\Memory\Available bytes" "\processor(_total)\% processor time" is not for a specific process, and third-party utilities such as ProcDump are not an option. Thanks for any pointers!
Answer 1: Try http://support.hyperic.com/display/SIGAR/Home Otherwise, look at How can a Java program get its own process ID?
Source: https://stackoverflow.com/questions/7359245/what-is-the-right-command

slurm: use a control node also for computing

杀马特。学长 韩版系。学妹 submitted on 2019-12-23 17:22:39
Question: I have set up a small cluster (9 nodes) for computing in our lab. Currently I am using one node as the slurm controller, i.e. it is not being used for computing. I would like to use it for computing too, but I do not want to allocate all of its CPUs: I would like to keep 2 CPUs free for scheduling and other master-node-related tasks. Is it possible to write something like this in slurm.conf:

    NodeName=master NodeHostname=master CPUs=10 RealMemory=192000 TmpDisk=200000 State=UNKNOWN
    NodeName=node0[1-8]

How does branch prediction interact with the instruction pointer

荒凉一梦 submitted on 2019-12-23 16:30:27
Question: It's my understanding that at the beginning of a processor's pipeline, the instruction pointer (which points to the address of the next instruction to execute) is updated by the branch predictor after fetching, so that this new address can then be fetched on the next cycle. However, if the instruction pointer is modified early on in the pipeline, wouldn't this affect instructions currently in the execute phase that might rely on the old instruction pointer value? For instance, when doing a