multicore

Python multicore programming [duplicate]

烈酒焚心 Submitted on 2019-11-28 23:40:10
This question already has an answer here: Threading in Python [closed] (7 answers)

Please consider a class as follows:

    class Foo:
        def __init__(self, data):
            self.data = data

        def do_task(self):
            # do something with data

In my application I have a list containing several instances of the Foo class. The aim is to execute do_task for all Foo objects. A first implementation is simply:

    # execute tasks of all Foo objects instantiated
    for f_obj in my_foo_obj_list:
        f_obj.do_task()

I'd like to take advantage of the multi-core architecture by sharing the for loop across the 4 CPUs of my machine. What's the best way to do it?
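One way to spread that loop across processes is the standard-library multiprocessing module. A minimal sketch (the do_task body and the list contents here are stand-ins, not from the question):

    from multiprocessing import Pool

    class Foo:
        def __init__(self, data):
            self.data = data

        def do_task(self):
            # stand-in for the real CPU-bound work on self.data
            return sum(i * i for i in range(self.data))

    def run_task(f_obj):
        # module-level helper so the callable pickles cleanly to the workers
        return f_obj.do_task()

    if __name__ == "__main__":
        my_foo_obj_list = [Foo(200_000) for _ in range(20)]
        with Pool(processes=4) as pool:        # one worker per CPU
            results = pool.map(run_task, my_foo_obj_list)
        print(results[:3])

Note that each worker operates on its own copy of the Foo object, so results have to come back as return values; mutations made inside do_task are not visible in the parent process.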

High-level Compare And Swap (CAS) functions?

随声附和 Submitted on 2019-11-28 23:02:26
Question: I'd like to document what high-level (i.e. C++, not inline assembler) functions or macros are available for Compare And Swap (CAS) atomic primitives... E.g., Win32 on x86 has the _InterlockedCompareExchange family of functions in the <intrin.h> header.

Answer 1: I'll let others list the various platform-specific APIs, but for future reference, in C++09 (the draft standard that became C++11) you'll get the atomic_compare_exchange() operation in the new "Atomic operations library".

Answer 2: glib, a common system library on Linux and Unix
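In the standard that eventually shipped (C++11), the portable spelling is std::atomic<T>::compare_exchange_strong / compare_exchange_weak. A minimal sketch:

    #include <atomic>
    #include <cstdio>

    int main() {
        std::atomic<int> value{0};
        int expected = 0;
        // Atomically: if value == expected, store 42 and return true;
        // otherwise load the current value into 'expected' and return false.
        bool swapped = value.compare_exchange_strong(expected, 42);
        std::printf("swapped=%d value=%d\n", swapped, value.load());
        return 0;
    }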

Tomcat SOLR multiple cores setup

回眸只為那壹抹淺笑 Submitted on 2019-11-28 21:31:34
I have spent all morning trying to set up multiple cores on a Solr installation that runs under an Apache Tomcat server, without success. My solr.xml looks like this:

    <solr persistent="false" sharedLib="lib">
      <cores adminPath="/admin/cores">
        <core name="core0" instanceDir="/multicore/core0">
          <property name="dataDir" value="/multicore/core0/data" />
        </core>
        <core name="core1" instanceDir="/multicore/core1">
          <property name="dataDir" value="/multicore/core1/data" />
        </core>
      </cores>
    </solr>

What is the correct directory structure? Do I need to change something in solrconfig.xml? Check that
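For reference, a layout along these lines is what the legacy Solr multicore example ships with (assuming each instanceDir resolves under the Solr home and each core carries its own conf/ directory; treat the exact paths as an assumption to adapt):

    <solr home>/
        solr.xml
        multicore/
            core0/
                conf/
                    solrconfig.xml
                    schema.xml
                data/
            core1/
                conf/
                    solrconfig.xml
                    schema.xml
                data/

With the per-core dataDir set in solr.xml as above, solrconfig.xml itself should not normally need per-core edits, as long as each data directory exists and is writable by the Tomcat user.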

How to ensure Java threads run on different cores

谁说胖子不能爱 Submitted on 2019-11-28 20:46:55
I am writing a multi-threaded application in Java in order to improve performance over the sequential version. It is a parallel version of the dynamic programming solution to the 0/1 knapsack problem. I have an Intel Core 2 Duo with both Ubuntu and Windows 7 Professional on different partitions. I am running in Ubuntu. My problem is that the parallel version actually takes longer than the sequential version. I am thinking this may be because the threads are all being mapped to the same kernel thread or that they are being allocated to the same core. Is there a way I could ensure that each Java
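Standard Java does not expose core pinning; thread-to-core placement is left to the OS scheduler, which will normally spread runnable CPU-bound threads across cores on its own. A more common first check is that the work is split into independent, coarse-grained chunks and that the pool size matches the hardware. A hypothetical sketch (the loop body is a stand-in, not the knapsack code):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class KnapsackRunner {
        public static void main(String[] args) throws InterruptedException {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);
            for (int i = 0; i < cores; i++) {
                final int id = i;
                pool.submit(() -> {
                    // stand-in for one independent slice of the DP table
                    long acc = 0;
                    for (long j = 0; j < 50_000_000L; j++) acc += j ^ id;
                    System.out.println("worker " + id + " -> " + acc);
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }

If the parallel dynamic-programming version is still slower, the usual suspects are fine-grained synchronization between cell updates and per-thread work that is too small to amortize thread overhead, rather than thread placement.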

Memory Fences - Need help to understand

六月ゝ 毕业季﹏ Submitted on 2019-11-28 20:37:01
Question: I'm reading Memory Barriers by Paul E. McKenney http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.07.23a.pdf. Everything is explained in great detail, and just when everything seems clear I hit one sentence that undermines it all and makes me think I understood nothing. Let me show the example:

    void foo(void)
    {
        a = 1;  /* #1 */
        b = 1;  /* #2 */
    }

    void bar(void)
    {
        while (b == 0) continue;  /* #3 */
        assert(a == 1);           /* #4 */
    }

Let's say these two functions are running on different
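For comparison, here is the same example written with C++11 atomics, where the release/acquire pair plays the role of the barriers the paper inserts between #1/#2 and #3/#4 (a sketch, not the paper's kernel-style smp_mb() code):

    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<int> a{0};
    std::atomic<int> b{0};

    void foo() {
        a.store(1, std::memory_order_relaxed);           // #1
        b.store(1, std::memory_order_release);           // #2: release keeps #1 from moving past it
    }

    void bar() {
        while (b.load(std::memory_order_acquire) == 0)   // #3: acquire pairs with the release in foo()
            continue;
        assert(a.load(std::memory_order_relaxed) == 1);  // #4: cannot fire once #3 has observed b == 1
    }

    int main() {
        std::thread t1(foo), t2(bar);
        t1.join();
        t2.join();
        return 0;
    }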

Multicore + Hyperthreading - how are threads distributed?

蓝咒 Submitted on 2019-11-28 19:00:08
Question: I was reading a review of the new Intel Atom 330, where they noted that Task Manager shows 4 cores: two physical cores, plus two more simulated by Hyper-Threading. Suppose you have a program with two threads, and that these are the only threads doing any work on the PC; everything else is idle. What is the probability that the OS will put both threads on the same core? This has huge implications for program throughput. If the answer is anything other than 0%, are there any mitigation
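As one mitigation (an illustration, not from the review): both major OSes let you restrict a process to chosen logical CPUs, e.g. Task Manager's "Set affinity..." on Windows, or taskset on Linux. Which logical CPU numbers map to distinct physical cores varies by machine, so check lscpu or /proc/cpuinfo first.

    # Linux sketch: run the program on two logical CPUs that sit on
    # different physical cores (the 0,2 numbering is an assumption)
    taskset -c 0,2 ./my_two_thread_program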

Why does a single threaded process execute on several processors/cores?

这一生的挚爱 Submitted on 2019-11-28 18:57:09
Say I run a simple single-threaded process like the one below:

    public class SirCountALot {
        public static void main(String[] args) {
            int count = 0;
            while (true) {
                count++;
            }
        }
    }

(This is Java because that's what I'm familiar with, but I suspect it doesn't really matter.) I have an i7 processor (4 cores, or 8 counting hyperthreading), and I'm running Windows 7 64-bit, so I fired up Sysinternals Process Explorer to look at the CPU usage. As expected, I see it is using around 20% of all available CPU. But when I toggle the option to show one graph per CPU, I see that instead of 1 of the 4 "cores"

Do multi-core CPUs share the MMU and page tables?

情到浓时终转凉″ Submitted on 2019-11-28 18:50:52
On a single-core computer, one thread executes at a time. On each context switch the scheduler checks whether the thread being scheduled in belongs to the same process as the previous one. If so, nothing needs to be done regarding the MMU (page table). Otherwise, the MMU has to be updated to point at the new process's page table. I am wondering how things happen on a multi-core computer. I guess there is a dedicated MMU on each core, and if two threads of the same process are running simultaneously on two cores, each core's MMU simply refers to the same page table. Is this true? Can

Parallel map in Haskell

╄→尐↘猪︶ㄣ Submitted on 2019-11-28 18:38:01
Is there some substitute for map which evaluates the list in parallel? It doesn't need to be lazy. Something like:

    pmap :: (a -> b) -> [a] -> [b]

letting me write pmap expensive_function big_list and have all my cores at 100%.

Yes, see the parallel package:

    ls `using` parList rdeepseq

will evaluate each element of the list in parallel via the rdeepseq strategy. Note that using parListChunk with a good chunk value might give better performance if your elements are too cheap to benefit from being evaluated in parallel one at a time (because it saves on sparking for each element). EDIT: Based on your question I
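A self-contained version of that pmap, built on parMap from Control.Parallel.Strategies (the expensiveFunction body is just a stand-in); compile with -threaded and run with +RTS -N so the sparks can use every core:

    -- dependencies: parallel, deepseq
    import Control.DeepSeq (NFData)
    import Control.Parallel.Strategies (parMap, rdeepseq)

    -- Apply f to every element in parallel, forcing each result to normal form.
    pmap :: NFData b => (a -> b) -> [a] -> [b]
    pmap f = parMap rdeepseq f

    expensiveFunction :: Int -> Int
    expensiveFunction n = sum [i * i | i <- [1 .. n]]

    main :: IO ()
    main = print (sum (pmap expensiveFunction (replicate 8 2000000)))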

GPGPU vs. Multicore?

纵然是瞬间 Submitted on 2019-11-28 18:37:23
Question: What are the key practical differences between GPGPU and regular multicore/multithreaded CPU programming, from the programmer's perspective? Specifically: What types of problems are better suited to regular multicore and what types are better suited to GPGPU? What are the key differences in programming model? What are the key underlying hardware differences that necessitate any differences in programming model? Which one is typically easier to use and by how much? Is it practical, in the long