multicore

numpy on multicore hardware

点点圈 submitted on 2019-12-18 03:11:58
Question: What's the state of the art with regard to getting numpy to use multiple cores (on Intel hardware) for things like inner and outer vector products, vector-matrix multiplications, etc.? I am happy to rebuild numpy if necessary, but at this point I am looking for ways to speed things up without changing my code. For reference, my show_config() is as follows, and I have never observed numpy to use more than one core: atlas_threads_info: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library

Tomcat SOLR multiple cores setup

末鹿安然 submitted on 2019-12-17 23:44:41
Question: I have spent all morning trying to set up multiple cores on a SOLR installation that runs under an Apache Tomcat server, without success. My solr.xml looks like this: <solr persistent="false" sharedLib="lib"> <cores adminPath="/admin/cores"> <core name="core0" instanceDir="/multicore/core0"> <property name="dataDir" value="/multicore/core0/data" /> </core> <core name="core1" instanceDir="/multicore/core1"> <property name="dataDir" value="/multicore/core1/data" /> </core> </cores> </solr> What is

How can I write a lock-free structure?

给你一囗甜甜゛ submitted on 2019-12-17 23:22:11
Question: In my multithreaded application I see heavy lock contention, preventing good scalability across multiple cores. I have decided to use lock-free programming to solve this. How can I write a lock-free structure?

Answer 1: The short answer is: you cannot. The long answer is: if you are asking this question, you probably do not know enough to be able to create a lock-free structure. Creating lock-free structures is extremely hard, and only experts in this field can do it. Instead of writing your own

How to ensure Java threads run on different cores

别来无恙 submitted on 2019-12-17 22:51:30
Question: I am writing a multi-threaded application in Java in order to improve performance over the sequential version. It is a parallel version of the dynamic programming solution to the 0/1 knapsack problem. I have an Intel Core 2 Duo with both Ubuntu and Windows 7 Professional on different partitions; I am running in Ubuntu. My problem is that the parallel version actually takes longer than the sequential version. I am thinking this may be because the threads are all being mapped to the same kernel

Parallel map in Haskell

爷,独闯天下 submitted on 2019-12-17 22:13:49
Question: Is there some substitute for map which evaluates the list in parallel? I don't need it to be lazy. Something like: pmap :: (a -> b) -> [a] -> [b] letting me write pmap expensive_function big_list and have all my cores at 100%.

Answer 1: Yes, see the parallel package: ls `using` parList rdeepseq will evaluate each element of the list in parallel via the rdeepseq strategy. Note that using parListChunk with a good chunk size might give better performance if your elements are too cheap to get a benefit

How do SMP cores, processes, and threads work together exactly?

ⅰ亾dé卋堺 submitted on 2019-12-17 17:24:45
Question: On a single-core CPU, each process runs in the OS, and the CPU jumps around from one process to another to best utilize itself. A process can have many threads, in which case the CPU runs through these threads while it is running the respective process. Now, on a multi-core CPU: do the cores all work within one process at a time, or can different cores run different processes at the same point in time? For instance, you have program A running two threads. Can a dual-core CPU run

How to control which core a process runs on?

可紊 submitted on 2019-12-17 15:15:37
Question: I can understand how one can write a program that uses multiple processes or threads: fork() a new process and use IPC, or create multiple threads and use those sorts of communication mechanisms. I also understand context switching. That is, with only one CPU, the operating system schedules time for each process (and there are tons of scheduling algorithms out there), and thereby we achieve running multiple processes simultaneously. And now that we have multi-core processors (or multi
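
One standard way to do this on Linux is the CPU-affinity API. Below is a minimal sketch of my own (not taken from the question or its answers), assuming Linux with glibc: it restricts the calling process to core 0 via sched_setaffinity(2).

```c
/* Minimal sketch (my own illustration): pin the calling process to CPU 0. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);          /* start with an empty CPU mask */
    CPU_SET(0, &set);        /* allow only CPU 0             */

    /* pid 0 means "the calling process" */
    if (sched_setaffinity(0, sizeof(set), &set) == -1) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("now restricted to CPU 0\n");
    return 0;
}
```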

How can I get the CPU core number from within a user-space app (Linux, C)?

China☆狼群 submitted on 2019-12-17 09:40:18
Question: Presumably there is a library or simple asm blob that can get me the number of the current CPU that I am executing on.

Answer 1: Use sched_getcpu to determine the CPU on which the calling thread is running. See man getcpu (the system call) and man sched_getcpu (a library wrapper). However, note what it says: the information placed in cpu is only guaranteed to be current at the time of the call; unless the CPU affinity has been fixed using sched_setaffinity(2), the kernel might change the CPU at
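
As a usage illustration (my own sketch, not part of the answer), assuming Linux with glibc and _GNU_SOURCE defined:

```c
/* Minimal sketch: report which CPU the calling thread is running on. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    int cpu = sched_getcpu();
    if (cpu == -1) {
        perror("sched_getcpu");
        return 1;
    }
    /* The value may already be stale by the time it is printed,
     * unless affinity was pinned with sched_setaffinity(2). */
    printf("running on CPU %d\n", cpu);
    return 0;
}
```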

Is volatile bool for thread control considered wrong?

大城市里の小女人 submitted on 2019-12-17 09:20:32
Question: As a result of my answer to this question, I started reading about the keyword volatile and what the consensus regarding it is. I see there is a lot of information about it, some of it old and now apparently wrong, and a lot of it new, saying it has almost no place in multi-threaded programming. Hence, I'd like to clarify a specific usage (I couldn't find an exact answer here on SO). I also want to point out that I do understand the requirements for writing multi-threaded code in general and why volatile is not
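
For context, the usage in question is typically a boolean stop flag polled by a worker thread. The sketch below is my own illustration (not from the question), assuming C11 atomics and POSIX threads; it shows the commonly recommended alternative, an atomic flag rather than a volatile bool.

```c
/* Minimal sketch (my own, compile with -pthread): an atomic stop flag
 * gives well-defined cross-thread visibility, unlike volatile bool. */
#include <stdatomic.h>
#include <stdbool.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static atomic_bool stop = false;

static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop)) {
        /* ... do work ... */
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    sleep(1);
    atomic_store(&stop, true);   /* request shutdown */
    pthread_join(t, NULL);
    puts("worker stopped");
    return 0;
}
```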

x86 LOCK question on multi-core CPUs

情到浓时终转凉″ submitted on 2019-12-17 07:07:36
Question: Is it true that the x86 assembly "LOCK" instruction prefix causes all cores to freeze while the instruction following "LOCK" is being executed? I read this in a blog post and it doesn't make sense. I can't find anything that indicates whether this is true or not.

Answer 1: It's about locking the memory bus for that address. The Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1 tells us in 7.1.4, Effects of a LOCK Operation on Internal Processor Caches: For
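
To make the discussion concrete, here is a small sketch of my own (not from the question or the answer), assuming GCC or Clang on x86-64: an atomic increment that the compiler lowers to a LOCK-prefixed instruction, which is where the prefix usually appears in practice. On modern processors such a locked operation is normally satisfied within the cache line rather than by freezing every core.

```c
/* My own sketch (assumes GCC/Clang on x86-64): the atomic increment
 * below compiles to a LOCK-prefixed add, e.g. "lock addl $1, counter(%rip)". */
#include <stdio.h>

static int counter = 0;

int main(void)
{
    __atomic_fetch_add(&counter, 1, __ATOMIC_SEQ_CST);
    printf("counter = %d\n", counter);
    return 0;
}
```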