shared-memory

Efficiency of Multithreaded Loops

北城余情 submitted on 2019-11-29 07:10:50
Greetings noble community, I want to have the following loop:

    for (i = 0; i < MAX; i++)
        A[i] = B[i] + C[i];

run in parallel on a shared-memory quad-core computer using threads. The two alternatives below are being considered for the code to be executed by these threads, where tid is the id of the thread: 0, 1, 2 or 3 (for simplicity, assume MAX is a multiple of 4).

Option 1:

    for (i = tid; i < MAX; i += 4)
        A[i] = B[i] + C[i];

Option 2:

    for (i = tid*(MAX/4); i < (tid+1)*(MAX/4); i++)
        A[i] = B[i] + C[i];

My question is whether one is more efficient than the other, and why. …
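A minimal sketch of the two partitionings with std::thread (MAX, A, B, C are the question's names; a thread count of 4 is assumed). Option 2's contiguous chunks are usually the better bet: each thread streams through its own region of A, B and C, so no cache line is written by two threads, whereas Option 1 interleaves four threads on neighboring elements and invites false sharing on the writes to A.

    #include <thread>
    #include <vector>

    const int MAX = 1 << 20;            // assume a multiple of 4
    const int NTHREADS = 4;
    static float A[MAX], B[MAX], C[MAX];

    void option1(int tid) {             // interleaved: adjacent elements go to different threads
        for (int i = tid; i < MAX; i += NTHREADS)
            A[i] = B[i] + C[i];
    }

    void option2(int tid) {             // blocked: each thread owns one contiguous chunk
        int chunk = MAX / NTHREADS;
        for (int i = tid * chunk; i < (tid + 1) * chunk; i++)
            A[i] = B[i] + C[i];
    }

    int main() {
        std::vector<std::thread> ts;
        for (int t = 0; t < NTHREADS; t++)
            ts.emplace_back(option2, t);   // swap in option1 to compare timings
        for (auto &t : ts) t.join();
    }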

Does madvise(___, ___, MADV_DONTNEED) instruct the OS to lazily write to disk?

不羁岁月 submitted on 2019-11-29 07:02:47
Hypothetically, suppose I want to perform sequential writing to a potentially very large file. If I mmap() a gigantic region and madvise(MADV_SEQUENTIAL) on that entire region, then I can write to the memory in a relatively efficient manner. This I have gotten to work just fine. Now, in order to free up various OS resources as I am writing, I occasionally perform a munmap() on small chunks of memory that have already been written to. My concern is that munmap() and msync() will block my thread, waiting for the data to be physically committed to disk. I cannot slow down my writer at all, so I …
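For what it's worth, the explicit non-blocking way to request writeback is msync() with MS_ASYNC, which schedules the flush and returns without waiting for the disk (MS_SYNC is the blocking variant). A sketch of the chunked write/flush/unmap cycle, with a made-up file name and sizes and error checks omitted:

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main() {
        const size_t region = 1UL << 30;      // 1 GiB mapping (hypothetical)
        const size_t chunk  = 1UL << 20;      // flush/unmap in 1 MiB pieces

        int fd = open("big.out", O_RDWR | O_CREAT, 0644);
        ftruncate(fd, region);
        char *p = (char *)mmap(nullptr, region, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, 0);
        madvise(p, region, MADV_SEQUENTIAL);

        for (size_t off = 0; off < region; off += chunk) {
            // ... sequential writes into p[off .. off+chunk) ...
            msync(p + off, chunk, MS_ASYNC);  // request writeback, returns immediately
            munmap(p + off, chunk);           // release just this address range
        }
        close(fd);
    }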

Create a shared-memory vector of strings

霸气de小男生 submitted on 2019-11-29 04:37:30
I am trying to create a class managing a shared-memory vector of (std) strings.

    typedef boost::interprocess::allocator<std::string,
            boost::interprocess::managed_shared_memory::segment_manager> shmem_allocator;
    typedef boost::interprocess::vector<std::string, shmem_allocator> shmem_vector;

    shmem_mgr::shmem_mgr()
      : shmem_(create_only, SHMEM_KEY, SHMEM_SIZE),
        allocator_(shmem_.get_segment_manager())
    {
        mutex_     = shmem_.find_or_construct<interprocess_mutex>(SHMEM_MUTEX)();
        condition_ = shmem_.find_or_construct<interprocess_condition>(SHMEM_CONDITION)();
        // buffer_ is of type shmem_vector
        buffer_ = …
    }
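Note the likely trap here: a std::string keeps its characters on the process's private heap, so placing std::string objects in shared memory hands other processes dangling pointers. The usual fix is an interprocess string whose allocator also comes from the segment. A minimal sketch of that approach (segment name and size are made up):

    #include <boost/interprocess/managed_shared_memory.hpp>
    #include <boost/interprocess/containers/vector.hpp>
    #include <boost/interprocess/containers/string.hpp>
    #include <string>

    namespace bip = boost::interprocess;
    typedef bip::managed_shared_memory::segment_manager segment_manager_t;
    typedef bip::allocator<char, segment_manager_t> char_allocator;
    typedef bip::basic_string<char, std::char_traits<char>, char_allocator> shm_string;
    typedef bip::allocator<shm_string, segment_manager_t> string_allocator;
    typedef bip::vector<shm_string, string_allocator> shm_string_vector;

    int main() {
        bip::managed_shared_memory shm(bip::open_or_create, "demo_shm", 65536);
        char_allocator ca(shm.get_segment_manager());
        // the vector, its strings, and every string's characters all live in the segment
        shm_string_vector *v = shm.find_or_construct<shm_string_vector>("strings")
                                   (shm.get_segment_manager());
        v->push_back(shm_string("hello", ca));
    }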

R and shared memory for parallel::mclapply

杀马特。学长 韩版系。学妹 submitted on 2019-11-29 04:09:26
I am trying to take advantage of a quad-core machine by parallelizing a costly operation that is performed on a list of about 1000 items. I am currently using R's parallel::mclapply function:

    res = rbind.fill(parallel::mclapply(lst, fun, mc.cores=3, mc.preschedule=T))

This works. The problem is that any additional subprocess that is spawned has to allocate a large chunk of memory. Ideally, I would like each core to access shared memory from the parent R process, so that as I increase the number of cores used in mclapply, I don't hit RAM limitations before core limitations. I'm currently at a loss on …
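Background that helps frame this: mclapply spawns workers with fork(), and on Linux fork is copy-on-write, so children share the parent's physical pages until one side writes to them; read-only access to a big object should cost little extra RAM. A C++ analogue of that mechanism (buffer size made up):

    #include <cstdio>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <vector>

    int main() {
        std::vector<double> big(10000000, 1.0);  // ~80 MB allocated in the parent

        if (fork() == 0) {            // child: reads hit the parent's pages, no copy
            double sum = 0;
            for (double x : big) sum += x;
            std::printf("child sum = %.0f\n", sum);
            _exit(0);
        }
        wait(nullptr);
    }

The flip side is that any write (including things like R's garbage collector touching object headers) forces the touched pages to be copied, which is one way the per-worker memory growth in the question can arise.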

Mutex in shared memory when one user crashes?

余生颓废 submitted on 2019-11-29 02:29:43
Question: Suppose a process creates a mutex in shared memory, locks it, and dumps core while the mutex is locked. Now in another process, how do I detect that the mutex is already locked but not owned by any process?

Answer 1: If you're working in Linux or something similar, consider using named semaphores instead of (what I assume are) pthreads mutexes. I don't think there is a way to determine the locking PID of a pthreads mutex, short of building your own registration table and also putting it in …
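A different route on Linux, sketched below, is a robust process-shared mutex: initialize it with PTHREAD_MUTEX_ROBUST, and when the owner dies while holding it, the next pthread_mutex_lock() returns EOWNERDEAD instead of hanging, letting the survivor repair the protected state. (Shared-memory setup is omitted; m is assumed to live in a segment both processes map.)

    #include <errno.h>
    #include <pthread.h>

    void init_robust(pthread_mutex_t *m) {
        pthread_mutexattr_t a;
        pthread_mutexattr_init(&a);
        pthread_mutexattr_setpshared(&a, PTHREAD_PROCESS_SHARED);
        pthread_mutexattr_setrobust(&a, PTHREAD_MUTEX_ROBUST);
        pthread_mutex_init(m, &a);
    }

    int lock_recovering(pthread_mutex_t *m) {
        int rc = pthread_mutex_lock(m);
        if (rc == EOWNERDEAD) {          // previous owner crashed while holding the lock
            // ... repair the data the mutex protects here ...
            pthread_mutex_consistent(m); // mark the mutex usable again
            rc = 0;
        }
        return rc;
    }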

Dynamically create a list of shared arrays using python multiprocessing

谁说我不能喝 submitted on 2019-11-29 02:25:40
I'd like to share several numpy arrays between different child processes with python's multiprocessing module. I'd like the arrays to be separately lockable, and I'd like the number of arrays to be dynamically determined at runtime. Is this possible? In this answer, J.F. Sebastian lays out a nice way to use python's numpy arrays in shared memory while multiprocessing. The array is lockable, which is what I want. I would like to do something very similar, except with a variable number of shared arrays, the number being determined at runtime. His example code is very clear and does …
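Not the Python answer, but the same idea expressed against the underlying OS facility: create a runtime-chosen number of shared-memory segments, each carrying its own process-shared mutex next to its data, so every array is separately lockable. A C++ sketch with hypothetical names and sizes (link with -lrt -pthread):

    #include <fcntl.h>
    #include <pthread.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <string>
    #include <vector>

    struct Segment {                 // layout of one shared-memory block
        pthread_mutex_t lock;        // process-shared mutex guarding data[]
        double data[1024];
    };

    Segment *make_segment(const std::string &name) {
        int fd = shm_open(name.c_str(), O_CREAT | O_RDWR, 0600);
        ftruncate(fd, sizeof(Segment));
        Segment *s = (Segment *)mmap(nullptr, sizeof(Segment),
                                     PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        pthread_mutexattr_t a;
        pthread_mutexattr_init(&a);
        pthread_mutexattr_setpshared(&a, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&s->lock, &a);
        return s;
    }

    int main() {
        int n = 5;                               // count decided at runtime
        std::vector<Segment *> arrays;
        for (int i = 0; i < n; i++)
            arrays.push_back(make_segment("/arr" + std::to_string(i)));
        pthread_mutex_lock(&arrays[2]->lock);    // lock one array independently
        arrays[2]->data[0] = 3.14;
        pthread_mutex_unlock(&arrays[2]->lock);
    }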

Fully managed shared memory .NET implementations? [closed]

我的梦境 submitted on 2019-11-29 02:16:07
I'm looking for free, fully-managed implementations of shared memory for .NET (P/Invoke is acceptable, mixed C++/CLI is not).

Sounds like you are looking for Memory-Mapped Files, which are supported in the .NET 4.0 BCL. Starting with the .NET Framework version 4, you can use managed code to access memory-mapped files in the same way that native Windows functions access memory-mapped files, as described in Managing Memory-Mapped Files in Win32 in the MSDN Library.

Well, the .NET framework is free, recommended. .NET 4.0 supports the System.IO.MemoryMappedFiles namespace classes. Shared memory is …
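For orientation, System.IO.MemoryMappedFiles wraps the Win32 file-mapping API the answer alludes to. A sketch of that native facility (the mapping name is made up; any process opening the same name sees the same bytes):

    #include <windows.h>
    #include <cstring>

    int main() {
        // pagefile-backed named mapping, 4 KiB
        HANDLE h = CreateFileMappingA(INVALID_HANDLE_VALUE, nullptr,
                                      PAGE_READWRITE, 0, 4096,
                                      "Local\\MySharedMem");
        void *view = MapViewOfFile(h, FILE_MAP_ALL_ACCESS, 0, 0, 4096);
        std::strcpy((char *)view, "hello from process A");
        // another process: OpenFileMapping(FILE_MAP_ALL_ACCESS, FALSE, "Local\\MySharedMem")
        UnmapViewOfFile(view);
        CloseHandle(h);
    }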

What does it mean to configure MPI for shared memory?

我与影子孤独终老i submitted on 2019-11-29 01:31:02
Question: I have a bit of a research-related question. I have finished implementing a structural skeleton framework based on MPI (specifically using openmpi 6.3). The framework is supposed to be used on a single machine. Now, I am comparing it with other previous skeleton implementations (such as scandium, fast-flow, ...). One thing I have noticed is that the performance of my implementation is not as good as the other implementations. I think this is because my implementation is based on MPI …
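One relevant detail: whether ranks communicate over shared memory is decided at launch time, not in the source code. With Open MPI the shared-memory transport can be requested explicitly via the MCA system, e.g. mpirun --mca btl self,sm -np 4 ./app (the "sm" BTL in the 1.x series); on a single machine Open MPI normally picks it automatically. The program itself stays unchanged:

    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        // with all ranks on one node and the sm BTL selected,
        // these messages never touch the network stack
        std::printf("rank %d of %d\n", rank, size);
        MPI_Finalize();
    }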

Do different processes have separate copies of a shared library's static variable, or a common copy?

南笙酒味 submitted on 2019-11-29 00:10:09
I am trying to understand the fundamentals of the shared memory concept. I am trying to create a shared library with one function and one STATIC array variable, and I want to access the static array variable through that function. Here is my shared library:

    // foo.c
    #include <stdio.h>

    static int DATA[1024] = {1, 2, 3, ...., 1024};

    inline void foo(void)
    {
        int j, k = 0;
        for (j = 0; j < 1024; j++) {
            k = DATA[j];
        }
        k += 0;
    }

I have created the shared library object (libfoo.so) by following the instructions from shared library. Now my questions are: 1> If I access foo() from two different programs (program1 and …
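The short version of the usual answer: the library's read-only code pages are shared between processes, but each process gets its own copy-on-write copy of writable static data, so a write in one program is invisible to the other. A self-contained C++ sketch of that behavior using fork() (a single int stands in for the library's array):

    #include <cstdio>
    #include <sys/wait.h>
    #include <unistd.h>

    static int DATA = 1;   // stands in for the shared library's static data

    int main() {
        if (fork() == 0) {                                  // child process
            DATA = 42;                                      // write triggers a private page copy
            std::printf("child  sees DATA = %d\n", DATA);   // prints 42
            return 0;
        }
        wait(nullptr);
        std::printf("parent sees DATA = %d\n", DATA);       // still prints 1
        return 0;
    }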

Performance difference between IPC shared memory and threads memory

て烟熏妆下的殇ゞ submitted on 2019-11-28 18:22:56
I hear frequently that accessing a shared memory segment between processes has no performance penalty compared to accessing process memory between threads. In other words, a multi-threaded application will not be faster than a set of processes using shared memory (excluding locking or other synchronization issues). But I have my doubts:

1) shmat() maps the local process virtual memory to the shared segment. This translation has to be performed for each shared memory address and can represent a significant cost. In a multi-threaded application there is no extra translation required: all VM …
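For reference, the lifecycle under discussion, sketched below. The point that usually settles doubt 1): shmat() establishes the page-table entries once, and after that, loads and stores through the returned pointer go through exactly the same hardware MMU translation as any other memory access, so there is no per-access software translation step.

    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <cstdio>

    int main() {
        int id = shmget(IPC_PRIVATE, 1 << 20, IPC_CREAT | 0600); // 1 MiB segment
        char *p = (char *)shmat(id, nullptr, 0);  // mapping set up once, here
        p[0] = 'x';                               // plain store: ordinary page-table walk
        std::printf("%c\n", p[0]);
        shmdt(p);                                 // unmap from this process
        shmctl(id, IPC_RMID, nullptr);            // destroy the segment
    }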