In what situation do you use a semaphore over a mutex in C++?

Question


Throughout the resources I've read about multithreading, the mutex is used and discussed far more often than the semaphore. My question is: when do you use a semaphore over a mutex? I don't see semaphores in Boost.Thread. Does that mean semaphores are no longer used much these days?

As far as I understand it, semaphores allow a resource to be shared by several threads. This is only possible if those threads are only reading the resource but not writing it. Is this correct?


Answer 1:


Boost.Thread has mutexes and condition variables. Purely in terms of functionality, semaphores are therefore redundant[*], although I don't know if that's why they're omitted.

Semaphores are a more basic primitive, simpler, and possibly implemented to be faster, but don't have priority-inversion avoidance. They're arguably harder to use than condition variables, because they require the client code to ensure that the number of posts "matches" the number of waits in some appropriate way. With condition variables it's easy to tolerate spurious posts, because nobody actually does anything without checking the condition.

Read vs. write resources is a red herring IMO; it has nothing to do with the difference between a mutex and a semaphore. If you use a counting semaphore, you could have a situation where multiple threads are concurrently accessing the same resource, in which case it would presumably have to be read-only access. In that situation, you might be able to use shared_mutex from Boost.Thread instead. But semaphores aren't "for" protecting resources the way mutexes are; they're "for" sending a signal from one thread to another. It just happens to be possible to use them to control access to a resource.

That doesn't mean that all uses of semaphores must relate to read-only resources. For example, you can use a binary semaphore to protect a read/write resource. Might not be a good idea, though, since a mutex often gives you better scheduling behaviour.

[*] Here's roughly how you implement a counting semaphore using a mutex and a condition variable. To implement a shared semaphore of course you need a shared mutex/condvar:

#include <mutex>
#include <condition_variable>

struct sem {
    std::mutex m;
    std::condition_variable cv;
    unsigned int count;

    explicit sem(unsigned int value) : count(value) {}

    void wait() {
        std::unique_lock<std::mutex> lock(m);
        // Sleep until a post makes the count positive.
        while (count == 0) {
            cv.wait(lock);
        }
        --count;
    }

    void post() {
        {
            std::lock_guard<std::mutex> lock(m);
            ++count;
        }
        // Wake every waiter; each one re-checks the count under the lock.
        cv.notify_all();
    }
};

Therefore, anything you can do with semaphores, you can do with mutexes and condition variables. Not necessarily by actually implementing a semaphore, though.




Answer 2:


The typical use case for a mutex (allowing only one thread access to a resource at any time) is far more common than the typical uses of a semaphore. But a semaphore is actually the more general concept: a mutex is (almost) a special case of a semaphore.

Typical applications would be: You don't want to create more than (e.g.) 5 database connections. No matter how many worker threads there are, they have to share these 5 connections. Or, if you run on an N-core machine, you might want to make sure that certain CPU/memory-intensive tasks don't run in more than N threads at the same time (because that would only reduce throughput due to context switches and cache-thrashing effects). You might even want to limit the number of parallel CPU/memory-intensive tasks to N-1, so the rest of the system doesn't starve. Or imagine a certain task needs a lot of memory, so running more than N instances of that task at the same time would lead to paging. You could use a semaphore here to make sure that no more than N instances of this particular task run at the same time.
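For illustration, here is a rough sketch of the "at most 5 database connections" case using C++20's std::counting_semaphore (which did not exist when this question was asked; the hand-rolled sem from answer 1 would work the same way). The worker function and the job count are made up for the example:

#include <semaphore>
#include <thread>
#include <vector>

// At most 5 workers may hold a "connection slot" at any one time.
std::counting_semaphore<5> db_slots(5);

void run_query(int job) {
    db_slots.acquire();   // blocks if all 5 slots are taken
    // ... borrow one of the 5 connections and run the query for `job` ...
    db_slots.release();   // free the slot for the next worker
}

int main() {
    std::vector<std::thread> workers;
    for (int i = 0; i < 20; ++i)
        workers.emplace_back(run_query, i);
    for (auto& t : workers)
        t.join();
}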

EDIT/PS: From your question "This is only possible if those threads are only reading the resource but not writing. Is this correct?" and your comment, it seems to me as if you're thinking of a resource as a variable or a stream that can be read or written, and that can only be written to by one thread at a time. Don't. That picture is misleading in this context.

Think of resources like "water". You can use water to wash your dishes. I can use water to wash my dishes at the same time. We don't need any kind of synchronization for that, because there is enough water for both of us. We don't necessarily use the same water. (And you can't "read" or "write" water.) But the total amount of water is finite. So it's not possible for arbitrarily many parties to wash their dishes at the same time. This kind of synchronization is done with a semaphore. Only usually not with water but with other finite resources like memory, disk space, IO throughput or CPU cores.




Answer 3:


The essence of the difference between a mutex and a semaphore has to do with the concept of ownership. When a mutex is taken, we think of that thread as owning the mutex, and that same thread must later release the mutex to free up the resource.

For a semaphore, think of taking the semaphore as consuming the resource, but not actually taking ownership of it. This is generally referred to as the semaphore being "empty" rather than owned by a thread. The feature of the semaphore is then that a different thread can "fill" the semaphore back to "full" state.

Therefore, mutexes are usually used for the concurrency protection of resources (i.e. MUTual EXclusion) while semaphores are used for signaling between threads (like semaphore flags signaling between ships). A mutex by itself can't really be used for signaling, but semaphores can. So, selecting one over the other depends on what you are trying to do.
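As a sketch of that signalling pattern (using C++20's std::binary_semaphore purely for illustration; any semaphore, including the one implemented in answer 1, behaves the same way), note that the thread which "fills" the semaphore is not the thread that "empties" it, which is exactly what a mutex forbids, since a mutex must be unlocked by the thread that locked it:

#include <semaphore>
#include <thread>
#include <iostream>

std::binary_semaphore data_ready(0);   // starts "empty": nothing has been signalled yet
int shared_result = 0;

int main() {
    std::thread consumer([] {
        data_ready.acquire();          // waits for the signal (consumes / "empties")
        std::cout << "result = " << shared_result << '\n';
    });

    std::thread producer([] {
        shared_result = 42;            // produce the data
        data_ready.release();          // signal from a *different* thread ("fills")
    });

    producer.join();
    consumer.join();
}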

See another one of my answers here for more discussion on a related topic covering the distinction between recursive and non-recursive mutexes.




Answer 4:


To control access to a limited number of resources being shared by multiple threads (either inter- or intra-process).

In our application, we had a very heavy resource that we did not want to allocate for each of the M worker threads. Since a worker thread needed the resource for just one small part of its job, we rarely used more than a couple of the resources simultaneously.

So, we allocated N of those resources and put them behind a semaphore initialized to N. When more than N threads were trying to use the resource, they would just block until one was available.
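A rough sketch of that pattern, assuming a C++20 std::counting_semaphore (the HeavyResource type, the pool size, and the helper names are invented for the example): the semaphore counts how many resources are free, and a small mutex-protected list tracks which ones they are.

#include <semaphore>
#include <mutex>
#include <vector>
#include <memory>

struct HeavyResource { /* expensive to create (placeholder) */ };

constexpr int N = 4;                              // pool size, chosen for the example

std::counting_semaphore<N> free_count(0);         // how many resources are currently free
std::mutex pool_mutex;
std::vector<std::unique_ptr<HeavyResource>> free_list;

void init_pool() {                                // call once, before the workers start
    for (int i = 0; i < N; ++i) {
        free_list.push_back(std::make_unique<HeavyResource>());
        free_count.release();
    }
}

std::unique_ptr<HeavyResource> acquire_resource() {
    free_count.acquire();                         // block until at least one is free
    std::lock_guard<std::mutex> lock(pool_mutex);
    auto r = std::move(free_list.back());
    free_list.pop_back();
    return r;
}

void release_resource(std::unique_ptr<HeavyResource> r) {
    {
        std::lock_guard<std::mutex> lock(pool_mutex);
        free_list.push_back(std::move(r));
    }
    free_count.release();                         // wake one blocked worker, if any
}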




Answer 5:


I feel like there is no simple way to REALLY answer your question without disregarding some important information about semaphores. People have written many books about semaphores, so any one or two paragraph answer is a disservice. A popular book is The Little Book of Semaphores... for those who don't like big books :).

Here is a decent lengthy article which goes into a LOT of the details on how semaphores are used and how they're intended to be used.

Update:
Dan pointed out some mistakes in my examples, so I'll leave it at the references below, which offer MUCH better explanations than mine :).

Here are the references showing the RIGHT ways one should use a semaphore:
1. IBM Article
2. University of Chicago Class Lecture
3. The Netrino article I originally posted.
4. The "sell tickets" paper + code.




Answer 6:


As taken from this article:

A mutex allows inter-process synchronisation to occur. If you instantiate a mutex with a name (as in the code above), the mutex becomes system-wide. This is really useful if you're sharing the same library between many different applications and need to block access to a critical section of code that is accessing resources that can't be shared.

Finally, the Semaphore class. Let's say you have a method that is really CPU intensive, and also makes use of resources that you need to control access to (using Mutexes :)). You've also determined that a maximum of five calls to the method is about all your machine can handle without becoming unresponsive. Your best solution here is to make use of the Semaphore class, which allows you to limit a certain number of threads' access to a resource.
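The article quoted above describes named, system-wide primitives; in C++ the closest everyday equivalent is probably Boost.Interprocess, so here is a hedged sketch with a named semaphore shared between processes (the semaphore name and the limit of five are just for illustration):

#include <boost/interprocess/sync/named_semaphore.hpp>

namespace bip = boost::interprocess;

int main() {
    // Any process opening the same name shares this semaphore.
    bip::named_semaphore slots(bip::open_or_create, "my_app_heavy_calls", 5);

    slots.wait();      // blocks if five callers (in any process) are already inside
    // ... run the CPU-intensive method here ...
    slots.post();      // let the next caller in
}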




Answer 7:


As far as I understand it, "semaphore" is a strongly IPC-related term these days. It still means a protected variable that many processes can modify, but it is shared among processes, and that feature is supported by the OS.

Usually we don't need such a variable, and a simple mutex covers all our requirements. If we still need a variable, we'll probably code it ourselves as "variable + mutex" to get more control.

To sum up: we don't use semaphores much in multithreading because we usually use a mutex for simplicity and control, and we use semaphores for IPC because they are OS-supported and are the established name for that inter-process synchronization mechanism.




Answer 8:


Semaphores were originally conceived for synchronization across processes. Windows has WaitForMultipleObjects, which behaves much like a semaphore. In the Linux world, the initial pthread implementation did not allow a mutex to be shared across processes; now it does. Separating the atomic increment (an interlocked increment on Windows) from a lightweight mutex is the most practical implementation these days, now that threads have become the CPU's unit of scheduling. If the increment and the lock were bundled together (as in a semaphore), acquiring and releasing locks would take too long, and we could not split those two primitive operations as we do today for performance and for better synchronization constructs.




Answer 9:


From what I learned about semaphores and mutexes in college, semaphores are the more theoretical objects, while mutexes are one implementation of semaphores. With that in mind, semaphores are more flexible.

Mutexes are highly implementation-dependent. They have been optimized for their binary locking purpose. The normal use case of a mutex is that of a binary semaphore.

In general, when trying to write bug-free multithreaded code, simplicity helps. Mutexes are used more because their simplicity helps avoid the complex deadlock scenarios that can arise from using semaphores.



Source: https://stackoverflow.com/questions/2350544/in-what-situation-do-you-use-a-semaphore-over-a-mutex-in-c
