问题
I am working on a multi-threaded network server application. At the moment, I am having issues with lock recovery. If a thread dies unexpectedly while it is holding a lock, say a mutex, rwlock, spinlock, etc..., is it possible to recover the lock from a different thread without having to go into the lock struct itself and manually disassociate the owner from the lock. I would like to not have to go to this extreme to clear it as this will make the code non-portable. I have attempted to force a lock owner change by doing a pthread_kill on the offending thread and looking at the return code. But even using a mutex type attribute of PTHREAD_MUTEX_ERRORCHECK, I still cannot gain control of the mutex from another thread if the locking thread has quit. This can be a problem if some internal table is being updated when the thread bails out as it will eventually cause the entire server application to halt.
I have used Google extensively and I'm getting conflicting information, even on here. Any suggestions or ideas that I can explore?
This is on FreeBSD 9.3 using clang-llvm compiler.
回答1:
For mutexes which are shared between processes (PTHREAD_PROCESS_SHARED) you can set them PTHREAD_MUTEX_ROBUST... but you are stuck with the problem that the state protected by the mutex may be invalid -- depending on the application.
For mutexes which are not shared between processes, there is no standard notion of "robustness", because a thread cannot spontaneously die on its own -- a thread will run until either it is cancelled, it exits or the process exits or dies.
You can use:
void pthread_cleanup_push(void (*routine)(void*), void *arg);
void pthread_cleanup_pop(int execute);
to arrange for a mutex to be released if the thread is cancelled or exits while holding the mutex -- something like:
pthread_mutex_lock(&foo) ; // as now
pthread_cleanup_push(pthread_mutex_unlock, &foo) ; // extra step
....
pthread_cleanup_pop(true) ; // replacing the pthread_mutex_unlock()
HOWEVER: you still need to think very carefully about what state the data protected by the mutex is in when the thread is cancelled or exits !!
You may be much better off examining why the thread needs this, and perhaps sort out any error/exception handling to pass the error/exception up and out of the critical section (leaving the critical section cleanly).
来源:https://stackoverflow.com/questions/26229205/pthreads-lock-recovery