Deadlock with logging multiprocess/multithread python script

I am facing the problem with collecting logs from the following script. Once I set up the SLEEP_TIME to too "small" value, the LoggingThread threads somehow block the logging module. The script freeze on logging request in the action function. If the SLEEP_TIME is about 0.1 the script collect all log messages as I expect.

I tried to follow this answer but it does not solve my problem.

import multiprocessing
import threading
import logging
import time

SLEEP_TIME = 0.000001

logger = logging.getLogger()

ch = logging.StreamHandler()
ch.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(funcName)s(): %(message)s'))
ch.setLevel(logging.DEBUG)

logger.setLevel(logging.DEBUG)
logger.addHandler(ch)


class LoggingThread(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        while True:
            logger.debug('LoggingThread: {}'.format(self))
            time.sleep(SLEEP_TIME)


def action(i):
    logger.debug('action: {}'.format(i))


def do_parallel_job():

    processes = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=processes)
    for i in range(20):
        pool.apply_async(action, args=(i,))
    pool.close()
    pool.join()



if __name__ == '__main__':

    logger.debug('START')

    #
    # multithread part
    #
    for _ in range(10):
        lt = LoggingThread()
        lt.setDaemon(True)
        lt.start()

    #
    # multiprocess part
    #
    do_parallel_job()

    logger.debug('FINISH')

How to use logging module in multiprocess and multithread scripts?

This is probably bug 6721.

The problem is common in any situation where you have locks, threads and forks. If thread 1 had a lock while thread 2 calls fork, in the forked process, there will only be thread 2 and the lock will be held forever. In your case, that is logging.StreamHandler.lock.

A fix can be found here for the logging module. Note that you need to take care of any other locks, too.

I've run into similar issue just recently while using logging module together with Pathos multiprocessing library. Still not 100% sure, but it seems, that in my case the problem may have been caused by the fact, that logging handler was trying to reuse a lock object from within different processes.

Was able to fix it with a simple wrapper around default logging Handler:

import threading
from collections import defaultdict
from multiprocessing import current_process

import colorlog


class ProcessSafeHandler(colorlog.StreamHandler):
    def __init__(self):
        super().__init__()

        self._locks = defaultdict(lambda: threading.RLock())

    def acquire(self):
        current_process_id = current_process().pid
        self._locks[current_process_id].acquire()

    def release(self):
        current_process_id = current_process().pid
        self._locks[current_process_id].release()

来源：https://stackoverflow.com/questions/24509650/deadlock-with-logging-multiprocess-multithread-python-script

标签

python

multithreading

logging

multiprocess