Empty python process hangs on join [sys.stderr.flush()]

Posted by 两盒软妹~` on 2021-02-07 08:17:28

Question


Python gurus, I need your help. I ran into quite strange behavior: an empty Python Process hangs on join. It looks like the fork inherits some locked resource.

Env:

  • Python version: 3.5.3
  • OS: Ubuntu 16.04.2 LTS
  • Kernel: 4.4.0-75-generic

Problem description:

1) I have a logger that uses a background thread to handle messages, plus a queue that feeds this thread. Logger source code (slightly simplified).

2) And I have a simple script that uses my logger (just enough code to show the problem):

import os
from multiprocessing import Process
from my_logging import get_logger


def func():
    pass


if __name__ == '__main__':

    logger = get_logger(__name__)
    logger.start()
    for _ in range(2):
        logger.info('message')

    proc = Process(target=func)
    proc.start()
    proc.join(timeout=3)
    print('TEST PROCESS JOINED: is_alive={0}'.format(proc.is_alive()))

    logger.stop()
    print('EXIT')

Sometimes this test script hangs. It hangs on joining the process "proc" (when the script finishes execution). The test process "proc" stays alive.

To reproduce this problem you can run the script in a loop:

$ for i in {1..100} ; do /opt/python3.5.3/bin/python3.5 test.py ; done

Investigation:

strace shows the following:

strace: Process 25273 attached
futex(0x2275550, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff

And I figured out where the process hangs. It hangs in the multiprocessing module, file process.py, line 269 (Python 3.5.3), on flushing STDERR:

...
267    util.info('process exiting with exitcode %d' % exitcode)
268    sys.stdout.flush()
269    sys.stderr.flush()
...

If line 269 is commented out, the script always completes successfully.

My thoughts:

By default, logging.StreamHandler uses sys.stderr as its stream.

If the process is forked while the logger thread is flushing data to STDERR, the child process inherits a locked resource and later hangs when it flushes STDERR itself.
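The suspected mechanism can be illustrated with a minimal sketch. It does not touch sys.stderr at all: it assumes a plain threading.Lock stands in for whatever lock the stream holds internally, and the function name child_can_acquire_inherited_lock is made up for this illustration. A thread holds the lock, the main thread calls os.fork(), and the child (which has no copy of the holder thread) tries to take the lock with a timeout. It needs a platform with os.fork(), such as Linux.

```python
import os
import threading
import time


def child_can_acquire_inherited_lock(hold_seconds=0.5, timeout=0.2):
    """Fork while another thread holds a lock; report whether the child can take it."""
    lock = threading.Lock()      # stand-in for a lock held during a flush (assumption)
    held = threading.Event()

    def holder():
        with lock:
            held.set()           # tell the main thread the lock is now held
            time.sleep(hold_seconds)

    thread = threading.Thread(target=holder)
    thread.start()
    held.wait()                  # make sure we fork while the lock is held

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, so nobody will ever
        # release the inherited lock. Use a timeout instead of blocking forever.
        got_it = lock.acquire(timeout=timeout)
        os.write(write_fd, b"1" if got_it else b"0")
        os._exit(0)

    # Parent: collect the child's answer and clean up.
    os.close(write_fd)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    thread.join()
    return answer == b"1"


if __name__ == "__main__":
    print("child acquired inherited lock:", child_can_acquire_inherited_lock())
```

Without the timeout, the child would block in acquire() indefinitely, which matches the futex wait seen in the strace output.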

Some workarounds that solve the problem:

  • Use Python 2.7. I can't reproduce it with Python 2.7. Maybe the timing just prevents me from reproducing the problem.
  • Use a process instead of a thread to handle messages in the logger.

Do you have any ideas on this behavior? Where is the problem? Am I doing something wrong?


Answer 1:


It looks like this behaviour is related to this issue: http://bugs.python.org/issue6721
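For locks you own yourself, one mitigation in the spirit of that issue is to take the lock just before fork() and release it on both sides afterwards, so the child never inherits it in a locked state. The sketch below is an illustration under several assumptions: it uses os.register_at_fork, which exists only in Python 3.7 and later (so not in the 3.5.3 from the question), demo_lock and fork_while_thread_uses_lock are hypothetical names, and it does not reach the internal buffer lock of sys.stderr, which is not exposed to Python code.

```python
import os
import threading
import time

demo_lock = threading.Lock()  # hypothetical lock shared with a background thread

# Acquire the lock right before fork() and release it afterwards in both
# processes, so fork() can only happen while no other thread holds it.
os.register_at_fork(
    before=demo_lock.acquire,
    after_in_parent=demo_lock.release,
    after_in_child=demo_lock.release,
)


def fork_while_thread_uses_lock():
    """Fork while a thread keeps taking demo_lock; return True if the child can take it."""
    stop = threading.Event()

    def worker():
        while not stop.is_set():
            with demo_lock:
                time.sleep(0.001)   # pretend to do work under the lock
            time.sleep(0.001)       # leave a gap so the forking thread can get in

    thread = threading.Thread(target=worker)
    thread.start()
    time.sleep(0.05)                # let the worker run for a moment

    read_fd, write_fd = os.pipe()
    pid = os.fork()                 # the registered hooks run around this call
    if pid == 0:
        got_it = demo_lock.acquire(timeout=1.0)
        os.write(write_fd, b"1" if got_it else b"0")
        os._exit(0)

    os.close(write_fd)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    thread.join()
    return answer == b"1"


if __name__ == "__main__":
    print("child acquired lock:", fork_while_thread_uses_lock())
```

The hooks are process-wide and cannot be unregistered, so this pattern fits module-level locks better than short-lived ones.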




Answer 2:


Question: Sometimes ... Test process "proc" stay alive.

I could only reproduce your

TEST PROCESS:0 JOINED: is_alive=True

by adding a time.sleep(5) to def func():.
Since you use proc.join(timeout=3), that is the expected behavior.

Conclusion:
Overloading your system, which in my environment starts at about 30 running processes, triggers the proc.join(timeout=3) timeout. You may want to rethink your test case for reproducing the problem.

One approach, I think, is fine-tuning your Process/Thread with some time.sleep(0.05) calls to give up a timeslice.


  1. You are using from multiprocessing import Queue; use from queue import Queue instead.

    From the documentation:
    Class multiprocessing.Queue
    A queue class for use in a multi-processing (rather than multi-threading) context.

  2. In class QueueHandler(logging.Handler):, prevent any further

    self.queue.put_nowait(record)
    

    once

    class QueueListener(object):
    ...
    def stop(self):
        ...
    

    has been called. Implement, for instance,

    class QueueHandler(logging.Handler):
      def __init__(self):
          self.stop = Event()
          ...
    
  3. In def _monitor(self): use only ONE while ... loop.
    Wait until self._thread has stopped:

    class QueueListener(object):
    ...
    def stop(self):
         self.handler.stop.set()
         while not self.queue.empty():
             time.sleep(0.5)
         # Don't use double flags
         #self._stop.set()
         self.queue.put_nowait(self._sentinel)
         self._thread.join()
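
The points above (a thread-only queue, a single sentinel, joining the worker thread on stop) can be sketched with the standard library's own logging.handlers.QueueHandler and logging.handlers.QueueListener. This is not the my_logging module from the question, whose source is not shown here. The helper build_queue_logger and the logger name "queue_demo" are made up for the illustration.

```python
import io
import logging
import logging.handlers
import queue


def build_queue_logger(stream):
    """Return (logger, listener) that log to `stream` through a background thread."""
    log_queue = queue.Queue()                      # thread-only queue, not multiprocessing.Queue
    target = logging.StreamHandler(stream)
    target.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

    # The listener owns the background thread that drains the queue.
    listener = logging.handlers.QueueListener(log_queue, target)

    logger = logging.getLogger("queue_demo")       # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    return logger, listener


if __name__ == "__main__":
    out = io.StringIO()
    logger, listener = build_queue_logger(out)
    listener.start()
    for _ in range(2):
        logger.info("message")
    listener.stop()        # enqueues one sentinel, then joins the worker thread
    print(out.getvalue(), end="")
```

Because stop() joins the thread after the sentinel is consumed, every record queued before the call has been handled by the time it returns. This changes the queue and shutdown logic only. It does not by itself remove the fork-time lock problem described in the question.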
    


Source: https://stackoverflow.com/questions/44069717/empty-python-process-hangs-on-join-sys-stderr-flush
