Why would a timeout avoid a tornado hang?

Posted by 久未见 on 2020-05-18 04:12:39

Question


Waiting on a concurrent.futures.Future from a ThreadPoolExecutor in a Tornado coroutine sometimes hangs for me:

import time

def slowinc(x):
    time.sleep(0.1)  # simulate slow, blocking work
    return x + 1

# Inside a Tornado coroutine:
yield tp_executor.submit(slowinc, 1)  # sometimes hangs

When I add a timeout, the call may hang for the full timeout period, but it then returns the correct result once the timeout expires.

yield gen.with_timeout(timedelta(seconds=5),
                       tp_executor.submit(...))  # hangs for 5 sec, then works

This only happens in the context of a larger test suite, in which many unpleasant things occur (child processes are terminated mid-stride). This error is almost certainly related to this unpleasant context and not strictly the fault of Tornado. However, I believe that I have isolated these unpleasant things as well as possible.

So I apologize for the subtle bug and the lack of a simple reproducible example. My hope is that the odd behavior, where an expiring timeout actually leads to success, is helpful in isolating my issue.

Temporary Solution

So far my solution is the following:

while not future.done():
    try:
        yield gen.with_timeout(timedelta(seconds=1), future)
    except gen.TimeoutError:
        pass

result = future.result()

This solves my immediate problem and, other than the occasional one-second delay, is perfectly serviceable. I'm still confused by the behavior, though, and am curious what odd things I'm doing to trigger it.

Update

The timeout solution above works for Python 3.4 under both Tornado 4.2 and 4.3. Under Python 2.7, however, neither the timeout solution nor the PeriodicCallback solution presented below in @ben-darnell's answer resolves the problem, with either Tornado 4.2 or 4.3.


Answer 1:


I don't have a complete solution, but I think I can offer a simpler workaround: start a background PeriodicCallback that does nothing at a short interval (the interval is given in milliseconds): PeriodicCallback(lambda: None, 500).start(). This makes sure the IOLoop wakes up periodically without intruding into all your yield executor.submit() calls.

The symptom suggests that the problem lies in the "waker" behavior of add_callback: https://github.com/tornadoweb/tornado/blob/d9c5bc8fb6530a03ebbb6da667e26685b8eee0ea/tornado/ioloop.py#L929-L944

This code was changed in Tornado 4.3 (https://github.com/tornadoweb/tornado/pull/1511/files). If you're on 4.3, see if the problem still exists in 4.2. Could anything in your "unpleasant" environment be causing thread.get_ident() to behave differently than Tornado expects?

There are reports of (rare) problems with the waker "pipe" on Windows: https://github.com/tornadoweb/tornado/pull/1364
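
To make the waker hypothesis concrete, here is a minimal sketch of the general self-pipe pattern using only the standard library. It is a toy model, not Tornado's actual code: MiniLoop, waker_works and measure_delay are hypothetical names invented for this illustration. The loop thread blocks in select() until either the waker socket becomes readable or the poll timeout expires. If the wake-up is lost, a callback queued from another thread still runs, but only after the poll times out, which matches the "hangs for the timeout, then works" symptom.

```python
import select
import socket
import threading
import time


class MiniLoop:
    """Toy event loop built on the self-pipe idea; not Tornado's implementation."""

    def __init__(self, poll_timeout, waker_works=True):
        self.poll_timeout = poll_timeout  # upper bound on one blocking poll
        self.waker_works = waker_works    # False simulates a lost wake-up
        self._callbacks = []
        self._lock = threading.Lock()
        # The loop thread polls _reader; other threads write to _writer.
        self._reader, self._writer = socket.socketpair()

    def add_callback(self, callback):
        """Queue a callback from any thread and try to wake the loop."""
        with self._lock:
            self._callbacks.append(callback)
        if self.waker_works:
            self._writer.send(b"x")  # makes select() return right away

    def run_until(self, done):
        """Run queued callbacks until the threading.Event `done` is set."""
        while not done.is_set():
            readable, _, _ = select.select(
                [self._reader], [], [], self.poll_timeout)
            if readable:
                self._reader.recv(1024)  # drain the wake-up bytes
            with self._lock:
                pending, self._callbacks = self._callbacks, []
            for callback in pending:
                callback()

    def close(self):
        self._reader.close()
        self._writer.close()


def measure_delay(waker_works, poll_timeout=1.0):
    """Return seconds until a callback queued by a worker thread has run."""
    loop = MiniLoop(poll_timeout, waker_works)
    done = threading.Event()

    def worker():
        time.sleep(0.05)  # give the loop thread time to block in select()
        loop.add_callback(done.set)

    start = time.monotonic()
    thread = threading.Thread(target=worker)
    thread.start()
    loop.run_until(done)
    elapsed = time.monotonic() - start
    thread.join()
    loop.close()
    return elapsed


if __name__ == "__main__":
    print("working waker: %.2fs" % measure_delay(True))
    print("lost wake-up:  %.2fs" % measure_delay(False))
```

In this model, anything that shortens the poll timeout bounds the delay caused by a lost wake-up. That is presumably why both a gen.with_timeout call and a no-op PeriodicCallback appear to help: each gives the loop a scheduled deadline to wake up for, even when the waker itself never fires.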



Source: https://stackoverflow.com/questions/33634956/why-would-a-timeout-avoid-a-tornado-hang
