Connection problems with SQLAlchemy and multiple processes

Asked by 北荒 on 2020-12-02 23:44

I'm using PostgreSQL and SQLAlchemy in a project that consists of a main process which launches child processes. All of these processes access the database via SQLAlchemy.

1 Answer
  • Answered 2020-12-03 00:37

    Quoting "How do I use engines / connections / sessions with Python multiprocessing, or os.fork()?" with added emphasis:

    The SQLAlchemy Engine object refers to a connection pool of existing database connections. So when this object is replicated to a child process, the goal is to ensure that no database connections are carried over.

    and

    However, for the case of a transaction-active Session or Connection being shared, there’s no automatic fix for this; an application needs to ensure a new child process only initiate new Connection objects and transactions, as well as ORM Session objects.

    The issue stems from the forked child process inheriting the live global session, which is holding on to a Connection. When target calls init, it overwrites the global references to engine and session, dropping their reference counts to zero in the child and forcing them to be finalized. (If you somehow created another reference to the inherited session in the child, you would prevent that cleanup – but don't do that.) After main has joined the child and returns to business as usual, it tries to use a connection that may by then have been finalized in the child, or is otherwise out of sync with the server. Why the error surfaces only after some number of iterations, I'm not sure.

    The only way to handle this situation using globals the way you do is to

    1. Close all sessions
    2. Call engine.dispose()

    before forking. This will prevent connections from leaking to the child. For example:

    def main():
        global session
        init()
        try:
            dummy = Dummy(value=1)
            session.add(dummy)
            session.commit()
            dummy_id = dummy.id
            # Return the Connection to the pool
            session.close()
            # Dispose of it!
            engine.dispose()
            # ...or call your cleanup() function, which does the same
            p = multiprocessing.Process(target=target, args=(dummy_id,))
            p.start()
            p.join()
            # Start a new session
            session = Session()
            dummy = session.query(Dummy).get(dummy_id)
            assert dummy.value == 2
        finally:
            cleanup()
    

    Your second example does not trigger finalization in the child, and so it only seems to work; it may be just as broken as the first, since it still inherits a copy of the session and its connection, defined locally in main.
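    The most robust arrangement, and the one the linked FAQ suggests, is to have each child process build its own engine and session, so nothing connection-like crosses the fork at all. A sketch (the URL is a placeholder, and the body of target is elided because it depends on the question's Dummy model):

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# Placeholder URL; the question uses PostgreSQL.
DB_URL = "postgresql://localhost/mydb"

def make_session(url=DB_URL):
    """Build a process-local engine and session; nothing is inherited."""
    engine = create_engine(url)
    Session = sessionmaker(bind=engine)
    return engine, Session()

def target(dummy_id):
    # Runs in the child: everything connection-like is created here.
    engine, session = make_session()
    try:
        # ... load the Dummy row by dummy_id and update it ...
        session.commit()
    finally:
        session.close()
        engine.dispose()
```

    With this shape, main passes only the plain integer dummy_id to the Process, and there is no shared pool for the fork to corrupt.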
