Connection problems with SQLAlchemy and multiple processes

Asked by 北荒 on 2020-12-02 23:44

I'm using PostgreSQL and SQLAlchemy in a project that consists of a main process which launches child processes. All of these processes access the database via SQLAlchemy.

1 Answer
  • Answered 2020-12-03 00:37

    Quoting "How do I use engines / connections / sessions with Python multiprocessing, or os.fork()?" with added emphasis:

    The SQLAlchemy Engine object refers to a connection pool of existing database connections. So when this object is replicated to a child process, the goal is to ensure that no database connections are carried over.

    and

    However, for the case of a transaction-active Session or Connection being shared, there’s no automatic fix for this; an application needs to ensure a new child process only initiate new Connection objects and transactions, as well as ORM Session objects.

    The issue stems from the forked child process inheriting the live global session, which is holding on to a Connection. When target calls init, it overwrites the global references to engine and session, dropping their reference counts to zero in the child and forcing them to be finalized. (If you somehow created another reference to the inherited session in the child, you would prevent that cleanup – but don't do that.) After main has joined the child and returns to business as usual, it tries to use a connection that may by then have been finalized in the child, or is otherwise out of sync with the server. Why the error surfaces only after some number of iterations, I'm not sure.

    The only way to handle this situation using globals the way you do is to

    1. Close all sessions
    2. Call engine.dispose()

    before forking. This will prevent connections from leaking to the child. For example:

    def main():
        global session
        init()
        try:
            dummy = Dummy(value=1)
            session.add(dummy)
            session.commit()
            dummy_id = dummy.id
            # Return the Connection to the pool
            session.close()
            # Dispose of it!
            engine.dispose()
            # ...or call your cleanup() function, which does the same
            p = multiprocessing.Process(target=target, args=(dummy_id,))
            p.start()
            p.join()
            # Start a new session
            session = Session()
            dummy = session.query(Dummy).get(dummy_id)
            assert dummy.value == 2
        finally:
            cleanup()
    

    Your second example does not trigger finalization in the child, and so it only seems to work; it may be just as broken as the first, since it still inherits a copy of the session and its connection, defined locally in main.
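    The most robust arrangement, and the one the linked FAQ suggests, is to have each child process build its own engine and session, so nothing connection-like crosses the fork at all. A sketch (the URL is a placeholder, and the body of target is elided because it depends on the question's Dummy model):

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# Placeholder URL; the question uses PostgreSQL.
DB_URL = "postgresql://localhost/mydb"

def make_session(url=DB_URL):
    """Build a process-local engine and session; nothing is inherited."""
    engine = create_engine(url)
    Session = sessionmaker(bind=engine)
    return engine, Session()

def target(dummy_id):
    # Runs in the child: everything connection-like is created here.
    engine, session = make_session()
    try:
        # ... load the Dummy row by dummy_id and update it ...
        session.commit()
    finally:
        session.close()
        engine.dispose()
```

    With this shape, main passes only the plain integer dummy_id to the Process, and there is no shared pool for the fork to corrupt.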
