Segmentation fault error in a multi threaded app in python

巧了我就是萌 提交于 2019-12-09 03:58:20

问题


I have a multi threaded app in python, wherein I create multiple producer threads and they extract the data from DB. Data is extracted in chunks. So the part where a thread creates sql statement with limit values is kept within lock. And to let threads execute queries simultaneously, query() function is kept outside the lock. Then the result fetching part is again kept under the lock. Below is the code snippet:

with UserAgent.lock:
    sqlGeoTarget = "call sp_ax_ari_select_user_agent_list('0'," + str(self.chunkStart) + "," + str(self.chunkSize) + ",1);"
    self.chunkStart += self.chunkSize

self.dbObj.query(sqlGeoTarget)
print "query executed. Processing data now..."+sqlGeoTarget

with UserAgent.lock:
    result = self.dbObj.fetchAll()
    self.dbObj.dbCursor.close()

But this code generates fatal error segmentation fault (core dumped). Because if I put all the code under lock, it executes fine. I explicitly close the cursor after fetching the data, it is reopened when query() function fired again.

This code is inside a class named UserAgent and it's a shared resource for a class named Producer. Thus, database object is shared. So the problem area 99% must be that as the db object is shared hitting query simultaneously and closing cursor then must be messing up with result set. But then how to solve this problem and achieve concurrent db query execution?


回答1:


Do not reuse connections across threads. Create a new connection for each thread instead.

From the MySQLdb User Guide:

The MySQL protocol can not handle multiple threads using the same connection at once. Some earlier versions of MySQLdb utilized locking to achieve a threadsafety of 2. While this is not terribly hard to accomplish using the standard Cursor class (which uses mysql_store_result()), it is complicated by SSCursor (which uses mysql_use_result(); with the latter you must ensure all the rows have been read before another query can be executed. It is further complicated by the addition of transactions, since transactions start when a cursor execute a query, but end when COMMIT or ROLLBACK is executed by the Connection object. Two threads simply cannot share a connection while a transaction is in progress, in addition to not being able to share it during query execution. This excessively complicated the code to the point where it just isn't worth it.

The general upshot of this is: Don't share connections between threads. It's really not worth your effort or mine, and in the end, will probably hurt performance, since the MySQL server runs a separate thread for each connection. You can certainly do things like cache connections in a pool, and give those connections to one thread at a time. If you let two threads use a connection simultaneously, the MySQL client library will probably upchuck and die. You have been warned.

Emphasis mine.

Use thread local storage or a dedicated connection pooling library instead.



来源:https://stackoverflow.com/questions/13604338/segmentation-fault-error-in-a-multi-threaded-app-in-python

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!