pandas and numpy thread safety

滥情空心 2020-12-11 02:51

I'm using pandas on a web server (Apache + mod_wsgi + Django) and have a hard-to-reproduce bug, which I have now discovered is caused by pandas not being thread-safe.

2 answers
  • 2020-12-11 03:20

    Configure mod_wsgi to run in single-threaded mode:

    WSGIDaemonProcess mysite processes=5 threads=1
    WSGIProcessGroup mysite
    WSGIApplicationGroup %{GLOBAL}
    

    This uses mod_wsgi daemon mode, so the number of processes and threads can be set independently of whichever Apache MPM you are using.

  • 2020-12-11 03:21

    See the caveat in the docs here: http://pandas.pydata.org/pandas-docs/dev/gotchas.html#thread-safety

    pandas is not thread-safe because the underlying copy mechanism is not. NumPy, I believe, has an atomic copy operation, but pandas has a layer above this.

    Copying is the basis of pandas operations, since most operations generate a new object to return to the user.

    This is not trivial to fix, and a fix would carry a fairly heavy performance cost, so handling it properly would take some work.

    The easiest approach is simply not to share objects across threads, or to lock them whenever they are used.
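
A minimal sketch of the "lock on use" idea, assuming a single module-level DataFrame shared by request handlers. The names `_shared_df`, `_shared_lock`, `snapshot`, `append_value` and `handle_request` are hypothetical and only for illustration; this is one possible arrangement, not something taken from the pandas docs. Every access to the shared frame happens while holding the lock, and each thread does its real work on a private copy.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

# Hypothetical shared state: one DataFrame plus the lock that guards it.
_shared_df = pd.DataFrame({"value": [1, 2, 3]})
_shared_lock = threading.Lock()


def snapshot():
    """Return a private copy of the shared frame, taken while holding the lock."""
    with _shared_lock:
        return _shared_df.copy()


def append_value(value):
    """Swap in a new frame with one extra row; the whole update runs under the lock."""
    global _shared_df
    with _shared_lock:
        new_row = pd.DataFrame({"value": [value]})
        _shared_df = pd.concat([_shared_df, new_row], ignore_index=True)


def handle_request(value):
    """Stand-in for a request handler: one write, then read from a private copy."""
    append_value(value)
    local_df = snapshot()  # only this thread ever touches local_df
    return int(local_df["value"].sum())


# Simulate several concurrent requests hitting the same process.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle_request, range(10, 20)))

final_total = int(snapshot()["value"].sum())
```

The lock is held only for the copy or the swap, so the slower per-request work runs on `local_df` without blocking other threads. The cost is one extra copy per request, which is the trade-off for never handing the same object to two threads.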
