“Large data” work flows using pandas

前端未结

关注

 16  2212

被撕碎了的回忆 2020-11-21 07:32

I have tried to puzzle out an answer to this question for many months while learning pandas. I use SAS for my day-to-day work and it is great for it\'s out-of-core support.

16条回答

深忆病人 (楼主)

2020-11-21 08:17

I'd like to point out the Vaex package.

Vaex is a python library for lazy Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets. It can calculate statistics such as mean, sum, count, standard deviation etc, on an N-dimensional grid up to a billion (10⁹) objects/rows per second. Visualization is done using histograms, density plots and 3d volume rendering, allowing interactive exploration of big data. Vaex uses memory mapping, zero memory copy policy and lazy computations for best performance (no memory wasted).

Have a look at the documentation: https://vaex.readthedocs.io/en/latest/ The API is very close to the API of pandas.

0 讨论(0)

查看其它16个回答
发布评论:

提交评论
- 加载中...