I have a speed/efficiency-related question about Python:
I need to write a large number of very large R-dataframe-ish files, about 0.5-2 GB each.
I think what you might want to do is create a memory mapped file. Take a look at the following documentation to see how you can do this with numpy:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html
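If it helps, here's a minimal sketch of that approach (the file name, shape, dtype, and the data-generating step are all made up for illustration):

    import numpy as np

    # Create a disk-backed array so the full table never has to fit in RAM.
    n_rows, n_cols = 10_000_000, 20  # example dimensions
    mm = np.memmap("table.dat", dtype="float64", mode="w+",
                   shape=(n_rows, n_cols))

    # Fill it in chunks; swap in however you actually produce the data.
    chunk = 1_000_000
    for start in range(0, n_rows, chunk):
        stop = min(start + chunk, n_rows)
        mm[start:stop] = np.random.rand(stop - start, n_cols)

    mm.flush()   # push everything out to disk
    del mm       # closes the memmap

Note that this writes a raw binary file rather than a text table, so it's most useful when you control both the writer and the reader.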
Pandas seems like a good tool for this problem. It's easy to get started with, it handles most of the ways you might need to get data into Python, and it copes well with mixed data (floats, ints, strings), usually detecting the column types on its own.
Once you have an (R-like) data frame in pandas, it's pretty straightforward to write the frame out as CSV:
DataFrame.to_csv(path_or_buf, sep='\t')
There are a bunch of other options you can set to get your tab-separated file just right.
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html
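A minimal sketch, with made-up column names and data, just to show the shape of the API:

    import pandas as pd

    # Mixed column types, much like an R data frame.
    df = pd.DataFrame({
        "id": [1, 2, 3],
        "name": ["a", "b", "c"],
        "value": [0.1, 0.2, 0.3],
    })

    # sep='\t' gives a tab-separated file; index=False drops the row index,
    # which is usually what you want for an R-style table.
    df.to_csv("out.tsv", sep="\t", index=False)

For files in the 0.5-2 GB range, to_csv also accepts a chunksize argument that controls how many rows are written at a time.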
Unless you are running into a performance issue, you can probably write to the file line by line. Python internally uses buffering and will likely give you a nice compromise between performance and memory efficiency.
Python buffering is different from OS buffering, and you can specify how you want things buffered by setting the buffering argument to open().
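As a rough sketch (the buffer size and row format here are arbitrary examples):

    # Generate rows lazily and write them one at a time; the explicit
    # buffering argument sets the size of Python's internal buffer in bytes.
    rows = ("%d\t%f" % (i, i * 0.5) for i in range(1_000_000))

    with open("out.tsv", "w", buffering=1024 * 1024) as f:
        for row in rows:
            f.write(row + "\n")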