Using pandas to efficiently read in a large CSV file without crashing

野趣味 2020-12-17 05:55

I am trying to read a .csv file called ratings.csv from http://grouplens.org/datasets/movielens/20m/. The file is 533.4 MB on my computer.

This is what I am writing in j

2 Answers
  •  刺人心 (OP)
     2020-12-17 06:02

    Try it like this: 1) load the file with dask, then 2) convert it to a pandas DataFrame.

    import pandas as pd
    import dask.dataframe as dd
    import time

    t = time.perf_counter()  # time.clock() was removed in Python 3.8
    # dask reads the CSV lazily in partitions instead of all at once
    df_train = dd.read_csv('../data/train.csv')
    # compute() materializes the dask dataframe into an in-memory pandas DataFrame
    df_train = df_train.compute()
    print("load train:", time.perf_counter() - t)
    
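For reference, this can also be done with pandas alone by reading the file in chunks (`chunksize`) and downcasting column dtypes to cut memory use. Below is a minimal sketch; the inline sample data stands in for ratings.csv and assumes the MovieLens column layout (userId, movieId, rating, timestamp) — adjust the path, chunk size, and dtypes for the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for ratings.csv; replace with the real file path.
csv_data = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,31,2.5,1260759144\n"
    "1,1029,3.0,1260759179\n"
    "2,31,4.0,835355493\n"
)

# Read in chunks so only one chunk is held in memory at a time,
# and downcast dtypes to shrink the resulting DataFrame.
chunks = pd.read_csv(
    csv_data,
    chunksize=2,  # rows per chunk; use something like 1_000_000 for the real file
    dtype={"userId": "int32", "movieId": "int32", "rating": "float32"},
)
df = pd.concat(chunks, ignore_index=True)
print(len(df))
```

If you only need aggregates (e.g. a mean rating), process each chunk inside the loop instead of concatenating, so the full table never has to fit in memory at once.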
