Process a very big CSV file without timeouts and memory errors

终归单人心 2020-12-02 11:16

At the moment I'm writing an import script for a very big CSV file. The problem is that most of the time it stops after a while, either because of a timeout or because it throws a memory error.

5 Answers
  •  情歌与酒
    2020-12-02 12:16

    I've used fgetcsv to read a 120 MB CSV in a stream-wise manner. It reads the file line by line, and I then insert every line into a database. That way only one line is held in memory on each iteration. The script still needed about 20 minutes to run; maybe I'll try Python next time. Don't try to load a huge CSV file into an array, as that really would consume a lot of memory.

    // WDI_GDF_Data.csv (120.4MB) is the World Bank collection of development indicators:
    // http://data.worldbank.org/data-catalog/world-development-indicators
    if(($handle = fopen('WDI_GDF_Data.csv', 'r')) !== false)
    {
        // get the first row, which contains the column-titles (if necessary)
        $header = fgetcsv($handle);
    
        // loop through the file line-by-line
        while(($data = fgetcsv($handle)) !== false)
        {
            // resort/rewrite data and insert into DB here
            // try to use conditions sparingly here, as they slow performance
    
            // I don't know if this is really necessary, but it couldn't harm;
            // see also: http://php.net/manual/en/features.gc.php
            unset($data);
        }
        fclose($handle);
    }
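
    As a rough sketch of what the "insert into DB here" step could look like (the PDO connection details, the table name `wdi_data`, and the column list are assumptions here, adapt them to your own schema): a prepared statement keeps per-row overhead low, committing in chunks keeps the transaction small, and set_time_limit(0) works around the max_execution_time abort for long-running imports.

    // Sketch only: connection parameters, table and column names are placeholders.
    set_time_limit(0); // disable PHP's execution time limit for this long-running script

    $pdo  = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
    $stmt = $pdo->prepare('INSERT INTO wdi_data (country, indicator, year, value) VALUES (?, ?, ?, ?)');

    if (($handle = fopen('WDI_GDF_Data.csv', 'r')) !== false)
    {
        $header = fgetcsv($handle); // skip the column-title row
        $pdo->beginTransaction();
        $count = 0;

        while (($data = fgetcsv($handle)) !== false)
        {
            // map the CSV columns you need onto the prepared statement
            $stmt->execute([$data[0], $data[1], $data[2], $data[3]]);

            // commit in chunks so a single huge transaction doesn't build up
            if (++$count % 1000 === 0) {
                $pdo->commit();
                $pdo->beginTransaction();
            }
        }

        $pdo->commit();
        fclose($handle);
    }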
    
