Read large data from csv file in php [duplicate]

Submitted on 2019-11-26 09:55:10

Question


I am reading a CSV file in PHP and checking against MySQL whether each record is present in my table.

The CSV has about 25,000 records, and when I run my code it displays a "Service Unavailable" error after 2m 10s (onload: 2m 10s).

Here is the code I have:

// set memory limit & execution time
ini_set('memory_limit', '512M');
ini_set('max_execution_time', '180');

// function to read a csv file
function readCSV($csvFile)
{
    $file_handle = fopen($csvFile, 'r');
    while (!feof($file_handle)) {

        set_time_limit(60); // you can enable this if you have a lot of data

        $line_of_text[] = fgetcsv($file_handle, 1024);
    }
    fclose($file_handle);
    return $line_of_text;
}

// Set path to CSV file
$csvFile = 'my_records.csv';

$csv = readCSV($csvFile);

for ($i = 1; $i < count($csv); $i++) {
    $user_email = $csv[$i][1];

    $qry = "SELECT u.user_id, u.user_email_id FROM tbl_user as u WHERE u.user_email_id = '".$user_email."'";

    $result = @mysql_query($qry) or die("Couldn't execute query: ".mysql_error().' '.mysql_errno());

    $rec = @mysql_fetch_row($result);

    if ($rec) {
        echo "Record exists";
    } else {
        echo "Record does not exist";
    }
}

Note: I just want to list the records that do not exist in my table.

Please suggest a solution for this.


Answer 1:


An excellent method for dealing with large files is described at https://stackoverflow.com/a/5249971/797620

This method was used at http://www.cuddlycactus.com/knownpasswords/ (the page has since been taken down) to search through 170+ million passwords in just a few milliseconds.
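More generally, the streaming idea behind approaches like this can be sketched in PHP with a generator, so the whole file is never held in memory at once (the function name `readCsvRows` is illustrative, not part of the linked answer):

```php
<?php
// Yield CSV rows one at a time instead of building one huge array.
// Memory usage stays flat no matter how many rows the file has.
function readCsvRows(string $csvFile): Generator
{
    $handle = fopen($csvFile, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvFile");
    }
    try {
        while (($row = fgetcsv($handle)) !== false) {
            yield $row;
        }
    } finally {
        fclose($handle);
    }
}

// Usage: process 25,000+ rows with constant memory.
// foreach (readCsvRows('my_records.csv') as $row) {
//     $email = $row[1];
//     // ... check $email against the database ...
// }
```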




Answer 2:


After struggling a lot, I finally found a good solution; maybe it will help others too. I tried a 2,367 KB CSV file containing 18,226 rows with different PHP scripts: (1) the CsvImporter script from the php.net fgetcsv documentation, and (2) file_get_contents, which ended in "PHP Fatal error: Allowed memory exhausted" on larger files.

(1) took 0.92574405670166 s; (2) took 0.12543702125549 s (as a string) and 0.52903485298157 s (split into an array). Note: these timings do not include inserting into MySQL.

The best solution I found took 3.0644409656525 s in total, including inserting into the database and some conditional checks; it processed an 8 MB file in 11 seconds. The solution is:

$csvInfo = analyse_file($file, 5);
$lineSeperator = $csvInfo['line_ending']['value'];
$fieldSeperator = $csvInfo['delimiter']['value'];
$columns = getColumns($file);
echo '<br>========Details========<br>';
echo "Line Sep: \t ".$lineSeperator;
echo "<br>Field Sep: \t ".$fieldSeperator;
echo '<br>Columns: '; print_r($columns);
echo '<br>========Details========<br>';
$ext = pathinfo($file, PATHINFO_EXTENSION);
$table = str_replace(' ', '_', basename($file, "." . $ext));
$rslt = table_insert($table, $columns);
if ($rslt) {
    $query = "LOAD DATA LOCAL INFILE '".$file."' INTO TABLE $table FIELDS TERMINATED BY '$fieldSeperator' ";

    var_dump(addToDb($query, false));
}


function addToDb($query, $getRec = true)
{
    // echo '<br>Query : '.$query;
    $con = @mysql_connect('localhost', 'root', '');
    @mysql_select_db('rtest', $con);
    $result = mysql_query($query, $con);
    if ($result) {
        if ($getRec) {
            $data = array();
            while ($row = mysql_fetch_assoc($result)) {
                $data[] = $row;
            }
            return $data;
        } else {
            return true;
        }
    } else {
        var_dump(mysql_error());
        return false;
    }
}


function table_insert($table_name, $table_columns)
{
    $queryString = "CREATE TABLE " . $table_name . " (";
    $values = '';

    foreach ($table_columns as $column) {
        $values .= (strtolower(str_replace(' ', '_', $column))) . " VARCHAR(2048), ";
    }
    $values = substr($values, 0, strlen($values) - 2);

    $queryString .= $values . ") ";

    // echo $queryString;

    return addToDb($queryString, false);
}


function getColumns($file)
{
    $cols = array();
    if (($handle = fopen($file, 'r')) !== FALSE) {
        while (($row = fgetcsv($handle)) !== FALSE) {
            $cols = $row;
            if (count($cols) > 0) {
                break;
            }
        }
        fclose($handle);
        return $cols;
    } else {
        return false;
    }
}

function analyse_file($file, $capture_limit_in_kb = 10)
{
    // capture starting memory usage
    $output['peak_mem']['start'] = memory_get_peak_usage(true);

    // log how much of the file was sampled (in KB)
    $output['read_kb'] = $capture_limit_in_kb;

    // read in a sample of the file
    $fh = fopen($file, 'r');
    $contents = fread($fh, ($capture_limit_in_kb * 1024)); // in KB
    fclose($fh);

    // specify allowed field delimiters
    $delimiters = array(
        'comma'     => ',',
        'semicolon' => ';',
        'tab'       => "\t",
        'pipe'      => '|',
        'colon'     => ':'
    );

    // specify allowed line endings
    $line_endings = array(
        'rn' => "\r\n",
        'n'  => "\n",
        'r'  => "\r",
        'nr' => "\n\r"
    );

    // count each line-ending instance
    foreach ($line_endings as $key => $value) {
        $line_result[$key] = substr_count($contents, $value);
    }

    // sort by largest array value
    asort($line_result);

    // log to output array
    $output['line_ending']['results'] = $line_result;
    $output['line_ending']['count']   = end($line_result);
    $output['line_ending']['key']     = key($line_result);
    $output['line_ending']['value']   = $line_endings[$output['line_ending']['key']];
    $lines = explode($output['line_ending']['value'], $contents);

    // remove the last line of the array, as it may be incomplete
    array_pop($lines);

    // create a string from the complete lines
    $complete_lines = implode(' ', $lines);

    // log statistics to output array
    $output['lines']['count']  = count($lines);
    $output['lines']['length'] = strlen($complete_lines);

    // count each delimiter instance
    foreach ($delimiters as $delimiter_key => $delimiter) {
        $delimiter_result[$delimiter_key] = substr_count($complete_lines, $delimiter);
    }

    // sort by largest array value
    asort($delimiter_result);

    // log statistics to output array with the largest counts as the value
    $output['delimiter']['results'] = $delimiter_result;
    $output['delimiter']['count']   = end($delimiter_result);
    $output['delimiter']['key']     = key($delimiter_result);
    $output['delimiter']['value']   = $delimiters[$output['delimiter']['key']];

    // capture ending memory usage
    $output['peak_mem']['end'] = memory_get_peak_usage(true);
    return $output;
}



Answer 3:


Normally, a "Service Unavailable" error appears when a 500 error occurs. I think it is happening here because of insufficient execution time. Check your log / browser console; you will probably see a 500 error there.

First of all, keep set_time_limit(60) out of the loop.

Then make a few changes:

  1. Add an INDEX on the user_email_id column, so your SELECT query finds the rows faster.
  2. Do not echo a message for every row; keep the output buffer free.

Also, I have done this kind of task using an open-source program. You can get it here: http://sourceforge.net/projects/phpexcelreader/

Try this.
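Building on points 1 and 2, the row-by-row queries can also be batched: fetch the existing emails with one IN (...) query per chunk and compute the difference in PHP. Below is a minimal sketch of the pure diff step, assuming the batched SELECT has already returned the emails that do exist (the function name `missingEmails` is illustrative):

```php
<?php
// Given the emails read from the CSV and the emails a batched
// "SELECT user_email_id FROM tbl_user WHERE user_email_id IN (...)"
// query returned, compute the ones missing from tbl_user.
function missingEmails(array $csvEmails, array $foundEmails): array
{
    $found = array_flip($foundEmails); // hash set for O(1) lookups
    return array_values(array_filter(
        $csvEmails,
        function ($email) use ($found) {
            return !isset($found[$email]);
        }
    ));
}

// The DB side would build one placeholder list per chunk, e.g.:
// $placeholders = implode(',', array_fill(0, count($chunk), '?'));
// $stmt = $db->prepare("SELECT user_email_id FROM tbl_user
//                       WHERE user_email_id IN ($placeholders)");
```

This replaces 25,000 single-row lookups with a handful of indexed set queries, which is usually the difference between minutes and seconds here.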



Source: https://stackoverflow.com/questions/17520093/read-large-data-from-csv-file-in-php
