How to improve the speed of inserting CSV data into a database in PHP?

Submitted by 爱⌒轻易说出口 on 2020-08-10 18:59:12

Question


I wrote this code to read a folder and, depending on each file's name, insert the data of the CSV file into a different table. Once a file is processed it is moved to another folder. My code works, but the second CSV file is more than 80,000 lines long and takes several hours to be loaded into my database.

How can I improve the performance of my code? I tried LOAD DATA LOCAL, but without success...

<?php

include("../connexion.php");

ini_set('max_execution_time', 54000);
$timestamp = date("y-m-d H:i");

$dir  = 'D:/xampp/htdocs/retail_BI/test/';
$dest = 'D:/xampp/htdocs/retail_BI/test/files/';
$allFiles = scandir($dir);

foreach ($allFiles as $file) {

    if (!in_array($file, array(".", ".."))) {

        $file = $dir . $file;
        $filename = basename($file);

        if (strpos($filename, 'BI1_') === 0) {

            // Open by full path, not by basename, so the script works
            // regardless of the current working directory
            if (($handle = fopen($file, "r")) !== false) {

                // Skip the UTF-8 BOM in the first cell
                fseek($handle, 3);

                while (($data = fgetcsv($handle, 9000000, ";")) !== false) {

                    // Skip completely empty lines
                    if (empty(array_filter($data))) {
                        echo "not good";
                        continue;
                    }

                    $date = DateTime::createFromFormat('d/m/Y H:i:s A', $data[2]);
                    if ($date === false) {
                        break;
                    }

                    $date1 = $date->format('Y-m-d'); // e.g. 2020-07-07
                    $date2 = $date->format('Hi A');  // e.g. 1247 AM

                    // Database entry: one INSERT per CSV row
                    $query = "insert into dbo.Y2_Sales (storenumber, storename, date, time, TransRef, stylecode, color, size, quantity, unit_price, SalesExGST, cost, currency)
                        values('$data[0]', '$data[1]','$date1','$date2','$data[3]','$data[4]','$data[5]','$data[6]','$data[7]', '$data[8]','$data[9]','$data[10]','$data[11]')";
                    $stmt = $conn->query($query);
                    if (!$stmt) { echo $conn->error; }
                }
                fclose($handle);

                // Move the processed file to another folder
                if (!rename($file, $dest . $filename)) {
                    echo "error copy";
                }
            }

        } else if (strpos($filename, 'BI2_') === 0) {

            if (($handle = fopen($file, "r")) !== false) {

                // Skip the UTF-8 BOM in the first cell
                fseek($handle, 3);

                while (($data = fgetcsv($handle, 9000000, ";")) !== false) {

                    // Skip completely empty lines
                    if (empty(array_filter($data))) {
                        continue;
                    }

                    // Database entry: one INSERT per CSV row
                    $query = "insert into dbo.Y2_Inventory (storenumber, stylecode, colour, size, units, timestamp)
                        values('$data[0]', '$data[1]','$data[2]','$data[3]','$data[4]', '$timestamp')";
                    $stmt = $conn->query($query);
                    if (!$stmt) { echo $conn->error; }

                    // $query = "LOAD DATA LOCAL INFILE '$filename' INTO TABLE dbo.Y2_Inventory FIELDS TERMINATED BY ';' LINES TERMINATED BY '\n' (storenumber, stylecode, colour, size, units, timestamp)";
                    // $stmt = $conn->query( $query );
                }
                fclose($handle);

                // Move the processed file to another folder
                if (!rename($file, $dest . $filename)) {
                    echo "error copy";
                }
            }
        }
    }
}

echo "good !";

?>
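For reference, the commented-out LOAD DATA attempt above runs once per CSV row and lists timestamp as a file column even though it is not in the file. A sketch of how that statement would normally be used, once per file and outside the row loop, assuming local_infile is enabled on both client and server (a common reason LOAD DATA LOCAL fails):

// Sketch only: load the whole CSV in one statement instead of per-row INSERTs.
// Assumes $conn is a mysqli connection with LOCAL INFILE enabled, e.g. via
// mysqli_options($conn, MYSQLI_OPT_LOCAL_INFILE, true) before connecting.
$path = str_replace('\\', '/', $file); // forward slashes avoid escaping issues on Windows
$query = "LOAD DATA LOCAL INFILE '$path'
    INTO TABLE dbo.Y2_Inventory
    FIELDS TERMINATED BY ';'
    LINES TERMINATED BY '\\n'
    (storenumber, stylecode, colour, size, units)
    SET timestamp = '$timestamp'";
$stmt = $conn->query($query);
if (!$stmt) { echo $conn->error; }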

Answer 1:


Instead of issuing one INSERT per row, try inserting in batches.

You can do a bulk insert that takes n entries (say 1,000) and inserts them into the table in a single statement.

https://www.mysqltutorial.org/mysql-insert-multiple-rows/

This reduces the number of DB calls and thereby the overall time.
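As an illustrative sketch (not tested against the asker's schema), here is how the BI2_ inventory rows could be buffered and flushed as multi-row INSERTs of 1,000 VALUES tuples each, assuming $conn is a mysqli connection as the question's use of $conn->error suggests:

// Sketch: batch CSV rows into multi-row INSERT statements.
// Assumes $conn (mysqli) and $timestamp are set as in the question.
$batchSize = 1000;
$rows = [];

if (($handle = fopen($file, "r")) !== false) {
    fseek($handle, 3); // skip the UTF-8 BOM

    while (($data = fgetcsv($handle, 9000000, ";")) !== false) {
        if (empty(array_filter($data))) {
            continue;
        }

        // Escape each of the first five columns so the generated SQL stays valid
        $vals = array_map(function ($v) use ($conn) {
            return "'" . $conn->real_escape_string($v) . "'";
        }, array_slice($data, 0, 5));
        $rows[] = "(" . implode(",", $vals) . ",'$timestamp')";

        // Flush a full batch as one INSERT with many VALUES tuples
        if (count($rows) >= $batchSize) {
            $conn->query("insert into dbo.Y2_Inventory (storenumber, stylecode, colour, size, units, timestamp) values " . implode(",", $rows));
            $rows = [];
        }
    }

    // Flush the final partial batch
    if ($rows) {
        $conn->query("insert into dbo.Y2_Inventory (storenumber, stylecode, colour, size, units, timestamp) values " . implode(",", $rows));
    }

    fclose($handle);
}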

And with 80k entries there is also a chance that you exceed PHP's memory limit if you buffer the whole file at once.

You can overcome that by using generators in PHP: https://medium.com/@aashish.gaba097/database-seeding-with-large-files-in-laravel-be5b2aceaa0b

Although that example is written for Laravel, the CSV-reading code (the part that uses the generator) is framework-independent and the same logic can be used here.
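A minimal framework-free sketch of that generator idea (the function name readCsvRows is illustrative, not taken from the article):

// A generator that yields one CSV row at a time, so the whole
// file is never held in memory
function readCsvRows($path, $delimiter = ";") {
    if (($handle = fopen($path, "r")) === false) {
        return;
    }
    fseek($handle, 3); // skip the UTF-8 BOM, as in the question
    while (($data = fgetcsv($handle, 9000000, $delimiter)) !== false) {
        yield $data;
    }
    fclose($handle);
}

// Usage: iterate lazily and batch the inserts
foreach (readCsvRows($file) as $data) {
    // ... accumulate into $rows and flush every 1,000 rows as above ...
}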



Source: https://stackoverflow.com/questions/63057309/how-to-improve-the-speed-of-insertion-of-the-csv-data-in-a-database-in-php
