Best practice: Import MySQL file in PHP; split queries

Happy的楠姐 2020-11-29 00:08

I have a situation where I have to update a web site on a shared hosting provider. The site has a CMS. Uploading the CMS's files is pretty straightforward using FTP. However, I also have to import a large (for a PHP script) MySQL dump, and since the host only gives me FTP and PHP, the import has to run from a PHP script. What is the best practice for splitting the dump into individual queries and running them?

13 answers
  • 2020-11-29 00:32

    http://www.ozerov.de/bigdump/ was very useful for me for importing a 200+ MB SQL file.

    Note: the SQL file must already be present on the server so that the import can complete without interruption.

  • 2020-11-29 00:38

    Here is a memory-friendly function that should be able to split a big file into individual queries without needing to load the whole file at once:

    function SplitSQL($file, $delimiter = ';')
    {
        set_time_limit(0);

        if (is_file($file) === true)
        {
            $file = fopen($file, 'r');

            if (is_resource($file) === true)
            {
                $query = array();

                while (feof($file) === false)
                {
                    // Collect lines until one ends with the delimiter.
                    $query[] = fgets($file);

                    if (preg_match('~' . preg_quote($delimiter, '~') . '\s*$~iS', end($query)) === 1)
                    {
                        $query = trim(implode('', $query));

                        // Uses the legacy mysql_* extension (removed in PHP 7);
                        // see the mysqli sketch further down for a modern equivalent.
                        if (mysql_query($query) === false)
                        {
                            echo '<h3>ERROR: ' . $query . '</h3>' . "\n";
                        }

                        else
                        {
                            echo '<h3>SUCCESS: ' . $query . '</h3>' . "\n";
                        }

                        // Flush output so progress is visible while the import runs.
                        while (ob_get_level() > 0)
                        {
                            ob_end_flush();
                        }

                        flush();
                    }

                    // After a statement has run, $query is a string; reset the line buffer.
                    if (is_string($query) === true)
                    {
                        $query = array();
                    }
                }

                return fclose($file);
            }
        }

        return false;
    }
    

    I tested it on a big phpMyAdmin SQL dump and it worked just fine.
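
    For reference, a minimal call sketch, assuming a legacy mysql_* connection has already been opened (the credentials and file name are placeholders):

    // Hypothetical usage: open a legacy mysql_* connection first,
    // then let SplitSQL() stream the dump and run each statement.
    mysql_connect('localhost', 'db_user', 'db_pass');
    mysql_select_db('db_name');
    SplitSQL('dump.sql');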


    Some test data:

    CREATE TABLE IF NOT EXISTS "test" (
        "id" INTEGER PRIMARY KEY AUTOINCREMENT,
        "name" TEXT,
        "description" TEXT
    );
    
    BEGIN;
        INSERT INTO "test" ("name", "description")
        VALUES (";;;", "something for you mind; body; soul");
    COMMIT;
    
    UPDATE "test"
        SET "name" = "; "
        WHERE "id" = 1;
    

    And the respective output:

    SUCCESS: CREATE TABLE IF NOT EXISTS "test" ( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "name" TEXT, "description" TEXT );
    SUCCESS: BEGIN;
    SUCCESS: INSERT INTO "test" ("name", "description") VALUES (";;;", "something for you mind; body; soul");
    SUCCESS: COMMIT;
    SUCCESS: UPDATE "test" SET "name" = "; " WHERE "id" = 1;
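
    Note that mysql_query() belongs to the legacy mysql_* extension, which was removed in PHP 7. A hedged sketch of what the execution step could look like with mysqli instead, assuming an existing connection in $mysqli (not part of the original function):

    // Sketch: the mysql_query() branch inside SplitSQL() adapted to mysqli.
    if ($mysqli->query($query) === false)
    {
        echo '<h3>ERROR: ' . $query . '</h3>' . "\n";
    }
    else
    {
        echo '<h3>SUCCESS: ' . $query . '</h3>' . "\n";
    }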
    
  • 2020-11-29 00:41

    I ran into the same problem. I solved it using a regular expression:

    function splitQueryText($query) {
        // the regex needs a trailing semicolon
        $query = trim($query);
    
        if (substr($query, -1) != ";")
            $query .= ";";
    
        // I spent 3 days figuring out this line
        preg_match_all("/(?>[^;']|(''|(?>'([^']|\\')*[^\\\]')))+;/ixU", $query, $matches, PREG_SET_ORDER);
    
        $querySplit = array(); // collect the individual statements here
    
        foreach ($matches as $match) {
            // get rid of the trailing semicolon
            $querySplit[] = substr($match[0], 0, -1);
        }
    
        return $querySplit;
    }
    
    $queryList = splitQueryText($inputText);
    
    foreach ($queryList as $query) {
        $result = mysql_query($query);
    }
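
    A quick sanity check, using a hypothetical input that contains a semicolon inside a quoted string:

    // Hypothetical input: the semicolon inside the quoted value
    // should not cause an extra split.
    $inputText = "INSERT INTO t (msg) VALUES ('a; b'); DELETE FROM t WHERE id = 1";
    print_r(splitQueryText($inputText));
    // Should print two statements, with 'a; b' left intact.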
    
  • 2020-11-29 00:41

    Splitting a query cannot be reliably done without parsing. Here is valid SQL that would be impossible to split correctly with a regular expression.

    SELECT ";"; SELECT ";\"; a;";
    SELECT ";
        abc";
    

    I wrote a small SqlFormatter class in PHP that includes a query tokenizer. I added a splitQuery method to it that splits all queries (including the above example) reliably.

    https://github.com/jdorn/sql-formatter/blob/master/SqlFormatter.php

    You can remove the format and highlight methods if you don't need them.

    One downside is that it requires the whole SQL string to be in memory, which could be a problem if you're working with huge SQL files. I'm sure that with a little bit of tinkering you could make the getNextToken method work on a file pointer instead.
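
    A minimal usage sketch, assuming splitQuery is callable as a static method returning an array of statements (check the class for the exact signature); the dump file name and the $mysqli connection are placeholders:

    require_once 'SqlFormatter.php';

    // Loads the whole dump into memory (the downside mentioned above),
    // splits it into statements, and runs them one by one.
    $sql = file_get_contents('dump.sql');
    foreach (SqlFormatter::splitQuery($sql) as $query) {
        $mysqli->query($query);
    }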

  • 2020-11-29 00:47

    What do you think about:

    system("cat xxx.sql | mysql -u username database");
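
    For what it's worth, a shell redirect does the same thing without the extra cat. The credentials below are placeholders, and many shared hosts disable system()/exec() entirely:

    // Hedged variant: let the mysql client read the file directly.
    // "username", "secret" and "database" are placeholders.
    system("mysql -u username -psecret database < xxx.sql");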
    
  • 2020-11-29 00:48

    Export

    The first step is getting the input in a sane format for parsing when you export it. From your question it appears that you have control over the exporting of this data, but not the importing.

    ~: mysqldump test --opt --skip-extended-insert | grep -v '^--' | grep . > test.sql
    

    This dumps the test database into test.sql, excluding all comment lines and blank lines. It also disables extended inserts, meaning there is one INSERT statement per line. This will help limit memory usage during the import, but at the cost of import speed.

    Import

    The import script is as simple as this:

    <?php
    
    $mysqli = new mysqli('localhost', 'hobodave', 'p4ssw3rd', 'test');
    $handle = fopen('test.sql', 'rb');
    if ($handle) {
        while (!feof($handle)) {
            // This assumes you don't have a row that is > 1MB (1000000)
            // which is unlikely given the size of your DB
            // Note that it has a DIRECT effect on your script's memory
            // usage.
            $buffer = stream_get_line($handle, 1000000, ";\n");
            $mysqli->query($buffer);
        }
    }
    echo "Peak MB: ",memory_get_peak_usage(true)/1024/1024;
    

    This will utilize an absurdly low amount of memory as shown below:

    daves-macbookpro:~ hobodave$ du -hs test.sql 
     15M    test.sql
    daves-macbookpro:~ hobodave$ time php import.php 
    Peak MB: 1.75
    real    2m55.619s
    user    0m4.998s
    sys 0m4.588s
    

    What that says is you processed a 15MB mysqldump with a peak RAM usage of 1.75 MB in just under 3 minutes.
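
    If you want failed statements to be visible rather than silently ignored, the query call inside the loop above could be extended along these lines (a hedged sketch reusing the $mysqli and $buffer variables from the script):

    // Skip empty chunks and report any statement that fails.
    if ($buffer !== false && trim($buffer) !== '' && $mysqli->query($buffer) === false) {
        echo "ERROR: ", $mysqli->error, "\n";
    }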

    Alternate Export

    If you have a high enough memory_limit and this is too slow, you can try the following export instead:

    ~: mysqldump test --opt | grep -v '^--' | grep . > test.sql
    

    This will allow extended inserts, which insert multiple rows in a single query. Here are the statistics for the same database:

    daves-macbookpro:~ hobodave$ du -hs test.sql 
     11M    test.sql
    daves-macbookpro:~ hobodave$ time php import.php 
    Peak MB: 3.75
    real    0m23.878s
    user    0m0.110s
    sys 0m0.101s
    

    Notice that it uses over 2x the RAM at 3.75 MB, but takes roughly 1/7th as long. I suggest trying both methods and seeing which suits your needs.

    Edit:

    I was unable to get a newline to appear literally in any mysqldump output using any of CHAR, VARCHAR, BINARY, VARBINARY, and BLOB field types. If you do have BLOB/BINARY fields though then please use the following just in case:

    ~: mysqldump5 test --hex-blob --opt | grep -v '^--' | grep . > test.sql
    