How to improve INSERT performance on a very large MySQL table

刺人心 2020-12-17 21:14

I am working on a large MySQL database and need to improve INSERT performance on one specific table. It contains about 200 million rows, and its structure is as follows:

4 answers
  • 2020-12-17 21:45

    I would like to point out one piece of documentation: the MySQL manual section "Speed of INSERT Statements".

  • 2020-12-17 21:54

    You can use the following methods to speed up inserts:

    1. If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements. If you are adding data to a nonempty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster (this variable applies to MyISAM tables only).

    2. When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using INSERT statements.

    3. Take advantage of the fact that columns have default values. Insert values explicitly only when the value to be inserted differs from the default. This reduces the parsing that MySQL must do and improves the insert speed.

    Reference: MySQL.com: 8.2.4.1 Optimizing INSERT Statements
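
    As a rough illustration of method 1 from the client side, here is a minimal Python sketch (not taken from the MySQL manual). It groups rows into batches and builds one multi-row INSERT per batch, using `%s` placeholders in the style accepted by common Python MySQL drivers. The helper names `build_batches` and `_render`, the batch size, and the sample rows are assumptions made for illustration; the `items` table and its columns are borrowed from another answer in this thread.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple


def _render(prefix: str, row_placeholder: str,
            batch: List[Sequence[object]]) -> Tuple[str, List[object]]:
    """Turn one batch of rows into a single multi-row INSERT plus flat params."""
    sql = prefix + ", ".join([row_placeholder] * len(batch))
    params = [value for row in batch for value in row]
    return sql, params


def build_batches(
    table: str,
    columns: Sequence[str],
    rows: Iterable[Sequence[object]],
    batch_size: int = 1000,  # assumption: tune this for your own workload
) -> Iterator[Tuple[str, List[object]]]:
    """Yield (sql, params) pairs, one multi-row INSERT per batch of rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Backtick-quote identifiers so reserved words such as `key` are accepted.
    column_list = ", ".join("`" + c.replace("`", "``") + "`" for c in columns)
    prefix = "INSERT INTO `{}` ({}) VALUES ".format(
        table.replace("`", "``"), column_list)
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"

    batch: List[Sequence[object]] = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match column count")
        batch.append(row)
        if len(batch) == batch_size:
            yield _render(prefix, row_placeholder, batch)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield _render(prefix, row_placeholder, batch)


if __name__ == "__main__":
    sample = [("a", "k1", 0), ("b", "k2", 0), ("c", "k3", 1)]
    for sql, params in build_batches("items", ["name", "key", "busy"],
                                     sample, batch_size=2):
        print(sql, params)
```

    Each (sql, params) pair could then be passed to a DB-API cursor, for example cursor.execute(sql, params). Very large batches can exceed the server's max_allowed_packet limit, so the batch size usually needs some tuning.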

  • 2020-12-17 21:59

    You could use

    load data local infile ''
    REPLACE
    into table 
    

    etc...

    The REPLACE keyword ensures that any incoming row that duplicates an existing primary or unique key value overwrites the old row with the new values. Add a SET updated_at=now() at the end and you're done.

    There is no need for the temporary table.
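
    To make the shape of the full statement concrete, here is a minimal Python sketch that assembles it as text, ready to hand to a driver. The helper names (`sql_string`, `sql_identifier`, `build_load_replace`) are hypothetical. The file name, table, and columns are borrowed from elsewhere in this thread. The comma-separated layout and the column order of the file are assumptions about data that was never shown.

```python
from typing import Sequence


def sql_string(value: str) -> str:
    """Quote text as a MySQL string literal (default backslash escaping assumed)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def sql_identifier(name: str) -> str:
    """Backtick-quote an identifier so reserved words such as `key` are accepted."""
    return "`" + name.replace("`", "``") + "`"


def build_load_replace(path: str, table: str, file_columns: Sequence[str]) -> str:
    """Build a LOAD DATA ... REPLACE statement that stamps updated_at on every row."""
    column_list = ", ".join(sql_identifier(c) for c in file_columns)
    return "\n".join([
        "LOAD DATA LOCAL INFILE " + sql_string(path),
        "REPLACE",
        "INTO TABLE " + sql_identifier(table),
        "FIELDS TERMINATED BY ','",  # assumption: comma-separated input file
        "(" + column_list + ")",     # assumption: columns appear in this order
        "SET updated_at = NOW()",
    ])


if __name__ == "__main__":
    print(build_load_replace("file_to_process.csv", "items",
                             ["name", "key", "busy", "created_at"]))
```

    Note that LOAD DATA LOCAL only works when local_infile is enabled on both the client and the server.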

  • 2020-12-17 22:09

    Your LINEAR KEY on name and the large indexes slow things down.

    LINEAR KEY needs to be calculated on every insert: http://dev.mysql.com/doc/refman/5.1/en/partitioning-linear-hash.html

    Can you show us some example data from file_to_process.csv? Maybe a better schema should be built.

    Edit: I looked more closely.

    INSERT INTO items (name, `key`, busy, created_at, updated_at) 
    (
        SELECT temp_items.name, temp_items.`key`, temp_items.busy, temp_items.created_at, temp_items.updated_at 
        FROM temp_items
    ) 
    

    This will probably create an on-disk temporary table, which is very slow, so you should avoid it if you want better performance. Alternatively, check MySQL configuration settings such as tmp-table-size and max-heap-table-size; they may be misconfigured.
