What is the best way to periodically load data into table

Posted by 雨燕双飞 on 2019-12-24 11:29:31

Question


I have a database with static tables that need to be updated from CSV files weekly. The tables are MySQL MyISAM, and by static I mean they are read-only (except when updated from CSV, obviously).

There are about 50 tables and about 200 MB of data in total to be reloaded weekly.

I can think of three ways:

  1. Truncate table
  2. Load data from files

Or

  1. For each table create a temporary table
  2. Load data there
  3. Truncate (or delete rows?) original table
  4. Insert into original table select * from temporary table.

Or

  1. Create table_new and load data there
  2. Rename original table to table_old (or drop table altogether)
  3. Rename table_new into original table

What do you reckon is the most efficient way?


Answer 1:


Have you considered using mysqlimport? You can read about it here: http://dev.mysql.com/doc/refman/5.1/en/mysqlimport.html

I probably wouldn't drop the original tables, because then you have to re-create all your foreign keys, indexes, constraints, etc., which is a mess and a maintenance nightmare. Renaming tables can also cause problems (for example, if you have synonyms for the tables, though I'm not sure whether MySQL has synonyms).

What I would do, however, is disable the keys before loading the data.

ALTER TABLE tbl_name DISABLE KEYS 

In other words, when loading the data you don't want it to be trying to update indexes because that will slow down the load. You want the indexes updated once the load is completed.
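A minimal sketch of that load sequence, assuming a hypothetical table `products` and a hypothetical comma-separated file `/data/weekly/products.csv` with a header row. On MyISAM, `DISABLE KEYS` only suspends maintenance of non-unique indexes; primary and unique keys are still checked during the load. `LOCAL` reads the file from the client machine and needs local infile support enabled.

```sql
-- Hypothetical table and file names; adjust the field and line
-- options to match the real CSV layout.
ALTER TABLE products DISABLE KEYS;   -- stop maintaining non-unique indexes

LOAD DATA LOCAL INFILE '/data/weekly/products.csv'
INTO TABLE products
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;                      -- skip the header row

ALTER TABLE products ENABLE KEYS;    -- rebuild the disabled indexes in one pass
```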

So I think by combining mysqlimport with the tip above, you should be able to get a really efficient load.




Answer 2:


You could always do INSERT INTO ... ON DUPLICATE KEY UPDATE ... or REPLACE INTO .... You shouldn't get any downtime (as you would between a TRUNCATE and an INSERT), and there's very little chance of corruption.

Be careful with REPLACE, since it actually deletes each conflicting record and re-inserts it. That fires any delete and insert triggers you may have (unlikely in this case), and the row can also get a new ID if the table has an auto-increment field whose value you do not supply.
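To make the difference concrete, here is a small sketch of both statements. The table `products` and its columns `id` (primary key), `name`, and `price` are hypothetical names chosen for illustration.

```sql
-- Updates the existing row in place when id = 1 already exists,
-- otherwise inserts a new row.
INSERT INTO products (id, name, price)
VALUES (1, 'Widget', 9.99)
ON DUPLICATE KEY UPDATE
    name  = VALUES(name),
    price = VALUES(price);

-- Deletes any row that conflicts on the primary key or a unique index,
-- then inserts the new row.
REPLACE INTO products (id, name, price)
VALUES (1, 'Widget', 9.99);
```

Neither statement removes rows that are absent from the new data, so this approach fits best when the weekly file only adds or changes records.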




Answer 3:


Your third option is the best. You can LOCK and DISABLE KEYS on the _new table while importing, and it will be extra quick. You can even do a "batch atomic rename" of all your new tables to the "current ones", with zero downtime, if they have relations between them.

I'm assuming the whole tables are contained in the weekly CSV updates (i.e. they're not incremental).
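A sketch of the batch rename, assuming two hypothetical tables `products` and `orders` whose freshly loaded copies are named `products_new` and `orders_new`. A single `RENAME TABLE` statement with several pairs is applied as one atomic operation, so readers never see a mix of old and new tables.

```sql
-- Hypothetical table names. The pairs are processed left to right,
-- and the whole statement either succeeds or leaves everything unchanged.
RENAME TABLE
    products     TO products_old,
    products_new TO products,
    orders       TO orders_old,
    orders_new   TO orders;
```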




Answer 4:


I would prefer the 3rd method and also keep the old table.

  1. create table_new
  2. drop table_old if exists
  3. rename table to table_old
  4. rename table_new to table

The advantage of this method is that it is fast and safe, with little effect on readers. Creating the new table does not affect reads on the existing table. The rename operation is fast (just a file rename in the case of MyISAM), so the downtime is minimal and clients are barely affected. You also get to keep the old data in case something is wrong with the new data.

As you are not going to update the tables online, I think it would be good to run myisampack on them, which compresses the tables and makes them read-only.
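The four steps above might look like this for one table. The names `products`, `products_new`, `products_old`, and the file path are hypothetical. `CREATE TABLE ... LIKE` is one way to create the new table; it copies the column definitions and indexes of the original, so they do not have to be re-declared by hand.

```sql
-- 1. Create the new table with the same structure and load the data into it
CREATE TABLE products_new LIKE products;

LOAD DATA LOCAL INFILE '/data/weekly/products.csv'
INTO TABLE products_new
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

-- 2. Discard last week's backup copy, if there is one
DROP TABLE IF EXISTS products_old;

-- 3 and 4. Swap the tables in a single statement, keeping the old data
RENAME TABLE products TO products_old, products_new TO products;
```

If the new data turns out to be wrong, renaming `products_old` back to `products` restores the previous week's state.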



Source: https://stackoverflow.com/questions/1972944/what-is-the-best-way-to-periodically-load-data-into-table
