Handling very large data with MySQL

离开以前 2020-12-02 05:55

Sorry for the long post!

I have a database containing ~30 tables (InnoDB engine). Only two of these tables, namely "transaction" and "shift", are quite large (th…

2 Answers
  •  我在风中等你
    2020-12-02 06:05

    • Can MySQL reasonably perform queries on billions of rows? -- MySQL can 'handle' billions of rows. "Reasonably" depends on the queries; let's see them.

    • Is InnoDB (MySQL 5.5.8) the right choice for multi-billion rows? -- 5.7 has some improvements, but 5.5 is pretty good, in spite of being nearly 8 years old and on the verge of no longer being supported.

    • Best data store for billions of rows -- If you mean 'Engine', then InnoDB.

    • How big can a MySQL database get before performance starts to degrade -- Again, that depends on the queries. I can show you a 1K-row table that will melt down; I have worked with billion-row tables that hum along.

    • Why MySQL could be slow with large tables? -- range scans lead to I/O, which is the slow part.

    • Can MySQL handle tables which will hold about 300 million records? -- Again, yes. The limit is somewhere around a trillion rows.

    • (for InnoDB tables, which is my case) increasing innodb_buffer_pool_size (e.g., up to 80% of RAM). Also, I found some other MySQL performance tuning settings in the Percona blog -- yes
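
    A minimal sketch of checking and raising the buffer pool. The 24G figure is a placeholder (roughly 70-80% of RAM on a dedicated database server); note that SET GLOBAL resizing only works on MySQL 5.7.5+, while on 5.5 the value must go in my.cnf and needs a restart:

    ```sql
    -- Check the current value (reported in bytes).
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

    -- MySQL 5.7.5+ can resize online; 24G here is a placeholder.
    SET GLOBAL innodb_buffer_pool_size = 24 * 1024 * 1024 * 1024;

    -- On 5.5/5.6 the setting is static: put it in my.cnf under [mysqld]
    -- and restart:
    --   innodb_buffer_pool_size = 24G
    ```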

    • having proper indexes on the table (using EXPLAIN on queries) -- well, let's see them. There are a lot of mistakes that can be made in this critical area.
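
    To illustrate the kind of thing to look for (the table and column names below are hypothetical, since the actual schema isn't shown): a composite index matched to the query's WHERE clause, with the equality column first and the range column second, verified with EXPLAIN.

    ```sql
    -- Hypothetical query shape against the big "transaction" table:
    EXPLAIN
    SELECT SUM(amount)
    FROM transaction
    WHERE station_id = 42
      AND trans_time >= '2020-11-01'
      AND trans_time <  '2020-12-01';

    -- Composite index: equality column first, range column second,
    -- so InnoDB can do a tight range scan instead of a full table scan.
    ALTER TABLE transaction ADD INDEX idx_station_time (station_id, trans_time);
    ```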

    • partitioning the table -- "Partitioning is not a panacea!" I harp on that in my blog.
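
    If partitioning is used at all, RANGE on a date column is the usual pattern, mainly because purging old data becomes a metadata-only DROP PARTITION instead of a huge DELETE. A sketch with hypothetical names:

    ```sql
    -- Monthly RANGE partitioning. Caveat: trans_time must be part of the
    -- primary key for this to be accepted (see the next point).
    ALTER TABLE transaction
        PARTITION BY RANGE (TO_DAYS(trans_time)) (
            PARTITION p202010 VALUES LESS THAN (TO_DAYS('2020-11-01')),
            PARTITION p202011 VALUES LESS THAN (TO_DAYS('2020-12-01')),
            PARTITION pfuture VALUES LESS THAN MAXVALUE
        );

    -- Purging a month is now instant, not a row-by-row DELETE:
    ALTER TABLE transaction DROP PARTITION p202010;
    ```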

    • MySQL Sharding -- Currently this is DIY

    • MySQL clustering -- Currently the best answer is some Galera-based option (PXC, MariaDB 10, DIY w/Oracle). Oracle's "Group Replication" is a viable contender.

    • Partitioning does not support FOREIGN KEY or "global" UNIQUE constraints.
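
    Concretely: every UNIQUE key (including the PRIMARY KEY) on a partitioned table must contain the partitioning column, so the database can no longer enforce that a single column is globally unique. A sketch with made-up table names:

    ```sql
    -- Rejected with ERROR 1503: a UNIQUE INDEX must include all columns
    -- in the table's partitioning function.
    CREATE TABLE t_bad (
        id INT NOT NULL,
        trans_time DATETIME NOT NULL,
        UNIQUE KEY (id)
    ) PARTITION BY RANGE (TO_DAYS(trans_time)) (
        PARTITION p0 VALUES LESS THAN MAXVALUE
    );

    -- Accepted, but uniqueness is now on (id, trans_time) -- "id alone
    -- is unique" is no longer guaranteed by the database.
    CREATE TABLE t_ok (
        id INT NOT NULL,
        trans_time DATETIME NOT NULL,
        UNIQUE KEY (id, trans_time)
    ) PARTITION BY RANGE (TO_DAYS(trans_time)) (
        PARTITION p0 VALUES LESS THAN MAXVALUE
    );
    ```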

    • UUIDs, at the scale you are talking about, will not just slow down the system, but actually kill it. Type 1 UUIDs may be a workaround.
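
    The usual version-1 workaround is to rearrange the UUID's time fields so values arrive in roughly chronological order, and to store them as BINARY(16), so inserts land near the "end" of the index instead of on random pages. A sketch (the table t and its BINARY(16) column id are hypothetical):

    ```sql
    -- MySQL 8.0: the second argument of UUID_TO_BIN swaps the time fields
    -- of a v1 UUID, making stored values roughly ascending.
    INSERT INTO t (id) VALUES (UUID_TO_BIN(UUID(), 1));

    -- MySQL 5.x equivalent of UUID_TO_BIN(u, 1), producing 16 bytes:
    SET @u = UUID();
    SELECT UNHEX(CONCAT(SUBSTR(@u, 15, 4), SUBSTR(@u, 10, 4), SUBSTR(@u, 1, 8),
                        SUBSTR(@u, 20, 4), SUBSTR(@u, 25, 12))) AS ordered_uuid;
    ```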

    • Insert and index-build speed -- There are too many variations to give a single answer. Let's see your tentative CREATE TABLE and how you intend to feed the data in.

    • Lots of joins -- "Normalize, but don't over-normalize." In particular, do not normalize datetimes or floats or other "continuous" values.
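
    For instance (hypothetical schema): pulling a long, repeated string like a station name into a lookup table is sensible normalization; pulling out trans_time would just add a join, since nearly every datetime value is distinct anyway.

    ```sql
    -- Worth normalizing: a long string repeated on millions of rows.
    CREATE TABLE station (
        station_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        name       VARCHAR(100) NOT NULL,
        UNIQUE KEY (name)
    );

    -- Not worth normalizing: "continuous" values stay inline on the big table.
    CREATE TABLE transaction (
        trans_id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        station_id SMALLINT UNSIGNED NOT NULL,   -- 2-byte reference to station
        trans_time DATETIME NOT NULL,            -- inline, not a lookup
        amount     DECIMAL(10,2) NOT NULL        -- inline, not a lookup
    );
    ```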

    • Do build summary tables.
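
    A sketch of the idea, with hypothetical names: roll the raw rows up once per day, then report from the small summary table instead of rescanning millions of transactions.

    ```sql
    CREATE TABLE daily_sales (
        station_id   SMALLINT UNSIGNED NOT NULL,
        sale_date    DATE NOT NULL,
        trans_count  INT UNSIGNED NOT NULL,
        total_amount DECIMAL(14,2) NOT NULL,
        PRIMARY KEY (station_id, sale_date)
    );

    -- Run once after each day closes:
    INSERT INTO daily_sales
    SELECT station_id, DATE(trans_time), COUNT(*), SUM(amount)
    FROM transaction
    WHERE trans_time >= CURDATE() - INTERVAL 1 DAY
      AND trans_time <  CURDATE()
    GROUP BY station_id, DATE(trans_time);
    ```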

    • 2.3 million transactions per day -- If that is 2.3M inserts (30/sec), then there is not much of a performance problem. If the workload is more complex, then RAID, SSDs, batching, etc., may be necessary.
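
    If ingest speed ever does become the bottleneck, batching is the first lever: a multi-row INSERT can be an order of magnitude faster than one statement per row, and LOAD DATA is faster still. A sketch (column names hypothetical; LOAD DATA INFILE is also subject to the server's secure_file_priv setting):

    ```sql
    -- Many rows per statement instead of one statement per row:
    INSERT INTO transaction (station_id, trans_time, amount) VALUES
        (1, '2020-12-01 08:00:00', 35.50),
        (2, '2020-12-01 08:00:01', 12.00),
        (1, '2020-12-01 08:00:03', 48.75);

    -- For bulk loads from a file, LOAD DATA is the fastest built-in path:
    LOAD DATA INFILE '/tmp/transactions.csv'
    INTO TABLE transaction
    FIELDS TERMINATED BY ','
    (station_id, trans_time, amount);
    ```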

    • deal with such volume of data -- If most activity is with the "recent" rows, then the buffer_pool will nicely 'cache' the activity, thereby avoiding I/O. If the activity is "random", then MySQL (or anyone else) will have I/O issues.

    • Shrinking the datatypes helps in a table like yours. I doubt if you need 4 bytes to specify fuel_type. There are multiple 1-byte approaches.
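
    Two of those 1-byte approaches, sketched for a column like fuel_type (the ENUM values are made up):

    ```sql
    -- Option 1: a TINYINT UNSIGNED code (0-255), decoded by a tiny
    -- lookup table or in the application.
    ALTER TABLE transaction MODIFY fuel_type TINYINT UNSIGNED NOT NULL;

    -- Option 2: a 1-byte ENUM, when the value set is small and stable.
    ALTER TABLE transaction
        MODIFY fuel_type ENUM('diesel', 'petrol', 'lpg', 'cng') NOT NULL;
    ```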
