1 very large table or 3 large tables? MySQL Performance

左心房为你撑大大i submitted on 2021-02-10 20:42:58

Question


Assume a very large database. A table with 900 million records.

Method A:
Table: Posts

+----------+---------------+------------------+----------------+
| id (int) | item_id (int) | post_type (ENUM) | Content (TEXT) |
+----------+---------------+------------------+----------------+
|    1     |      1        |       user       |  some text ... |
+----------+---------------+------------------+----------------+
|    2     |      1        |       page       |  some text ... |
+----------+---------------+------------------+----------------+
|    3     |      1        |       group      |  some text ... |

// row 1 : User with ID 1 has a post with ID #1
// row 2 : Page with ID 1 has a post with ID #2
// row 3 : Group with ID 1 has a post with ID #3

The goal is to display 20 records, drawn from all 3 post_types, on a page.

SELECT * FROM posts LIMIT 20

But I am worried about the number of records with this method.

Method B:
Split the 900 million records into 3 tables with 300 million records each.

Table: User Posts

+----------+---------------+----------------+
| id (int) | user_id (int) | Content (TEXT) |
+----------+---------------+----------------+
|    1     |      1        |  some text ... |
+----------+---------------+----------------+
|    2     |      2        |  some text ... |
+----------+---------------+----------------+
|    3     |      3        |  some text ... |

Table: Page Posts

+----------+---------------+----------------+
| id (int) | page_id (int) | Content (TEXT) |
+----------+---------------+----------------+
|    1     |      1        |  some text ... |
+----------+---------------+----------------+
|    2     |      2        |  some text ... |
+----------+---------------+----------------+
|    3     |      3        |  some text ... |

Table: Group Posts

+----------+----------------+----------------+
| id (int) | group_id (int) | Content (TEXT) |
+----------+----------------+----------------+
|    1     |      1         |  some text ... |
+----------+----------------+----------------+
|    2     |      2         |  some text ... |
+----------+----------------+----------------+
|    3     |      3         |  some text ... |

Now, to get a list of 20 posts to display:

SELECT * FROM User_Posts LIMIT 10;
SELECT * FROM Page_Posts LIMIT 10;
SELECT * FROM Group_Posts LIMIT 10;

// then build an array or object from the results and display it in the output.

With this method, I have to sort the results in an array in PHP and then send them to the page.

Which method is preferred?
Will splitting a 900-million-record table into three tables affect read and write speed in MySQL?
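
The application-side merge that Method B requires can be sketched as follows. This is a minimal illustration, not part of the original question: it is written in Python rather than PHP, and the `created_at` sort key, the `type` field, and the `newest_posts` helper are all hypothetical, since the tables above have no timestamp column. It assumes each per-table query already returns its rows sorted newest-first. Each query would need to fetch up to 20 rows (not 10), because all 20 newest posts could come from a single table.

```python
import heapq
from itertools import islice


def newest_posts(sources, limit=20):
    """Merge per-table result lists (each sorted newest-first) and keep `limit` rows."""
    # heapq.merge streams the inputs in sorted order without re-sorting everything.
    merged = heapq.merge(*sources, key=lambda row: row["created_at"], reverse=True)
    return list(islice(merged, limit))


# Hypothetical rows, as if each list came from
# "SELECT ... ORDER BY created_at DESC LIMIT 20" on one of the three tables.
user_posts = [
    {"type": "user", "id": 3, "created_at": 90},
    {"type": "user", "id": 1, "created_at": 10},
]
page_posts = [{"type": "page", "id": 2, "created_at": 70}]
group_posts = [
    {"type": "group", "id": 5, "created_at": 80},
    {"type": "group", "id": 4, "created_at": 20},
]

latest = newest_posts([user_posts, page_posts, group_posts], limit=3)
print([(p["type"], p["id"]) for p in latest])
# → [('user', 3), ('group', 5), ('page', 2)]
```

With Method A the same listing is a single ordered query with a LIMIT, so this merge step only exists when the data is split across tables.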


Answer 1:


This is really a discussion of Single Table Inheritance versus Table-per-Class Inheritance, and it leaves out Joined Inheritance. The first corresponds to your Method A and the second to your Method B. A Method C would keep the IDs of all posts in one table and move the attributes specific to group or user posts into separate tables.

A very large table always has drawbacks when full table scans occur, but splitting tables has drawbacks of its own. Which one hurts more depends on how often your application needs to read the whole list of posts versus only certain post types.

Another thing to consider is data partitioning, which is available in MySQL and Oracle Database, for example. Partitioning organizes the data within a table and opens up options for information lifecycle management: which data is accessed when and how often, and whether part of it can be moved and compressed, which reduces database size and speeds up access to the data that remains in the table. There are three major techniques: range-based, list-based, and hash-based partitioning. A less commonly supported feature for reducing table size is inserting rows with a timestamp so that the data is invalidated automatically after a certain time period has expired.

A major application design decision that can indeed boost performance is to distinguish between read and write accesses to the database at the application level. Consider a MySQL backend: because write accesses are more critical to database performance than read accesses, you could set up one MySQL instance for writes and another, a replica of the first, for reads. This is debatable, though, mainly when it comes to RDT (real-time decisions), where absolute consistency of the data at any given time is a must.

Using object pools as a layer between your application and the database is another technique for improving application performance, though I don't know of existing solutions in the PHP world yet. Oracle Hot Cache is a fairly sophisticated example. You could, however, build your own on top of an in-memory database or using memcache.
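
Returning to the joined-inheritance layout (Method C) described at the start of this answer, a minimal sketch might look like the following. It uses Python with an in-memory SQLite database purely as a runnable stand-in for MySQL; the table and column names (`posts`, `user_posts`, `post_id`, and so on) are assumptions chosen to mirror the question, and a real MySQL schema would use its own column types and an ENUM for `post_type`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
-- Shared table: one row per post, regardless of type.
CREATE TABLE posts (
    id        INTEGER PRIMARY KEY,
    post_type TEXT NOT NULL CHECK (post_type IN ('user', 'page', 'group')),
    content   TEXT NOT NULL
);
-- One detail table per type, keyed by the shared post id.
CREATE TABLE user_posts  (post_id INTEGER PRIMARY KEY REFERENCES posts(id), user_id  INTEGER NOT NULL);
CREATE TABLE page_posts  (post_id INTEGER PRIMARY KEY REFERENCES posts(id), page_id  INTEGER NOT NULL);
CREATE TABLE group_posts (post_id INTEGER PRIMARY KEY REFERENCES posts(id), group_id INTEGER NOT NULL);

-- Same sample data as the question: user 1, page 1 and group 1 each own one post.
INSERT INTO posts VALUES (1, 'user', 'some text'), (2, 'page', 'some text'), (3, 'group', 'some text');
INSERT INTO user_posts  VALUES (1, 1);
INSERT INTO page_posts  VALUES (2, 1);
INSERT INTO group_posts VALUES (3, 1);
""")

# Mixed listing: touches only the shared table.
mixed = conn.execute(
    "SELECT id, post_type FROM posts ORDER BY id DESC LIMIT 20"
).fetchall()

# Type-specific listing: join the shared table to one detail table.
user_only = conn.execute(
    "SELECT p.id, u.user_id, p.content "
    "FROM posts p JOIN user_posts u ON u.post_id = p.id"
).fetchall()

print(mixed)      # → [(3, 'group'), (2, 'page'), (1, 'user')]
print(user_only)  # → [(1, 1, 'some text')]
```

The mixed listing never needs a join or an application-side merge, while type-specific attributes stay out of the shared table. The cost is one extra join whenever those attributes are needed.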
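
As a rough illustration of building such a layer yourself, here is a small read-through cache with a time-to-live. It is only a sketch under stated assumptions: it is written in Python rather than PHP, a plain dictionary stands in for memcache or an in-memory database, and `ReadThroughCache` and `load_post` are hypothetical names (the latter playing the role of the real database query).

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory; call the loader only on a miss or after expiry."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader      # function that fetches the value from the database
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh cache hit
        value = self._loader(key)  # miss or stale entry: go to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, e.g. right after the application writes to that row."""
        self._entries.pop(key, None)


db_calls = []  # records every simulated database hit


def load_post(post_id):
    # Hypothetical stand-in for "SELECT ... FROM posts WHERE id = ?".
    db_calls.append(post_id)
    return {"id": post_id, "content": "some text ..."}


cache = ReadThroughCache(load_post, ttl_seconds=30.0)
cache.get(1)
cache.get(1)          # second read is served from memory
print(len(db_calls))  # → 1
```

The time-to-live bounds how stale a cached read can be, which is the same consistency trade-off raised above for read replicas.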



Source: https://stackoverflow.com/questions/20976146/1-very-large-table-or-3-large-table-mysql-performance
