Why did changing from utf8 to utf8mb4 slow down my database?

你说的曾经没有我的故事 提交于 2021-01-02 08:23:40

问题


All the MySQL tables in my PHP web application are MyISAM with utf8 encoding. Since records can be generated from a companion app while it's offline, my table keys are randomly generated, alphanumeric VARCHARs; these fields are set to binary with utf8_bin encoding so they can be case-sensitive.

I recently decided to change the encoding of all my text fields, to support emojis that some users like to enter. I went ahead and changed all utf8 fields to utf8mb4, including the keys. I immediately started seeing performance issues, where complex SELECT queries on one of the larger tables took more than a minute, and then other queries queued up waiting for table locks. I changed the encoding of the primary key field on that table back to utf8, and performance returned to normal. A couple days later, I changed that one field to utf8mb4 again, the queries started queueing up again, and I changed it back to restore the normal performance.

So this runs smoothly:

`ID` varchar(8) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT ''

But this causes problems:

`ID` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL DEFAULT ''

Everything I've read says utf8 and utf8mb4 should have equivalent performance, but I'm seeing a distinct difference in my case. Does this make sense?

It's not really a problem to keep the key fields at utf8, since I don't foresee ever using more than simple alphanumeric characters there. But I would have liked to have all the fields set to the same encoding just for consistency and simplicity of maintenance (don't have to remember to set user-populated fields to one encoding and key fields to another encoding).

Regarding the comment from @MandyShaw

When I work with the database with the Sequel Pro Mac app, the console constantly shows pairs of SET NAMES 'utf8' and SET NAMES 'utf8mb4' entries, so that does suggest not everything is set correctly. However, here's what I have currently:

MySQL [(none)]> SHOW GLOBAL VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+

I read that character_set_system can't be changed from utf8 and character_set_filesystem should be binary.

Sequel Pro's connection encoding was set to Autodetect, but when I change it explicitly to utf8mb4, then open a new connection, I still see all those encoding changes in the console.

Is there something else I need to change to use this encoding consistently?


回答1:


utf is really utfmb3 and may use max 3 bytes per character while utfmb4 may use 4 bytes per character. For VARCHAR columns this does not normally much difference since MySQL will store only as many bytes as needed (unless you have created your MyISAM tables with ROW_FORMAT=FIXED).

However, during query execution, MySQL may create temporary tables in the MEMORY storage engine which does not support variable-length rows. These temporary tables have a maximum size, and if that size is exceeded, the temporary tables will be converted to tables in MyISAM/InnoDB (depending on your version of MySQL). The status variable Created_tmp_disk_tables will be incremented each time this happens. If so, try to see if it helps to increase the value of max_heap_table_size and tmp_table_size.

Alternatively, upgrade to MySQL 8.0 where a new storage engine that supports variable-length rows is used for internal temporary tables.



来源:https://stackoverflow.com/questions/50849672/why-did-changing-from-utf8-to-utf8mb4-slow-down-my-database

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!