VARCHAR vs TEXT performance when data fits on row

丶灬走出姿态 提交于 2019-11-29 13:45:59

问题


mysql> desc temp1;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| value | varchar(255) | YES  |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+

mysql> desc temp2;
+-------+------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------+------+-----+---------+-------+
| value | text | YES  |     | NULL    |       |
+-------+------+------+-----+---------+-------+

255 - 'a' characters in each row(In both tables)

mysql> select * from temp1 limit 1;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| value                                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

mysql> select * from temp2 limit 1;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| value                                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Query table 1:

select count(*) from temp1 where value like '%a';

Query table 2:

select count(*) from temp2 where value like '%a';

Stats:

No of records---temp1(varchar)---temp2(text)


2097152---------6.08(sec)--------6.91(sec)          
4194304---------12.42(sec)-------13.66(sec)
8388608---------25.08(sec)-------28.03(sec)
16777216--------52.82(sec)-------56.88(sec)
33554432--------1(min)50.17(sec)-1(min)59.36(sec)

My question: How can the difference in execution speed be explained?

The rows contents are same in both tables.

As I understood VarChar and Text columns keep contents offPage only when it exceeds row size. So both tables contents will be inline data for my page size(16kb). Then what was the reason for this query execution time difference.

Note: Both table column is not indexed

Row Format - DYNAMIC

Collation - UTF8mb3

Character set - utf8_general_ci

Storage engine -  innodb

Mysql - 5.7

Reference link: https://stackoverflow.com/a/48301727/5431418

Update: Same flow now I tried with 5000 characters ('a') in both tables the result difference is high.

2097152---------1(min)53.63(sec)--------2(min)4.66(sec)    

Update 2: Same flow now I tried with 2 characters ('a') in both tables still there is a performance difference

Adding table status:

mysql> select * FROM information_schema.tables  WHERE table_schema = "db67006db" and table_name = 'temp1';
+---------------+--------------+------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | TABLE_TYPE | ENGINE | VERSION | ROW_FORMAT | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | MAX_DATA_LENGTH | INDEX_LENGTH | DATA_FREE | AUTO_INCREMENT | CREATE_TIME         | UPDATE_TIME | CHECK_TIME | TABLE_COLLATION | CHECKSUM | CREATE_OPTIONS | TABLE_COMMENT |
+---------------+--------------+------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------------+
| def           | db67006db    | temp1      | BASE TABLE | InnoDB |      10 | Dynamic    |   30625036 |            315 |  9659482112 |               0 |            0 | 425721856 |           NULL | 2019-09-23 20:20:17 | NULL        | NULL       | utf8_general_ci |     NULL |                |               |
+---------------+--------------+------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------------+
1 row in set (0.01 sec)

mysql> select * FROM information_schema.tables  WHERE table_schema = "db67006db" and table_name = 'temp2';
+---------------+--------------+------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | TABLE_TYPE | ENGINE | VERSION | ROW_FORMAT | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | MAX_DATA_LENGTH | INDEX_LENGTH | DATA_FREE | AUTO_INCREMENT | CREATE_TIME         | UPDATE_TIME | CHECK_TIME | TABLE_COLLATION | CHECKSUM | CREATE_OPTIONS | TABLE_COMMENT |
+---------------+--------------+------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------------+
| def           | db67006db    | temp2      | BASE TABLE | InnoDB |      10 | Dynamic    |   30922268 |            315 |  9753853952 |               0 |            0 | 425721856 |           NULL | 2019-09-23 20:20:12 | NULL        | NULL       | utf8_general_ci |     NULL |                |               |
+---------------+--------------+------------+------------+--------+---------+------------+------------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------------+

回答1:


let's use some tools

Since the initial hunch (see below) was a miss, try running your query via MySQL Workbench in order to gather Query Performance Stats.


initial hunch (no result)

Just a thought:

  • TEXT column size on disk is 2 + N bytes where N is the length of the string
  • VARCHAR takes 1 + N bytes (for N ≤ 255) or 2 + N bytes (for 256 ≤ N ≤ 65535)

Try extending the size of the text in the column above 256 characters and re-run your tests. Potentially they will run with performance more closely matched.

Please also mind that the differences you post are expressed in microseconds per record, so there could be many OS events getting in the way or very simple if (TEXT) {do some additional IO or housekeeping} code path in the source.




回答2:


TEXT type will be always slower than VARCHAR because those types have different methods of storage. VARCHAR field stored in the table with all columns but TEXT stored differently. Each TEXT value is a separate object. It means if you want to do something with TEXT value MySQL will make additional operations to get that object.

Quote from the official documentation:

Each BLOB or TEXT value is represented internally by a separately allocated object. This is in contrast to all other data types, for which storage is allocated once per column when the table is opened.




回答3:


With respect to storage, InnoDB will handle VARCHAR and TEXT much the same when both stored inline. However, when fetching the data from InnoDB, the server will allocate space for all VARCHAR columns before query execution. While space for TEXT columns will only be allocated if they are actually read, where DYNAMIC memory allocation takes time.

https://forums.mysql.com/read.php?24,645115,645164#msg-645164




回答4:


Your first case assumption is not correct. Based on Storage Requirements TEXT stores one more byte for 255 a than VARCHAR so for 33554432 records in your table you need 33554432 more bytes to load in memory and explains the time delay.

That of course would not apply for 5000 a where based on the same documentation the size is the same L + 2 bytes. But i think the reason for that delay is described in Row Size Limits where it writes:

The internal representation of a MySQL table has a maximum row size limit of 65,535 bytes, even if the storage engine is capable of supporting larger rows. BLOB and TEXT columns only contribute 9 to 12 bytes toward the row size limit because their contents are stored separately from the rest of the row.

I think it is quite different to be part of the row data and to be stored separately(need some time to retrieve it from the location stored) and this explains the time delay.



来源:https://stackoverflow.com/questions/58065891/varchar-vs-text-performance-when-data-fits-on-row

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!