Does wildcard in left-most column of composite index mean remaining columns in index aren't used in index lookup (MySQL)?

Posted by 魔方 西西 on 2019-12-31 07:27:06

Question


Imagine you have a primary composite index of last_name,first_name. Then you performed a search of WHERE first_name LIKE 'joh%' AND last_name LIKE 'smi%'.

Does the wildcard used in the last_name condition mean that the first_name condition will not be used in further helping MySQL find indexes? In other words, by putting a wildcard on the last_name condition MySQL will only do a partial index lookup (and ignores conditions given in the columns that are to the right of last_name)?

Further clarification of what I'm asking

Example-1: Primary key is last_name, first_name.
Example-2: Primary key is last_name.

Using this WHERE clause: WHERE first_name LIKE 'joh%' AND last_name LIKE 'smi%', would Example-1 be faster than Example-2?

Update

Here is an sqlfiddle: http://sqlfiddle.com/#!9/6e0154/3

CREATE TABLE `people1` (
    `id` INT(11),
    `first_name` VARCHAR(255) NOT NULL,
    `middle_name` VARCHAR(255) NOT NULL,
    `last_name` VARCHAR(255) NOT NULL,
    PRIMARY KEY (`id`),
    INDEX `name` (`last_name`(15), `first_name`(10))
  )
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB;

CREATE TABLE `people2` (
    `id` INT(11),
    `first_name` VARCHAR(255) NOT NULL,
    `middle_name` VARCHAR(255) NOT NULL,
    `last_name` VARCHAR(255) NOT NULL,
    PRIMARY KEY (`id`),
    INDEX `name` (`last_name`(15))
  )
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB;

INSERT INTO `people1` VALUES
(1,'John','','Smith'),(2,'Joe','','Smith'),(3,'Tom','','Smith'),(4,'George','','Washington');
INSERT INTO `people2` VALUES
(1,'John','','Smith'),(2,'Joe','','Smith'),(3,'Tom','','Smith'),(4,'George','','Washington');

# Query 1A
EXPLAIN SELECT * FROM `people1` WHERE `first_name` LIKE 'joh%' AND `last_name` LIKE 'smi%';
# Query 1B
EXPLAIN SELECT * FROM `people1` WHERE `first_name` LIKE 'joh%' AND `last_name` LIKE 'john';

# Query 2A
EXPLAIN SELECT * FROM `people2` WHERE `first_name` LIKE 'joh%' AND `last_name` LIKE 'smi%';
# Query 2B
EXPLAIN SELECT * FROM `people2` WHERE `first_name` LIKE 'joh%' AND `last_name` LIKE 'john';

Answer 1:


Here are your questions. Plural. Rephrasing them (with "in other words") just turns them into different questions. Doing so does not necessarily make it easier for responders. On the contrary.

Q1: [Title question] Does wildcard in left-most column of composite index mean remaining columns in index aren't used in index lookup (MySQL)?

A1: No, it does not mean that.


Q2: Does the wildcard used in the last_name condition mean that the first_name condition will not be used in further helping MySQL find indexes?

A2: No, it does not mean that. The tail of that question is also ambiguous: one possible reading would be answered by "it already knows which index to use."


Q3: In other words, by putting a wildcard on the last_name condition MySQL will only do a partial index lookup (and ignores conditions given in the columns that are to the right of last_name)?

A3: No. The right-most columns are served from the index, similar to a covering index strategy, which avoids the slower data page lookups.


Q4: ...would Example-1 be faster than Example-2?

A4: Yes. It is a covering index with regard to those columns. See covering indexes.

As an aside concerning Q4: it is irrelevant whether it is a PK or a non-PK. There are probably a dozen reasons why that as a PK would be dreadful for your application.


Original answer(s) below:

With only a composite key on (last_name, first_name) and a query such as the one you mention

WHERE first_name LIKE 'joh%'

... it won't use the index for a lookup at all. It will scan the whole table (or the whole index, if the index happens to cover the query), due to the absence of

  • a single column key on first_name
  • a composite key with first_name left-most

So table scan here we come.
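
As a minimal sketch of the left-most rule described above, the statements below compare EXPLAIN before and after adding an index whose left-most column is first_name. The table people_demo and both index names are assumptions made for illustration; they do not come from the question, and the plan you actually see depends on your MySQL version and data.

```sql
-- Hypothetical table; all names here are illustrative only.
CREATE TABLE people_demo (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    last_name VARCHAR(100) NOT NULL,
    first_name VARCHAR(100) NOT NULL,
    middle_name VARCHAR(100) NOT NULL,
    KEY idx_last_first (last_name, first_name)
) ENGINE=InnoDB;

-- first_name is not a left-most prefix of idx_last_first, so this predicate
-- cannot be turned into a range lookup on that index.
EXPLAIN SELECT * FROM people_demo WHERE first_name LIKE 'joh%';

-- An index with first_name as its left-most column gives the optimizer
-- a candidate range for the same predicate.
ALTER TABLE people_demo ADD INDEX idx_first_name (first_name);
EXPLAIN SELECT * FROM people_demo WHERE first_name LIKE 'joh%';
```

Compare the type, possible_keys and key columns of the two plans rather than the rows estimate.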

Please see the Manual page Multiple-Column Indexes to read more. And focus on the left-most concept of it. In fact, go to that page, and search on the word left.

See the Manual page on the EXPLAIN facility in MySQL. Also the article Using Explain to Write Better Mysql Queries.


Edit

There have been a few edits to the question since I was here an hour or two ago. I will leave you with the following. Run your actual query through EXPLAIN, and decipher the output using the Using Explain ... link above or another reference.

drop table if exists myNames;
create table myNames
(   id int auto_increment primary key,
    lastname varchar(100) not null,
    firstname varchar(100) not null,
    col4 int not null,
    key(lastname,firstname)
);
truncate table myNames;
insert myNames (lastName,firstName,col4) values
('Smith','John',1),('Smithers','JohnSomeone',1),('Smith3','John4324',1),('Smi','Jonathan',1),('Smith123x$FA','Joh',1),('Smi3jfif','jkdid',1),('r3','fe2',1);

-- Each statement below doubles the table: 7 rows * 2^16 = 458,752 rows.
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;
insert myNames (lastName,firstName,col4) select lastname,firstname,col4 from myNames;

select count(*) from myNames; 
-- 458k rows

select count(*)
from myNames
where lastname like 'smi%';
-- 393216 rows

select count(*)
from myNames
where lastname like 'smi%' and firstname like 'joh%';
-- 262144 rows

EXPLAIN renders voodoo numbers for rows. Voodoo? Yes: for a query that could potentially run for an hour, you are asking EXPLAIN to give you a fuzzy count without running it, and to give you that answer in 2 seconds or less. Don't treat these as the real counts you would get when the query is run for real, without EXPLAIN.

explain 
select count(*) 
from myNames 
where lastname like 'smi%';
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+
| id | select_type | table   | type  | possible_keys | key      | key_len | ref  | rows   | Extra                    |
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | myNames | range | lastname      | lastname | 302     | NULL | 233627 | Using where; Using index |
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+

explain 
select count(*) 
from myNames 
where lastname like 'smi%' and firstname like 'joh%';
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+
| id | select_type | table   | type  | possible_keys | key      | key_len | ref  | rows   | Extra                    |
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | myNames | range | lastname      | lastname | 604     | NULL | 233627 | Using where; Using index |
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+


-- the chunk below is of interest. Look at the Extra column

explain 
select count(*) 
from myNames 
where lastname like 'smi%' and firstname like 'joh%' and col4=1;
+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | myNames | ALL  | lastname      | NULL | NULL    | NULL | 457932 | Using where |
+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+

explain 
select count(*) 
from myNames 
where firstname like 'joh%';
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+
| id | select_type | table   | type  | possible_keys | key      | key_len | ref  | rows   | Extra                    |
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | myNames | index | NULL          | lastname | 604     | NULL | 453601 | Using where; Using index |
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+


analyze table myNames;
+----------------------+---------+----------+----------+
| Table                | Op      | Msg_type | Msg_text |
+----------------------+---------+----------+----------+
| so_gibberish.mynames | analyze | status   | OK       |
+----------------------+---------+----------+----------+

select count(*) 
from myNames where left(lastname,3)='smi';
-- 393216 -- the REAL #
select count(*) 
from myNames where left(lastname,3)='smi' and left(firstname,3)='joh';
-- 262144 -- the REAL #

explain 
select lastname,firstname 
from myNames  
where lastname like 'smi%' and firstname like 'joh%';
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+
| id | select_type | table   | type  | possible_keys | key      | key_len | ref  | rows   | Extra                    |
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | myNames | range | lastname      | lastname | 604     | NULL | 226800 | Using where; Using index |
+----+-------------+---------+-------+---------------+----------+---------+------+--------+--------------------------+



Answer 2:


Virtually everything said by @Drew assumes that the index is "covering".

INDEX(last_name, first_name)

is a "covering" index for

SELECT COUNT(*)   FROM t WHERE first_name LIKE 'joh%' AND last_name LIKE 'smi%';
SELECT last_name  FROM t WHERE first_name LIKE 'joh%' AND last_name LIKE 'smi%';
SELECT id         FROM t WHERE first_name LIKE 'joh%' AND last_name LIKE 'smi%'; -- if the table is InnoDB and `id` is the `PRIMARY KEY`.

But it is not "covering" for

SELECT foo ...
SELECT foo, last_name ...
etc.

This is because foo is not included in the index. For a non-covering situation, the answers are radically different:

Q1: [Title question] Does wildcard in left-most column of composite index mean remaining columns in index aren't used in index lookup (MySQL)?

A1: Yes, it does mean that.

Q2: Does the wildcard used in the last_name condition mean that the first_name condition will not be used in further helping MySQL find indexes?

A2: I'm lost in the vagueness. The optimizer will look at all indexes, not just the one in question. It will pick the 'best'.

Q3: In other words, by putting a wildcard on the last_name condition MySQL will only do a partial index lookup (and ignores conditions given in the columns that are to the right of last_name)?

A3: Yes. This seems to be a dup of Q1.

Q4: ...would Example-1 be faster than Example-2?

A4: No. In extreme situations, INDEX(last_name, first_name) will be slower than INDEX(last_name). Either example will use only the first part (last_name) of the index. However, the composite index is bigger on disk. For a huge table, this may lead to a smaller percentage of it being cached, hence more disk hits, hence slower.
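
The covering versus non-covering distinction drawn in this answer can be checked with EXPLAIN. The following is only a sketch: the table t_demo, its index name and the column foo are hypothetical stand-ins for the t and foo used above, and the exact Extra text varies between MySQL versions.

```sql
-- Hypothetical table; foo stands for any column that is not in the index.
CREATE TABLE t_demo (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    last_name VARCHAR(100) NOT NULL,
    first_name VARCHAR(100) NOT NULL,
    foo INT NOT NULL,
    KEY idx_last_first (last_name, first_name)
) ENGINE=InnoDB;

-- Every referenced column lives in idx_last_first, so the index can cover
-- the query. When it does, the Extra column of EXPLAIN contains "Using index".
EXPLAIN SELECT last_name FROM t_demo
WHERE first_name LIKE 'joh%' AND last_name LIKE 'smi%';

-- foo is not in the index, so matching entries need a lookup of the full row.
-- Extra should not report "Using index" here; it may report
-- "Using index condition" or "Using where" instead.
EXPLAIN SELECT foo, last_name FROM t_demo
WHERE first_name LIKE 'joh%' AND last_name LIKE 'smi%';
```

Run both against a table with representative data, since the optimizer may pick a different plan for an empty or tiny table.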




Answer 3:


I've confirmed that Rick James' answer above is correct. However, Drew and Rick James point out that depending on my SELECT I could use a covering index.

Regarding whether all key parts are used when using a wildcard, the MySQL docs say here:

For a BTREE index, an interval might be usable for conditions combined with AND, where each condition compares a key part with a constant value using =, <=>, IS NULL, >, <, >=, <=, !=, <>, BETWEEN, or LIKE 'pattern' (where 'pattern' does not start with a wildcard). An interval can be used as long as it is possible to determine a single key tuple containing all rows that match the condition (or two intervals if <> or != is used).

The optimizer attempts to use additional key parts to determine the interval as long as the comparison operator is =, <=>, or IS NULL. If the operator is >, <, >=, <=, !=, <>, BETWEEN, or LIKE, the optimizer uses it but considers no more key parts. For the following expression, the optimizer uses = from the first comparison. It also uses >= from the second comparison but considers no further key parts and does not use the third comparison for interval construction:

key_part1 = 'foo' AND key_part2 >= 10 AND key_part3 > 10

The single interval is:

('foo',10,-inf) < (key_part1,key_part2,key_part3) < ('foo',+inf,+inf)

It is possible that the created interval contains more rows than the initial condition. For example, the preceding interval includes the value ('foo', 11, 0), which does not satisfy the original condition.
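
One way to see how many key parts the optimizer used is the key_len column of EXPLAIN, which reports the length of the key MySQL decided to use. The sketch below is an illustration under assumptions: the table interval_demo and index idx_parts are hypothetical, and no expected numbers are given because they depend on version, character set and data.

```sql
-- Hypothetical table mirroring the key_part1..key_part3 names in the quote.
CREATE TABLE interval_demo (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    key_part1 VARCHAR(20) NOT NULL,
    key_part2 INT NOT NULL,
    key_part3 INT NOT NULL,
    KEY idx_parts (key_part1, key_part2, key_part3)
) ENGINE=InnoDB;

-- Equality on every key part: all three parts can define the lookup.
EXPLAIN SELECT id FROM interval_demo
WHERE key_part1 = 'foo' AND key_part2 = 10 AND key_part3 = 10;

-- The expression from the quoted passage: = on part 1, a range on part 2.
-- Compare key_len with the previous plan to see how many parts were used.
EXPLAIN SELECT id FROM interval_demo
WHERE key_part1 = 'foo' AND key_part2 >= 10 AND key_part3 > 10;
```

A smaller key_len in the second plan would indicate that fewer key parts went into building the interval.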

When using LIKE on a key part of a composite index, the key parts to the right are not used. This made me want to go with two separate secondary indexes for last_name and first_name. I would let MySQL judge which one has better cardinality and use it. But in the end, I went with a covering index of last_name,first_name,person_id because I was only going to do a SELECT person_id, and this acts as a covering key (in addition to searching the last_name range). In my tests this proved to be the fastest.
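
A rough sketch of what that final index could look like follows. The table name person, the index name idx_name_cover and the column sizes are assumptions, since the real schema is not shown in this thread.

```sql
-- Hypothetical schema; only the column names come from the text above.
CREATE TABLE person (
    person_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    last_name VARCHAR(100) NOT NULL,
    first_name VARCHAR(100) NOT NULL,
    middle_name VARCHAR(100) NOT NULL,
    -- If person_id is the InnoDB primary key it is already stored in every
    -- secondary index (see Answer 2), so listing it here is explicit
    -- rather than strictly required.
    KEY idx_name_cover (last_name, first_name, person_id)
) ENGINE=InnoDB;

-- Only person_id is selected, so the query can be answered from the index.
EXPLAIN SELECT person_id FROM person
WHERE last_name LIKE 'smi%' AND first_name LIKE 'joh%';
```

If the index covers the query, the Extra column of this EXPLAIN should include "Using index".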



Source: https://stackoverflow.com/questions/33978086/does-wildcard-in-left-most-column-of-composite-index-mean-remaining-columns-in-i
