Would having a Composite Index be beneficial for something like this:
SELECT * FROM a INNER JOIN b ON(a.id=b.id)
INNER JOIN c ON(a.bar=c.id)
When JOINing (that is, INNER JOIN, not LEFT JOIN), the Optimizer is free to pick any of the tables as the "first" table. It then uses the ON clause to move on to the next table, doing a "NLJ" (Nested Loop Join) to reach into that second table, and so on for each remaining table.
Usually, when there is a WHERE clause (or certain other things), that will determine which table is "first". Without that, the Optimizer will usually pick the smallest table.
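A quick way to see which table was actually picked as "first" is EXPLAIN, which lists the tables in the order MySQL would read them. This is only a sketch: the tables a, b, c are from the question, while the WHERE clause and the value 123 are made up for illustration.

```sql
-- Tables a, b, c come from the question.
-- The WHERE clause and the literal 123 are hypothetical additions.
EXPLAIN
SELECT *
FROM a
INNER JOIN b ON (a.id = b.id)
INNER JOIN c ON (a.bar = c.id)
WHERE c.id = 123;
-- The first row of the EXPLAIN output is the table the Optimizer starts with;
-- the following rows are reached via the ON clauses.
```

With a filter like this on c, it would be reasonable to expect c to come first, but the output is the only reliable answer.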
Given a WHERE clause on one of the tables, the Optimizer will look for the 'best' INDEX to use. Such an index is likely to have one or more columns involved in the WHERE clause. Two indexes will not (with rare exceptions, such as "index merge") be used at the same time for a single table.
When moving on to the 'next' table, the optimal index is probably one on the column(s) in the ON clause. Note that if you have ON a.x=b.x AND a.y=b.y, then the composite INDEX(x,y) (either order) is desirable.
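As a rough sketch of that composite-index case: the columns x and y are the placeholders from the ON clause above, and the index name idx_x_y is invented here. The index goes on whichever table is being reached into; since that may not be predictable, it can make sense to add it to both.

```sql
-- Hypothetical columns x, y and index name idx_x_y.
-- Useful when the Optimizer goes from a into b:
ALTER TABLE b ADD INDEX idx_x_y (x, y);

-- Useful when the Optimizer goes from b into a:
ALTER TABLE a ADD INDEX idx_x_y (x, y);

-- The join that these indexes are meant to serve:
SELECT *
FROM a
INNER JOIN b ON (a.x = b.x AND a.y = b.y);
```

Because both columns are tested with "=", (y, x) would serve this join equally well.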
Back to the case where nothing decides which table to "start" with: the Optimizer will try a variety of orders for the tables. Each ordering calls for a different set of indexes.
For this query:
FROM a
INNER JOIN b ON(a.id=b.id)
INNER JOIN c ON(a.bar=c.id)
INNER JOIN d ON(a.foo=d.id)
These are optimal, but we can't necessarily predict which will be used:
a: (id) -- if coming from b
a: (bar) -- if coming from c
a: (foo) -- if coming from d
b: (id) -- if coming from a
c: (id) -- if coming from a
d: (id) -- if coming from a
Note: You probably already have PRIMARY KEY(id) in each of the tables, so there is no need to add INDEX(id).
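Put together, and assuming id is already the PRIMARY KEY of all four tables, the only indexes left to add from the list above would be the two on a. The index names idx_bar and idx_foo are invented for this sketch.

```sql
-- Assumes PRIMARY KEY(id) already exists on a, b, c and d.
-- Index names idx_bar and idx_foo are hypothetical.
ALTER TABLE a
    ADD INDEX idx_bar (bar),  -- used if coming from c
    ADD INDEX idx_foo (foo);  -- used if coming from d

-- Check which order and which indexes the Optimizer actually chooses:
EXPLAIN
SELECT *
FROM a
INNER JOIN b ON (a.id = b.id)
INNER JOIN c ON (a.bar = c.id)
INNER JOIN d ON (a.foo = d.id);
```

The "key" column of the EXPLAIN output shows the index picked for each table.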
You said SELECT *, meaning you wanted all the columns from all four tables. If that is not really the case, we can discuss another optimization: "covering indexes".
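To give a flavor of what a covering index looks like: suppose, purely as an assumption, that the query only needs c.id and a.foo. An index on a that contains every column the query touches in a lets MySQL answer from the index alone, without reading the table rows. The index name idx_bar_foo is made up.

```sql
-- Hypothetical narrower query: only c.id and a.foo are needed.
-- INDEX(bar, foo) holds every column of a that this query uses,
-- so it is "covering" for table a.
ALTER TABLE a ADD INDEX idx_bar_foo (bar, foo);

EXPLAIN
SELECT c.id, a.foo
FROM c
INNER JOIN a ON (a.bar = c.id);
-- "Using index" in the Extra column for table a indicates
-- that the covering index is being used.
```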
More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql