问题
I wrote a custom help desk for work and it's been running great... until recently. One query has really slowed down. It takes about 14 seconds now! Here are the relevant tables:
CREATE TABLE `tickets` (
`id` int(11) unsigned NOT NULL DEFAULT '0',
`date_submitted` datetime DEFAULT NULL,
`date_closed` datetime DEFAULT NULL,
`first_name` varchar(50) DEFAULT NULL,
`last_name` varchar(50) DEFAULT NULL,
`email` varchar(50) DEFAULT NULL,
`description` text,
`agent_id` smallint(5) unsigned NOT NULL DEFAULT '1',
`status` smallint(5) unsigned NOT NULL DEFAULT '1',
`priority` tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `date_closed` (`date_closed`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `solutions` (
`id` int(10) unsigned NOT NULL,
`ticket_id` mediumint(8) unsigned DEFAULT NULL,
`date` datetime DEFAULT NULL,
`hours_spent` float DEFAULT NULL,
`agent_id` smallint(5) unsigned DEFAULT NULL,
`body` text,
PRIMARY KEY (`id`),
KEY `ticket_id` (`ticket_id`),
KEY `date` (`date`),
KEY `hours_spent` (`hours_spent`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
When a user submits a ticket, it goes into the "tickets" table. Then, as the agents work through the problem, they record the actions they took. Each entry goes into the "solutions" table. In other words, tickets have many solutions.
The goal of the query that has slowed down is to pull all the fields from the "tickets" table and also the latest entry from the "solutions" table. This is the query I've been using:
SELECT tickets.*,
(SELECT CONCAT_WS(" * ", DATE_FORMAT(solutions.date, "%c/%e/%y"), solutions.hours_spent, CONCAT_WS(": ", solutions.agent_id, solutions.body))
FROM solutions
WHERE solutions.ticket_id = tickets.id
ORDER BY solutions.date DESC, solutions.id DESC
LIMIT 1
) AS latest_solution_entry
FROM tickets
WHERE tickets.date_closed IS NULL
OR tickets.date_closed >= '2012-06-20 00:00:00'
ORDER BY tickets.id DESC
Here is an example of what the "latest_solution_entry" field looks like:
6/20/12 * 1337 * 1: I restarted the computer and that fixed the problem. Yes, I took an hour to do this.
In PHP, I split up the "latest_solution_entry" field and format it correctly.
When I noticed that the page that runs the query had slowed way down, I ran the query without the subquery and it was super fast. I then ran an EXPLAIN on the original query and got this:
+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+
| 1 | PRIMARY | tickets | index | date_closed | PRIMARY | 4 | NULL | 35804 | Using where |
| 2 | DEPENDENT SUBQUERY | solutions | ref | ticket_id | ticket_id | 4 | helpdesk.tickets.id | 1 | Using where; Using filesort |
+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+
So I'm looking for a way to make my query more efficient, yet still achieve the same goal. Any ideas?
回答1:
Let me sum up what I understood: you'd like to select each ticket and its last solution.
I like using the following pattern for this kind of question as it avoids the subquery pattern and is therefore rather good where performance is needed. The drawback is that it is a bit tricky to understand:
SELECT
t.*,
s1.*
FROM tickets t
INNER JOIN solutions s1 ON t.id = s1.ticket_id
LEFT JOIN solutions s2 ON s1.ticket_id = s2.ticket_id AND s2.id > s1.id
WHERE s2.id IS NULL;
I wrote only the heart of the pattern for a better understanding.
The keys are:
the LEFT JOIN of the
solutionstable with itself with thes1.ticket_id = s2.ticket_idcondition: it emulates theGROUP BY ticket_id.the condition
s2.id > s1.id: it is the SQL for "I only want the last solution", it emulates theMAX(). I assumed that in your model,the lastmeanswith the greatest idbut you could use here a condition on the date. Note thats2.id < s1.idwould give you the first solution.the WHERE clause
s2.id IS NULL: the weirdest one but absolutely necessary... keeps only the records you want.
Have a try and let me know :)
Edit 1: I just realised that the second point assumption was oversimplifying the problem. That makes it even more interesting :p I'm trying to see how this pattern may work with your date, id ordering.
Edit 2: Ok, it works great with a little twist. The condition on the LEFT JOIN becomes:
LEFT JOIN solutions s2 ON s1.ticket_id = s2.ticket_id
AND (s2.date > s1.date OR (s2.date = s1.date AND s2.id > s1.id))
回答2:
When you have a inline view in the SELECT clause it must perform that select for every single row. I find its better in cases like this to put an inline view in the FROM clause instead which will perform the select once.
SELECT t.*,
Concat_ws(" * ", Date_format(s.date, "%c/%e/%y"), s.hours_spent,
Concat_ws(":", s.agent_id, s.body))
FROM tickets t
INNER JOIN (SELECT solutions.ticket_id,
Max(solutions.date) maxdate
FROM solutions
GROUP BY solutions.ticket_id) last_solutions
ON t.id = last_solutions.ticket_id
INNER JOIN (SELECT solutions.ticket_id,
solutions.date,
Max(solutions.id) maxid
FROM solutions
GROUP BY solutions.ticket_id,
solutions.date) last_solution
ON last_solutions.ticket_id = last_solution.ticket_id
and last_solutions.maxDate = last_solution.Date
INNER JOIN solutions s
ON last_solution.maxid = s.id
WHERE t.date_closed IS NULL
OR t.date_closed >= '2012-06-20 00:00:00'
ORDER BY t.id DESC
Note: You may need to make it a LEFT join depending on your needs
回答3:
Try this:
SELECT *
FROM (
-- for each ticket get the most recent solution date
SELECT ticket_id, MAX(solutions.date) as date
FROM solutions
GROUP BY ticket_id
) t
JOIN tickets ON t.ticket_id = tickets.id
WHERE tickets.date_closed IS NULL OR tickets.date_closed >= '2012-06-20 00:00:00'
ORDER BY tickets.id DESC
Note, that if there is a ticket with 2 solutions of the same date you will have duplicate records in your result set. You will need another join to remove those duplicates or use an absolute sequence like a serial (incrementing primary key).
回答4:
depending on the purpose, I give a idea :
SELECT DISTINCT s1.ticket_id, t.*, s1.*
FROM tickets t
LEFT JOIN solutions s1 ON t.id = s1.ticket_id
来源:https://stackoverflow.com/questions/11131237/mysql-can-i-do-a-left-join-and-pull-only-one-row-from-the-join-table