问题
I'm trying to get the
- latest thread (id, topic, timestamp, author_id) and
- latest post (id, thread_id, timestamp, author_id)
- of each forum (id, name)
- ordered by the latest post, indepent from the thread's creationdate.
Why?
I'd like to be able to display details like:
"The latest Answer of forum $forum_id was given on Question $thread_id. Here it is: $post_id"
SELECT f.id AS forum_id,
f.name AS forum_name,
t.id AS thread_id,
t.topic AS thread_topic,
t.ts AS thread_timestamp,
p.id AS post_id,
p.content AS post_content,
p.ts AS post_timestamp
FROM forums f,
threads t,
posts p
WHERE f.id = t.forum_id
AND t.id = p.thread_id
GROUP BY f.id
ORDER BY p.ts
Any advices, how to change the SQL to get the wanted result as much performant as possible? I'm trying to avoid subqueries but I'm open-minded!
Thanks in advance!
回答1:
Since MySQL doesn't support window functions, I don't think there's any way to do this without a subquery:
SELECT f.id AS forum_id,
f.name AS forum_name,
t.id AS thread_id,
t.topic AS thread_topic,
t.ts AS thread_timestamp,
p.id AS post_id,
p.content AS post_content,
p.ts AS post_timestamp
FROM forums f
JOIN (SELECT t2.forum_id, max(p2.ts) as ts
FROM posts p2
JOIN threads t2 ON p2.thread_id = t2.id
GROUP BY t2.forum_id) max_p ON f.id = max_p.forum_id
JOIN posts p ON max_p.ts = p.ts
JOIN threads t ON f.id = t.forum_id AND p.thread_id = t.id
ORDER BY p.ts
Naturally, caching the latest results would let you do this without the performance penalty of calling MAX(), but with the right indices, this shouldn't be much of an issue...
UPDATE
The most concise way of including the threads without posts and forums without threads would be to use LEFT JOINs instead of an INNER JOINs:
SELECT f.id AS forum_id,
f.name AS forum_name,
t.id AS thread_id,
t.topic AS thread_topic,
t.ts AS thread_timestamp,
p.id AS post_id,
p.content AS post_content,
p.ts AS post_timestamp
FROM forums f
LEFT JOIN (SELECT t2.forum_id, max(COALESCE(p2.ts, t2.ts)) as ts, COUNT(p2.ts) as post_count
FROM threads t2
LEFT JOIN posts p2 ON p2.thread_id = t2.id
GROUP BY t2.forum_id) max_p ON f.id = max_p.forum_id
LEFT JOIN posts p ON max_p.ts = p.ts
LEFT JOIN threads t ON f.id = t.forum_id AND (max_p.post_count = 0 OR p.thread_id = t.id)
ORDER BY p.ts
回答2:
I can think of two "proper" ways of doing this. The first is using joins and subqueries:
SELECT f.id AS forum_id,
f.name AS forum_name,
t.id AS thread_id,
t.topic AS thread_topic,
t.ts AS thread_timestamp,
p.id AS post_id,
p.content AS post_content,
p.ts AS post_timestamp
FROM forums f join
threads t
on f.id = t.forum_id join
posts p
on t.id = p.thread_id
WHERE t.ts = (select ts from threads t2 where t2.forum_id = t.forum_id order by ts desc limit 1) and
p.ts = (select ts from posts p2 where p2.thread_id = p.thread_id order by ts desc limit 1)
GROUP BY f.id
ORDER BY max(p.ts)
The problem with this approach is that this returns the most recent thread and the most recent post on that thread. Fixing this is cumbersome (and that might be what you really want.)
The subqueries get the latest date on for threads
and posts
. Performance depends on the indexes that you have. It might be acceptable. This is standard SQL.
The other is a trick with substring_index()
/group_concat()
, which is specific to MySQL:
SELECT f.id AS forum_id,
f.name AS forum_name,
substring_index(group_concat(t.id order by t.ts desc separator '|'), '|', 1) AS thread_id,
substring_index(group_concat(t.topic order by t.ts desc separator '|'), '|', 1) AS thread_topic,
substring_index(group_concat(t.ts order by p.ts desc separator '|'), '|', 1) AS thread_timestamp,
substring_index(group_concat(p.id order by p.ts desc separator '|'), '|', 1) AS post_id,
substring_index(group_concat(p.content order by p.ts desc separator '|'), '|', 1) AS post_content,
substring_index(group_concat(p.ts order by p.ts desc separator '|'), '|', 1) AS post_timestamp
FROM forums f join
threads t
on f.id = t.forum_id join
posts p
on t.id = p.thread_id
GROUP BY f.id
ORDER BY max(p.ts);
This version might perform better (because you are already incurring the overhead of a group by
). The separator character has to be chosen so it is not in any of the values. Otherwise, only the portion before the separator will appear.
One advantage is that the threads and posts are treated independently, so you get the most recent thread and, separately, the most recent post. You can get the most recent post on a given thread by changing the order by
conditions in the group_concat()
.
Also, to get the ordering you want, you need to order by max(p.ts)
rather than just p.ts
. The latter would order by an arbitrary time stamp on the forum; there is no guarantee it would be the most recent one.
来源:https://stackoverflow.com/questions/17222233/sql-select-latest-thread-and-latest-post-grouped-by-forum-ordered-by-latest-p