SQL: Select latest thread and latest post, grouped by forum, ordered by latest post

Submitted by 二次信任 on 2019-12-07 14:18:34

Question


I'm trying to get the

  • latest thread (id, topic, timestamp, author_id) and
  • latest post (id, thread_id, timestamp, author_id)
  • of each forum (id, name)
  • ordered by the latest post, independent of the thread's creation date.

Why?

I'd like to be able to display details like:

"The latest Answer of forum $forum_id was given on Question $thread_id. Here it is: $post_id"

SELECT  f.id AS forum_id,
        f.name AS forum_name,
        t.id AS thread_id,
        t.topic AS thread_topic,
        t.ts AS thread_timestamp,
        p.id AS post_id,
        p.content AS post_content,
        p.ts AS post_timestamp

 FROM   forums f,
        threads t,
        posts p

WHERE   f.id = t.forum_id 
  AND   t.id = p.thread_id

GROUP BY f.id
ORDER BY p.ts

Any advice on how to change the SQL to get the desired result as efficiently as possible? I'm trying to avoid subqueries, but I'm open-minded!

Thanks in advance!


Answer 1:


Since MySQL versions before 8.0 don't support window functions, I don't think there's any way to do this there without a subquery:

SELECT  f.id AS forum_id,
    f.name AS forum_name,
    t.id AS thread_id,
    t.topic AS thread_topic,
    t.ts AS thread_timestamp,
    p.id AS post_id,
    p.content AS post_content,
    p.ts AS post_timestamp

FROM   forums f
JOIN (SELECT t2.forum_id, max(p2.ts) as ts
      FROM posts p2
      JOIN threads t2 ON p2.thread_id = t2.id
      GROUP BY t2.forum_id) max_p ON f.id = max_p.forum_id
JOIN   posts p ON max_p.ts = p.ts
JOIN   threads t ON f.id = t.forum_id AND p.thread_id = t.id
ORDER BY p.ts

Naturally, caching the latest results would let you do this without the performance penalty of calling MAX(), but with the right indices, this shouldn't be much of an issue...
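
The answer does not say which indices it has in mind. A minimal sketch, assuming the tables and columns used in the query above; the index names are invented for illustration, and whether each index helps depends on the data and the MySQL version:

```sql
-- Hypothetical index names; the columns are the ones the query joins and aggregates on.

-- Lets MySQL locate a forum's threads without scanning the whole threads table.
CREATE INDEX idx_threads_forum_ts ON threads (forum_id, ts);

-- Supports the join on thread_id and the MAX(ts) aggregation in the subquery.
CREATE INDEX idx_posts_thread_ts ON posts (thread_id, ts);

-- Supports the join from the per-forum maximum back to the matching post row.
CREATE INDEX idx_posts_ts ON posts (ts);
```

Running EXPLAIN on the query before and after creating them is the usual way to check whether the optimizer actually picks them up.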

UPDATE

The most concise way to include threads without posts and forums without threads would be to use LEFT JOINs instead of INNER JOINs:

SELECT  f.id AS forum_id,
    f.name AS forum_name,
    t.id AS thread_id,
    t.topic AS thread_topic,
    t.ts AS thread_timestamp,
    p.id AS post_id,
    p.content AS post_content,
    p.ts AS post_timestamp

FROM   forums f
LEFT JOIN (SELECT t2.forum_id, max(COALESCE(p2.ts, t2.ts)) as ts, COUNT(p2.ts) as post_count
      FROM threads t2 
      LEFT JOIN posts p2 ON p2.thread_id = t2.id
      GROUP BY t2.forum_id) max_p ON f.id = max_p.forum_id
LEFT JOIN   posts p ON max_p.ts = p.ts
LEFT JOIN   threads t ON f.id = t.forum_id AND (max_p.post_count = 0 OR p.thread_id = t.id)
ORDER BY p.ts
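
MySQL 8.0 added window functions, so on that version or later the latest post per forum can also be picked with ROW_NUMBER(). This is only a sketch under the same assumed schema as the queries above (forums, threads, posts with id and ts columns); the alias ranked and the column rn are made up here:

```sql
-- Requires MySQL 8.0 or later; ROW_NUMBER() is not available in 5.x.
SELECT  forum_id, forum_name,
        thread_id, thread_topic, thread_timestamp,
        post_id, post_content, post_timestamp
FROM (
    SELECT  f.id AS forum_id,
            f.name AS forum_name,
            t.id AS thread_id,
            t.topic AS thread_topic,
            t.ts AS thread_timestamp,
            p.id AS post_id,
            p.content AS post_content,
            p.ts AS post_timestamp,
            -- Number each forum's posts from newest to oldest;
            -- p.id breaks ties between posts with the same timestamp.
            ROW_NUMBER() OVER (PARTITION BY f.id
                               ORDER BY p.ts DESC, p.id DESC) AS rn
    FROM    forums f
    JOIN    threads t ON t.forum_id = f.id
    JOIN    posts p ON p.thread_id = t.id
) ranked
WHERE rn = 1                    -- keep only the newest post of each forum
ORDER BY post_timestamp DESC;   -- newest activity first; drop DESC to match the ordering used above
```

Unlike the MAX()-based join, this returns exactly one row per forum even when two posts share the same timestamp. Like the first query, it omits forums that have no posts.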



Answer 2:


I can think of two "proper" ways of doing this. The first is using joins and subqueries:

SELECT  f.id AS forum_id,
        f.name AS forum_name,
        t.id AS thread_id,
        t.topic AS thread_topic,
        t.ts AS thread_timestamp,
        p.id AS post_id,
        p.content AS post_content,
        p.ts AS post_timestamp
 FROM   forums f join
        threads t
        on f.id = t.forum_id join
        posts p
        on t.id = p.thread_id
WHERE   t.ts = (select ts from threads t2 where t2.forum_id = t.forum_id order by ts desc limit 1) and
        p.ts = (select ts from posts p2 where p2.thread_id = p.thread_id order by ts desc limit 1)
GROUP BY f.id
ORDER BY max(p.ts)

The problem with this approach is that it returns the most recent thread and the most recent post on that thread, rather than the forum's most recent post overall. Fixing this is cumbersome (and the current behavior might be what you really want).

The subqueries get the latest timestamp for threads and for posts. Performance depends on the indexes that you have. It might be acceptable. This is standard SQL.
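
One possible way to decouple the two lookups, as a rough sketch rather than a tuned solution: join threads twice, once for the forum's latest thread and once for the thread that owns the forum's latest post. The aliases pt, t2, t3, p2 and the extra column post_thread_id are introduced here for illustration, and the sketch assumes id is the primary key of each table:

```sql
-- t  = the forum's most recently created thread
-- pt = the thread that contains the forum's most recent post (may differ from t)
SELECT  f.id AS forum_id,
        f.name AS forum_name,
        t.id AS thread_id,
        t.topic AS thread_topic,
        t.ts AS thread_timestamp,
        p.id AS post_id,
        p.thread_id AS post_thread_id,
        p.content AS post_content,
        p.ts AS post_timestamp
 FROM   forums f
 JOIN   threads t  ON t.forum_id = f.id
 JOIN   threads pt ON pt.forum_id = f.id
 JOIN   posts p    ON p.thread_id = pt.id
        -- newest thread of this forum; id breaks timestamp ties
WHERE   t.id = (SELECT t2.id
                FROM threads t2
                WHERE t2.forum_id = f.id
                ORDER BY t2.ts DESC, t2.id DESC
                LIMIT 1)
        -- newest post anywhere in this forum
  AND   p.id = (SELECT p2.id
                FROM posts p2
                JOIN threads t3 ON t3.id = p2.thread_id
                WHERE t3.forum_id = f.id
                ORDER BY p2.ts DESC, p2.id DESC
                LIMIT 1)
ORDER BY p.ts DESC;
```

Because both subqueries select a single id, each forum yields at most one row and no GROUP BY is needed. Forums without any posts are still left out.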

The other is a trick with substring_index()/group_concat(), which is specific to MySQL:

SELECT  f.id AS forum_id,
        f.name AS forum_name,
        substring_index(group_concat(t.id order by t.ts desc separator '|'), '|', 1) AS thread_id,
        substring_index(group_concat(t.topic order by t.ts desc separator '|'), '|', 1)  AS thread_topic,
        substring_index(group_concat(t.ts order by t.ts desc separator '|'), '|', 1)  AS thread_timestamp,
        substring_index(group_concat(p.id order by p.ts desc separator '|'), '|', 1)  AS post_id,
        substring_index(group_concat(p.content order by p.ts desc separator '|'), '|', 1)  AS post_content,
        substring_index(group_concat(p.ts order by p.ts desc separator '|'), '|', 1)  AS post_timestamp
 FROM   forums f join
        threads t
        on f.id = t.forum_id join
        posts p
        on t.id = p.thread_id
GROUP BY f.id
ORDER BY max(p.ts);

This version might perform better (because you are already incurring the overhead of a group by). The separator character has to be chosen so it is not in any of the values. Otherwise, only the portion before the separator will appear.
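
The separator caveat can be reproduced on its own, without the forum tables. The inline rows below are invented purely for illustration:

```sql
-- The newest value ('a|b') itself contains the separator character,
-- so substring_index() cuts it short and returns only 'a'.
SELECT substring_index(group_concat(v order by ts desc separator '|'), '|', 1) AS latest
FROM (SELECT 'a|b' AS v, 2 AS ts
      UNION ALL
      SELECT 'c', 1) x;

-- A longer, less likely separator keeps the value intact and returns 'a|b'.
SELECT substring_index(group_concat(v order by ts desc separator '<|>'), '<|>', 1) AS latest
FROM (SELECT 'a|b' AS v, 2 AS ts
      UNION ALL
      SELECT 'c', 1) x;
```

A related limit to keep in mind: the result of group_concat() is truncated at group_concat_max_len (1024 bytes by default), so a long post_content value may come back cut off unless that variable is raised.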

One advantage is that the threads and posts are treated independently, so you get the most recent thread and, separately, the most recent post. You can get the most recent post on a given thread by changing the order by conditions in the group_concat().
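
As a sketch of that change, sorting the post columns by thread timestamp first and post timestamp second makes the first element the newest post of the newest thread. Only a few columns are shown, and the added id tie-breakers are an assumption, not part of the original query:

```sql
SELECT  f.id AS forum_id,
        -- newest thread of the forum (t.id breaks timestamp ties)
        substring_index(group_concat(t.id order by t.ts desc, t.id desc separator '|'), '|', 1) AS thread_id,
        -- newest post within that same thread: thread order first, then post order
        substring_index(group_concat(p.id order by t.ts desc, t.id desc, p.ts desc separator '|'), '|', 1) AS post_id,
        substring_index(group_concat(p.ts order by t.ts desc, t.id desc, p.ts desc separator '|'), '|', 1) AS post_timestamp
 FROM   forums f join
        threads t
        on f.id = t.forum_id join
        posts p
        on t.id = p.thread_id
GROUP BY f.id
ORDER BY max(p.ts);
```

Since the inner joins drop threads that have no posts, "newest thread" here means the newest thread with at least one post.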

Also, to get the ordering you want, you need to order by max(p.ts) rather than just p.ts. The latter would order by an arbitrary time stamp on the forum; there is no guarantee it would be the most recent one.



Source: https://stackoverflow.com/questions/17222233/sql-select-latest-thread-and-latest-post-grouped-by-forum-ordered-by-latest-p
