How to optimize a MySQL query (group and order)

再見小時候 2020-12-11 20:29

Hey all, I've got a query in need of optimizing. It works, but it's a dog, performance-wise.

It reads like this:

SELECT  *
FROM    (
        SELECT  *         


        
4 answers
  • 2020-12-11 20:32

    I would suggest a composite (multi-column) index on user_id, page. That assumes the inner query is the slow part.
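A minimal sketch of that suggestion, using Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL. The question's inner query is cut off, so the query below (filter on user_id, sort by page) is a hypothetical stand-in, and the table layout and the index name idx_views_user_page are assumptions as well. MySQL's own EXPLAIN output looks different; this only illustrates checking that the composite index gets picked up.

```python
import sqlite3

# Hypothetical index name; the original post does not name one.
INDEX_NAME = "idx_views_user_page"


def build_demo_db():
    """Create an in-memory table with an assumed layout and the composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE views ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " session TEXT NOT NULL,"
        " page INTEGER NOT NULL)"
    )
    # Composite (multi-column) index on user_id, page.
    conn.execute(f"CREATE INDEX {INDEX_NAME} ON views (user_id, page)")
    return conn


def plan_for_user_query(conn):
    """Return the query-plan detail strings for the assumed inner query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, session, page FROM views "
        "WHERE user_id = ? ORDER BY page DESC",
        (1,),
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]


if __name__ == "__main__":
    for step in plan_for_user_query(build_demo_db()):
        print(step)
```

On MySQL the equivalent check would be to create the index and inspect the output of EXPLAIN for the real inner query.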

  • 2020-12-11 20:37

    The problem is the subselect. SELECT * FROM (SELECT * FROM)

    You should be using a join. What data type is your 'page' field?

  • 2020-12-11 20:43

    I'm tracking views to different pages, and I want to know the highest page per session, in order to know how far they've clicked through (they're required to view every page all the way to the end) in any given session.

    Ordering before grouping is a highly unreliable way to do this.

    MySQL extends the GROUP BY syntax: you can use ungrouped and unaggregated columns in the SELECT and ORDER BY clauses. (As of MySQL 5.7.5 this requires disabling the ONLY_FULL_GROUP_BY SQL mode, which is enabled by default.)

    In this case, an arbitrary value of page is returned for each session.

    The documentation explicitly states that you should never make any assumptions about exactly which value it will be:

    Do not use this feature if the columns you omit from the GROUP BY part are not constant in the group. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.

    However, in practice, the values from the first row scanned are returned.

    Since you are using ORDER BY page DESC in your subquery, this row happens to be the one with the maximal page per session.

    You shouldn't rely on it: this behaviour is undocumented, and if some other row is returned in a future version, it will not be considered a bug.

    But you don't even have to do such nasty tricks.

    Just use aggregate functions:

    SELECT  MAX(page)
    FROM    views
    WHERE   user_id = '1'
    GROUP BY
            session
    

    This is a documented and clean way to do what you want.

    Create a composite index on (user_id, session, page) for the query to run faster.
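As a rough illustration of the aggregate approach, the sketch below runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, and the table layout, the sample rows, and the index name idx_views_user_session_page are assumptions made for the example, not details from the question. The session column is added to the select list so each maximum can be matched to its session.

```python
import sqlite3

# Sample rows (user_id, session, page); invented for the demo.
DEMO_ROWS = [
    (1, "a", 1),
    (1, "a", 3),
    (1, "a", 2),
    (1, "b", 1),
    (1, "b", 2),
    (2, "c", 5),
]


def build_demo_db():
    """Create an in-memory views table with an assumed layout and sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE views ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " session TEXT NOT NULL,"
        " page INTEGER NOT NULL)"
    )
    # Composite index matching the WHERE, GROUP BY and MAX() columns.
    conn.execute(
        "CREATE INDEX idx_views_user_session_page "
        "ON views (user_id, session, page)"
    )
    conn.executemany(
        "INSERT INTO views (user_id, session, page) VALUES (?, ?, ?)",
        DEMO_ROWS,
    )
    return conn


def max_page_per_session(conn, user_id):
    """Return (session, max_page) pairs for one user, sorted by session."""
    cur = conn.execute(
        """
        SELECT  session, MAX(page) AS max_page
        FROM    views
        WHERE   user_id = ?
        GROUP BY
                session
        ORDER BY
                session
        """,
        (user_id,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    print(max_page_per_session(build_demo_db(), 1))
```

Because every selected column is either grouped or aggregated, the result does not depend on which row the server happens to scan first.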

    If you need all columns from your table, not only the aggregated ones, use this syntax:

    SELECT  v.*
    FROM    (
            SELECT  DISTINCT user_id, session
            FROM    views
            ) vo
    JOIN    views v
    ON      v.id =
            (
            SELECT  id
            FROM    views vi
            WHERE   vi.user_id = vo.user_id
                    AND vi.session = vo.session
            ORDER BY
                    page DESC
            LIMIT 1
            )
    

    This assumes that id is a PRIMARY KEY on views.
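The full-row variant can be tried out the same way. This is only a sketch: it uses Python's sqlite3 module with an in-memory SQLite database in place of MySQL, the table layout and sample rows are invented, and the column references inside the correlated subquery are qualified with vi for clarity. An outer ORDER BY is added purely to make the demo output stable.

```python
import sqlite3

# Sample rows (id, user_id, session, page); invented for the demo.
DEMO_ROWS = [
    (1, 1, "a", 1),
    (2, 1, "a", 3),
    (3, 1, "a", 2),
    (4, 1, "b", 2),
    (5, 2, "c", 5),
    (6, 2, "c", 4),
]

# For each (user_id, session) pair, join back to the single row
# whose id is that of the highest page in the pair.
TOP_ROW_SQL = """
SELECT  v.id, v.user_id, v.session, v.page
FROM    (
        SELECT  DISTINCT user_id, session
        FROM    views
        ) vo
JOIN    views v
ON      v.id =
        (
        SELECT  vi.id
        FROM    views vi
        WHERE   vi.user_id = vo.user_id
                AND vi.session = vo.session
        ORDER BY
                vi.page DESC
        LIMIT 1
        )
ORDER BY
        v.user_id, v.session
"""


def build_demo_db():
    """Create an in-memory views table where id is the PRIMARY KEY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE views ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " session TEXT NOT NULL,"
        " page INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO views (id, user_id, session, page) VALUES (?, ?, ?, ?)",
        DEMO_ROWS,
    )
    return conn


def top_row_per_session(conn):
    """Return the complete row with the highest page for every session."""
    return conn.execute(TOP_ROW_SQL).fetchall()


if __name__ == "__main__":
    for row in top_row_per_session(build_demo_db()):
        print(row)
```

Each session contributes exactly one row, and all of its columns come from the same underlying record.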

  • 2020-12-11 20:50

    I think your subquery is unnecessary. You would receive the same results from this much simpler (and faster) query:

    SELECT *
    FROM views 
    WHERE user_id = '1' 
    GROUP BY session
    ORDER BY page DESC
    

    Also, you should have an index on every field you're either grouping, ordering or "where-ing" by. In this case, you need an index on user_id, session and page.
