My query takes too long to find, for each group, the pair of rows with the maximum difference in a column value

独自空忆成欢 submitted on 2019-12-11 05:37:22

Question


Say I have a table like this:

I want to find the pair of Centers whose Performance difference is highest for each session, like this:

I have the following query,

select 
    t1.session,
    t1.center center1,
    t2.center center2,
    t1.performance - t2.performance performance
from mytable t1
inner join mytable t2 on t1.session = t2.session
where t1.performance - t2.performance = (
    select max(t11.performance - t22.performance)
    from mytable t11
    inner join mytable t22 on t11.session = t22.session
    where t11.session = t1.session
)

It works, but it takes a long time: a few minutes for a table with 20 columns and 200 rows. How can I modify the query to get the same output faster?


Answer 1:


select 
        t1.session,
        t1.center center1,
        t2.center center2,
        t1.performance - t2.performance performance
    from mytable t1
    inner join mytable t2 
       on t1.session = t2.session
    WHERE t1.performance = (SELECT MAX(performance) 
                            FROM mytable t3 WHERE t3.session = t1.session)
      AND t2.performance = (SELECT MIN(performance) 
                            FROM mytable t3 WHERE t3.session = t2.session)

     -- I think this handles the edge case where performance is tied,
     -- in which a difference of 0 would otherwise return 2 rows

     AND (CASE WHEN t1.performance = t2.performance 
               THEN CASE WHEN t1.center < t2.center
                         THEN 1
                         ELSE 0
                    END
               ELSE 1
          END) = 1

As long as you have an index on session and performance, this should be fine.
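To make the index advice concrete, here is a minimal sketch that runs this answer's query against an in-memory SQLite database. Everything outside the query is an assumption: the post does not name the database engine, the sample rows are invented because the original table is not shown, and the index name `idx_session_performance` is hypothetical. The tie-break is written as a plain boolean condition that is equivalent to the CASE expression above when no values are NULL.

```python
import sqlite3

# Hypothetical sample data; the table from the question is not shown in the post.
rows = [
    (1, "A", 10), (1, "B", 25), (1, "C", 40),
    (2, "A", 30), (2, "B", 30),            # tied session: max equals min
    (3, "A", 5), (3, "B", 50), (3, "C", 20),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (session INTEGER, center TEXT, performance INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

# Composite index (hypothetical name): session first for the equality lookup,
# then performance so MIN/MAX per session can be read from the index.
conn.execute(
    "CREATE INDEX idx_session_performance ON mytable (session, performance)"
)

query = """
SELECT t1.session,
       t1.center AS center1,
       t2.center AS center2,
       t1.performance - t2.performance AS performance
FROM mytable t1
JOIN mytable t2 ON t1.session = t2.session
WHERE t1.performance = (SELECT MAX(performance)
                        FROM mytable t3 WHERE t3.session = t1.session)
  AND t2.performance = (SELECT MIN(performance)
                        FROM mytable t3 WHERE t3.session = t2.session)
  -- Same tie-break as the CASE expression: on a tie, keep one ordering only.
  AND (t1.performance <> t2.performance OR t1.center < t2.center)
ORDER BY t1.session
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On other engines the `CREATE INDEX` syntax is broadly the same, but whether the optimizer actually uses the index should be checked with that engine's own plan output.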




Answer 2:


Use row_number():

select session, center1, center2, performance
from (select t1.session, t1.center as center1, t2.center as center2,
             (t1.performance - t2.performance) as performance,
             row_number() over (partition by t1.session order by (t1.performance - t2.performance) desc) as seqnum
      from mytable t1 join
           mytable t2
           on t1.session = t2.session
     ) t
where seqnum = 1;

Or, for better performance, note that the maximum difference is simply the maximum minus the minimum. Since you also want the centers, here is a method that avoids the self-join:

select session,
       max(case when seqnum_desc = 1 then center end) as center1,
       max(case when seqnum_asc = 1 then center end) as center2,
       max(performance) - min(performance) as performance
from (select t.*,
             row_number() over (partition by session order by performance) as seqnum_asc,
             row_number() over (partition by session order by performance desc) as seqnum_desc
      from mytable t
     ) t
where 1 in (seqnum_asc, seqnum_desc)
group by session;
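The claim that the largest pairwise difference equals the maximum minus the minimum can be checked directly. The sketch below compares the two computations per session on an in-memory SQLite database. Using SQLite is an assumption, since the post does not name the engine, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; the table from the question is not shown in the post.
rows = [
    (1, "A", 10), (1, "B", 25), (1, "C", 40),
    (2, "A", 30), (2, "B", 30),
    (3, "A", 5), (3, "B", 50), (3, "C", 20),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (session INTEGER, center TEXT, performance INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

# Brute force: self-join every pair of rows within a session, take the largest gap.
pairwise = dict(conn.execute("""
    SELECT t1.session, MAX(t1.performance - t2.performance)
    FROM mytable t1
    JOIN mytable t2 ON t1.session = t2.session
    GROUP BY t1.session
""").fetchall())

# Single pass: the largest gap is just MAX - MIN within the session.
aggregate = dict(conn.execute("""
    SELECT session, MAX(performance) - MIN(performance)
    FROM mytable
    GROUP BY session
""").fetchall())

print(pairwise == aggregate)  # prints True
```

The self-join touches every pair of rows in a session, while the aggregate reads each row once, which is why the second form scales better.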



Answer 3:


Grouping by session and taking the group's min and max performance seems logical. The actual centers unfortunately need a subquery/join here.

select g.session as Session,
    (select min(center) from mytable
     where session = g.session and performance = g.maxim) as Center1,
    (select min(center) from mytable
     where session = g.session and performance = g.minim) as Center2,
    g.maxim - g.minim as Performance
from (select 
        t1.session,
        min(t1.performance) as minim,
        max(t1.performance) as maxim
    from mytable t1
    group by t1.session)
    as g

Ensure an index on session and performance.




Answer 4:


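-- Rank the pairs within each session by absolute difference and keep the top one
-- (a plain "limit 1" would return a single row for the whole table).
select session, center1, center2, performance_diff
from (
select t1.session, t1.center as center1, t2.center as center2,
abs(t1.performance - t2.performance) as performance_diff,
row_number() over (partition by t1.session order by abs(t1.performance - t2.performance) desc) as rn
from mytable t1, mytable t2
where t1.session = t2.session and t1.center != t2.center) as T1
where rn = 1 order by session;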


Source: https://stackoverflow.com/questions/58879412/my-query-is-taking-too-long-to-finish-for-finding-the-pair-of-rows-where-the-dif
