Optimal Oracle SQL Query to complete group-by on multiple columns in single table containing ~ 7,000,000 records

百般思念 提交于 2019-12-11 17:35:50

问题


I am a SQL Novice in need of some advice. What is the most efficient (fastest running query) way to do the following-

Select all columns from a table after-

-Performing a "Group By" based on the unique values contained in two columns: "top_line_id" and "external_reference".

-Selecting a single record from each group based on the max or min value (doesn't matter which one) contained in a different field such as support_id.

Someone on my team provided the below query, but I can't seem to get it working. I receive an error message stating "invalid relational operator" when I attempt to execute it.

Select * 
from STAGE.SFS_GH_R3_IB_ENTLMNT_CONTACTS
Where support_id, external_reference, top_line_id in (
         select max(support_id), 
                external_reference, 
                top_line_id from STAGE.SFS_GH_R3_IB_ENTLMNT_CONTACTS
)

One more thing - the columns on which we are performing the Group By contain null values in some records. We would like those excluded from the query.

Any assistance you can provide is very much appreciated.


回答1:


Although you phrase this as a group by query, there is another approach using row_number(). This enumerates each row in the group, based on the "order by" clause. In the following query, it enumerates each group based on external_reference and top_line_id, ordered by support_id:

select *
from (Select t.*,
             row_number() over (partition by external_reference, top_line_id
                                order by support_id) as seqnum
      from STAGE.SFS_GH_R3_IB_ENTLMNT_CONTACTS t
     )
where seqnum = 1



回答2:


This should work(can't test it)


SELECT
  *
FROM
  stage.sfs_gh_r3_ib_entlmnt_contacts
WHERE
  (support_id, external_reference, top_line_id) IN
    (
      SELECT
        max(support_id), 
        external_reference, 
        top_line_id
      FROM
        stage.sfs_gh_r3_ib_entlmnt_contacts
      WHERE
        external_reference IS NOT NULL AND
        top_line_id IS NOT NULL
      GROUP BY
        top_line_id, external_reference
    )


来源:https://stackoverflow.com/questions/10670766/optimal-oracle-sql-query-to-complete-group-by-on-multiple-columns-in-single-tabl

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!