How to prevent duplicates with inner join query (Postgres)

Submitted by 一世执手 on 2019-12-09 11:33:00

Question


I am trying to understand how to create a query to filter out some results based on an inner join.

Consider the following data:

formulation_batch
-----
id  project_id  name    
1   1           F1.1
2   1           F1.2
3   1           F1.3
4   1           F1.all

formulation_batch_component
-----
id  formulation_batch_id    component_id
1   1                       1
2   2                       2
3   3                       3
4   4                       1
5   4                       2
6   4                       3
7   4                       4

I would like to select all formulation_batch records that have a project_id of 1 and at least one formulation_batch_component with a component_id of 1 or 2. So I run the following query:

SELECT "formulation_batch".* 
FROM "formulation_batch" 
INNER JOIN "formulation_batch_component" 
ON "formulation_batch"."id" = "formulation_batch_component"."formulationBatch_id" 
WHERE "formulation_batch"."project_id" = 1 
    AND (("formulation_batch_component"."component_id" = 2 
        OR "formulation_batch_component"."component_id" = 1 ))

However, this returns a duplicate entry:

1;"F1.1"
2;"F1.2"
4;"F1.all"
4;"F1.all"
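
To see where the duplicate comes from, it can help to select the matching component alongside each batch. The sketch below is self-contained: it rebuilds the sample data as CTEs and assumes the join column is named formulation_batch_id, as in the table listing above (the query in the question spells it "formulationBatch_id"). The aliases fb and fbc are only illustrative.

```sql
-- Sample data from the question, inlined as CTEs so the query runs on its own.
-- Assumption: the join column is named formulation_batch_id, as in the listing.
WITH formulation_batch (id, project_id, name) AS (
    VALUES (1, 1, 'F1.1'), (2, 1, 'F1.2'), (3, 1, 'F1.3'), (4, 1, 'F1.all')
),
formulation_batch_component (id, formulation_batch_id, component_id) AS (
    VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3),
           (4, 4, 1), (5, 4, 2), (6, 4, 3), (7, 4, 4)
)
-- Same join and filter as the question, but also showing which component matched.
SELECT fb.id, fb.name, fbc.component_id
FROM formulation_batch fb
INNER JOIN formulation_batch_component fbc
    ON fb.id = fbc.formulation_batch_id
WHERE fb.project_id = 1
  AND fbc.component_id IN (1, 2)
ORDER BY fb.id, fbc.component_id;
-- Rows returned:
--   1 | F1.1   | 1
--   2 | F1.2   | 2
--   4 | F1.all | 1
--   4 | F1.all | 2
```

Batch 4 has both component 1 and component 2, so the join produces one output row per matching component. Selecting only "formulation_batch".* hides the column that differs, which is why the two rows look identical.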

Is there a way to modify this query so that I only get back the unique formulation_batch records which match the criteria?

For example:

1;"F1.1"
2;"F1.2"
4;"F1.all"

Thanks for your time!


Answer 1:


One way would be to use distinct:

SELECT distinct "formulation_batch".* 
FROM "formulation_batch" 
INNER JOIN "formulation_batch_component" 
ON "formulation_batch"."id" = "formulation_batch_component"."formulationBatch_id" 
WHERE "formulation_batch"."project_id" = 1 
    AND (("formulation_batch_component"."component_id" = 2 
        OR "formulation_batch_component"."component_id" = 1 ))



Answer 2:


In this case it is possible to apply the distinct before the join, which may make the query more performant:

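select fb.* 
from
    formulation_batch fb
    inner join
    (
        select distinct "formulationBatch_id"
        from formulation_batch_component
        where component_id in (1, 2)
    ) fbc on fb.id = fbc."formulationBatch_id" 
where fb.project_id = 1

Notice the aliases for the table names, which make the query clearer. The in operator is also very handy here. Double quotes are not necessary for all-lowercase identifiers such as formulation_batch. A mixed-case column such as "formulationBatch_id", as spelled in the question's query, still needs them, because PostgreSQL folds unquoted identifiers to lower case.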




Answer 3:


I know the question asks how to prevent duplicates with an inner join, but you could instead use an IN clause in the predicate, which avoids the join and therefore the duplicates:

SELECT fb.* 
FROM "formulation_batch" fb
WHERE fb."project_id" = 1 
 AND fb."id" IN (SELECT fbc."formulationBatch_id"
                 FROM "formulation_batch_component" fbc
                 WHERE (fbc."component_id" = 2 
                        OR fbc."component_id" = 1))
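
A closely related variant, not taken from the answers above, is to write the same test with EXISTS. Like IN, it only asks whether a matching component row exists, so each batch appears at most once. This is a minimal sketch with the sample data inlined as CTEs; it assumes the join column is named formulation_batch_id as in the table listing, so adjust it to "formulationBatch_id" if that is the real column name.

```sql
-- Sample data from the question, inlined as CTEs so the query runs on its own.
-- Assumption: the join column is named formulation_batch_id, as in the listing.
WITH formulation_batch (id, project_id, name) AS (
    VALUES (1, 1, 'F1.1'), (2, 1, 'F1.2'), (3, 1, 'F1.3'), (4, 1, 'F1.all')
),
formulation_batch_component (id, formulation_batch_id, component_id) AS (
    VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3),
           (4, 4, 1), (5, 4, 2), (6, 4, 3), (7, 4, 4)
)
SELECT fb.*
FROM formulation_batch fb
WHERE fb.project_id = 1
  -- Keep a batch if at least one of its components is 1 or 2;
  -- the subquery never multiplies the outer rows.
  AND EXISTS (
      SELECT 1
      FROM formulation_batch_component fbc
      WHERE fbc.formulation_batch_id = fb.id
        AND fbc.component_id IN (1, 2)
  )
ORDER BY fb.id;
-- Rows returned:
--   1 | 1 | F1.1
--   2 | 1 | F1.2
--   4 | 1 | F1.all
```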


Source: https://stackoverflow.com/questions/17959279/how-to-prevent-duplicates-with-inner-join-query-postgres
