Question
I’m using Postgres 9.5. The query below is designed to find rows in my table that hold identical data but have different IDs.
select e.name,
e.day,
e.distance,
e.created_at,
e2.created_at
from events e,
events e2
where e.name = e2.name
and e.distance = e2.distance
and e.day = e2.day
and e.web_crawler_id = e2.web_crawler_id
and e.id <> e2.id
and e.web_crawler_id = 1
order by e.day desc;
I ultimately want to delete one of the duplicate rows, perhaps the one with the greatest “created_at” date. But I’m unsure how to write a query that returns only one of the two identical rows. How do I do that?
Answer 1:
There are many ways, but without changing your SQL much you can compare the ids with > instead of <>, so that each pair of duplicates appears once rather than twice:
select e.name, e.day, e.distance, e.created_at, e2.created_at
from events e, events e2
where e.name = e2.name
and e.distance = e2.distance
and e.day = e2.day
and e.web_crawler_id = e2.web_crawler_id
and e.id > e2.id
and e.web_crawler_id = 1;
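Since the end goal is to delete the newer duplicate, the same self-join can be turned into a DELETE with a USING clause. This is only a sketch, not tested against your table: it assumes created_at is never NULL, and it uses id as a tie-breaker when two duplicates share the same created_at. Run it inside a transaction and check the row count before committing.

```sql
BEGIN;

-- Delete every row that has an older duplicate, so the oldest row of each
-- group of duplicates survives.
-- Assumption: created_at is NOT NULL; id breaks ties between equal timestamps.
DELETE FROM events e
USING events e2
WHERE e.name = e2.name
  AND e.distance = e2.distance
  AND e.day = e2.day
  AND e.web_crawler_id = e2.web_crawler_id
  AND e.web_crawler_id = 1
  AND (e.created_at, e.id) > (e2.created_at, e2.id);

-- Inspect the reported row count, then COMMIT; or ROLLBACK; as appropriate.
```

Because a row is removed as soon as any older duplicate exists, this also handles groups of three or more identical rows, leaving exactly one per group.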
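Answer 2:
You can use LIMIT, but note that LIMIT 1 returns a single row from the entire result set, not one row per pair of duplicates: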
select e.name, e.day, e.distance, e.created_at, e2.created_at
from events e, events e2
where e.name = e2.name
and e.distance = e2.distance
and e.day = e2.day
and e.web_crawler_id = e2.web_crawler_id
and e.id <> e2.id
and e.web_crawler_id = 1
order by e.day desc
LIMIT 1;
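If the aim is one row per group of duplicates rather than one row overall, a window function is another option (window functions are available in Postgres 9.5). The sketch below numbers the rows within each group of identical name, distance, day and web_crawler_id, oldest first, and returns everything except the first row of each group, i.e. the rows you would delete. The aliases ranked and rn are just illustrative names.

```sql
-- List the surplus rows: every row except the oldest in each duplicate group.
SELECT id, name, day, distance, created_at
FROM (
    SELECT e.*,
           ROW_NUMBER() OVER (
               PARTITION BY name, distance, day, web_crawler_id
               ORDER BY created_at, id   -- oldest row gets rn = 1
           ) AS rn
    FROM events e
    WHERE web_crawler_id = 1
) ranked
WHERE rn > 1
ORDER BY day DESC;
```

One difference from the self-join: PARTITION BY puts NULL values in the same group, whereas the = comparisons in the join never match NULLs, so the two approaches can disagree if any of these columns are nullable.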
Source: https://stackoverflow.com/questions/40431252/how-do-i-delete-one-of-my-two-duplicate-rows-of-data-in-postgres