Removing duplicates from multiple self left joins

久未见 submitted on 2019-12-04 11:54:21

Rather than joining on !=, I'd suggest joining on a strict inequality such as >.

You will then have all combinations with t1.id > t2.id, t2.id > t3.id, and so on.

The rows can no longer be 'duplicates', because each one is an ordered set: any two sets with the same members necessarily produce the identical ordered set.
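To make the difference concrete, here is a minimal sketch that counts triples both ways. It assumes a hypothetical four-row `rules` table with only an `id` column, uses an in-memory SQLite database as a stand-in for the real one, and uses inner joins to keep the counting simple; none of that comes from the original question.

```python
import sqlite3

# Hypothetical stand-in for the `rules` table: only an integer `id` column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rules (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO rules (id) VALUES (?)", [(i,) for i in range(1, 5)])

# Joining on != yields every ordering of each triple (permutations).
permutation_count = conn.execute(
    """
    SELECT count(*)
    FROM rules AS t1
    JOIN rules AS t2 ON t1.id != t2.id
    JOIN rules AS t3 ON t1.id != t3.id AND t2.id != t3.id
    """
).fetchone()[0]

# Joining on a strict inequality keeps one ordering per triple (combinations).
combination_count = conn.execute(
    """
    SELECT count(*)
    FROM rules AS t1
    JOIN rules AS t2 ON t1.id > t2.id
    JOIN rules AS t3 ON t2.id > t3.id
    """
).fetchone()[0]

print(permutation_count, combination_count)  # 24 4
```

With four rows there are 4 × 3 × 2 = 24 ordered triples but only 4 unordered ones, so the strict inequality removes the sixfold duplication without any DISTINCT.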

I think you mean you want to go from a permutation of rows to a combination of rows?

If so, the SELECT DISTINCT answers are wrong: SELECT DISTINCT will only select distinct permutations. I think you already have a pretty good way of doing it. The only other thing I can think of would be to concatenate the rules into a string and then sort it in place. It looks like you are using PostgreSQL, and none of its built-in string functions does that.

If the number of symbols were small, you might be able to insert them into the array pre-sorted, by inserting 'A' at index 1, 'B' at index 2, etc., which might make the sort quicker ...
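The sort-then-compare idea can be sketched outside the database. The snippet below is only an illustration in Python, not a PostgreSQL solution: the symbols 'A', 'B', 'C' and the helper name `canonical_key` are hypothetical.

```python
from itertools import permutations


def canonical_key(symbols):
    """Return the symbols in sorted order, joined into one string."""
    return ",".join(sorted(symbols))


# Every ordering of three hypothetical rule symbols: 6 permutations.
rows = list(permutations(["A", "B", "C"]))

# Sorting each row collapses all of its orderings onto the same key,
# so a set of keys holds one entry per combination.
unique_keys = {canonical_key(row) for row in rows}

print(len(rows), unique_keys)  # 6 {'A,B,C'}
```

Because the key is built from the sorted members, A-B-C and C-B-A compare equal, which is exactly what a plain DISTINCT over the unsorted columns cannot do.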

You need to impose an order on your results in order to filter all the duplicates out. This can be achieved by making sure that a<b<c. Once your results are ordered, you can apply a DISTINCT to the result set.

` SELECT count(*) FROM rules AS t1
LEFT JOIN rules AS t2 ON t1.id != t2.id AND ...
LEFT JOIN rules AS t3 ON t1.id != t2.id AND t1.id != t3.id AND t2.id != t3.id ...
  AND t1.id < t2.id AND t2.id < t3.id ...
  AND ...`

It's difficult to understand exactly what you're trying to achieve, but to avoid the A-B-C / C-B-A duplication, try this:

SELECT count(*)
FROM rules AS t1
LEFT JOIN rules AS t2
 ON t1.id < t2.id
 AND ...
LEFT JOIN rules AS t3
 ON t1.id < t2.id AND t1.id < t3.id AND t2.id < t3.id
 AND ...

That way, the answers are always ordered.
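As a rough check of what the ordered LEFT JOIN returns, the sketch below lists the rows instead of counting them. It assumes a hypothetical three-row `rules` table in an in-memory SQLite database and leaves out the elided `AND ...` conditions, so it shows only the effect of the id ordering.

```python
import sqlite3

# Hypothetical three-row stand-in for the `rules` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rules (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO rules (id) VALUES (?)", [(1,), (2,), (3,)])

rows = conn.execute(
    """
    SELECT t1.id, t2.id, t3.id
    FROM rules AS t1
    LEFT JOIN rules AS t2 ON t1.id < t2.id
    LEFT JOIN rules AS t3 ON t2.id < t3.id
    ORDER BY t1.id, t2.id, t3.id
    """
).fetchall()

# LEFT JOIN keeps rows that found no larger partner, padded with NULL (None).
for row in rows:
    print(row)

# Only fully matched rows are complete triples.
complete = [row for row in rows if None not in row]
print(complete)  # [(1, 2, 3)]
```

Each triple appears once, in ascending id order. Note that, because these are LEFT JOINs, `count(*)` would also include the NULL-padded rows (four rows in total here), which may or may not be what the original query intends.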
