I have a cities table which looks like this.
| id | Name     |
|----|----------|
| 1  | Paris    |
| 2  | London   |
| 3  | New York |
I have a tags table which looks li
This query uses no fancy functions and no subqueries, only joins, so it stays fast as long as the join columns are indexed. Make sure cities.id, cities_tags.id, cities_tags.city_id and cities_tags.tag_id each have an index.
The query returns one row per pair of cities with three columns: city1, city2, and the number of tags that city1 and city2 have in common.
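
As a rough sketch of the indexes mentioned above: this assumes cities_tags has the columns id, city_id and tag_id, and that id is already the primary key of both tables (most databases index a primary key automatically). The index names are made up for illustration.

```sql
-- Assumption: cities.id and cities_tags.id are primary keys and therefore
-- already indexed, so only the two join columns need explicit indexes.
-- The index names below are hypothetical.
create index idx_cities_tags_city_id on cities_tags (city_id);
create index idx_cities_tags_tag_id on cities_tags (tag_id);
```
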
    select
        c1.name as city1
        ,c2.name as city2
        ,count(ct2.tag_id) as match_count
    from
        cities as c1
        inner join cities as c2 on
            c1.id != c2.id -- change != into > if you don't want duplicates
        left join cities_tags as ct1 on -- use inner join to filter cities with no match
            ct1.city_id = c1.id
        left join cities_tags as ct2 on -- use inner join to filter cities with no match
            ct2.city_id = c2.id
            and ct1.tag_id = ct2.tag_id
    group by
        c1.id
        ,c2.id
    order by
        c1.id
        ,match_count desc
        ,c2.id
Change != into > so that each pair of cities is returned only once, instead of once as (city1, city2) and again as (city2, city1).
Change both left joins into inner joins if you don't want to see city pairs that have no tags in common.
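
To see both variations together, here is a small self-contained sketch. The city names come from the question, but the tag assignments are invented sample data, and the column types are an assumption about the schema.

```sql
-- Hypothetical sample schema; only the city names come from the question.
create table cities (
    id   integer primary key,
    name varchar(100) not null
);

create table cities_tags (
    id      integer primary key,
    city_id integer not null,
    tag_id  integer not null
);

insert into cities (id, name) values (1, 'Paris'), (2, 'London'), (3, 'New York');

-- Made-up tag ids: Paris has tags 1 and 2, London has 1, 2 and 3, New York has 3.
insert into cities_tags (id, city_id, tag_id) values
    (1, 1, 1), (2, 1, 2),
    (3, 2, 1), (4, 2, 2), (5, 2, 3),
    (6, 3, 3);

-- Variant with > (each pair listed once) and inner joins
-- (pairs without a common tag are dropped).
select
    c1.name as city1
    ,c2.name as city2
    ,count(ct2.tag_id) as match_count
from
    cities as c1
    inner join cities as c2 on
        c1.id > c2.id
    inner join cities_tags as ct1 on
        ct1.city_id = c1.id
    inner join cities_tags as ct2 on
        ct2.city_id = c2.id
        and ct1.tag_id = ct2.tag_id
group by
    c1.id
    ,c2.id
order by
    c1.id
    ,match_count desc
    ,c2.id;

-- Expected result for this sample data:
-- city1    | city2  | match_count
-- London   | Paris  | 2
-- New York | London | 1
```

Note that the count is only correct if cities_tags holds each (city_id, tag_id) combination at most once. A duplicate row would be counted twice, so a unique constraint on that pair may be worth adding.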