I would like to query a relational database if a set of items exists.
The data I am modeling are of the following form:
key1 = [ item1, item3, item5
I won't comment on whether there is a better suited schema for doing this (it's quite possible), but for a schema having columns name and item, the following query should work. (mysql syntax)
SELECT k.name
FROM (SELECT DISTINCT name FROM sets) AS k
INNER JOIN sets i1 ON (k.name = i1.name AND i1.item = 1)
INNER JOIN sets i2 ON (k.name = i2.name AND i2.item = 3)
INNER JOIN sets i3 ON (k.name = i3.name AND i3.item = 5)
LEFT JOIN sets ix ON (k.name = ix.name AND ix.item NOT IN (1, 3, 5))
WHERE ix.name IS NULL;
The idea is that we have all the set keys in k, which we then join with the set item data in sets once for each set item in the set we are searching for, three in this case. Each of the three inner joins with table aliases i1, i2 and i3 filter out all set names that don't contain the item searched for with that join. Finally, we have a left join with sets with table alias ix, which brings in all the extra items in the set, that is, every item we were not searching for. ix.name is NULL in the case that no extra items are found, which is exactly what we want, thus the WHERE clause. The query returns a row containing the set key if the set is found, no rows otherwise.
Edit: The idea behind collapsars answer seems to be much better than mine, so here's a bit shorter version of that with explanation.
SELECT sets.name
FROM sets
LEFT JOIN (
SELECT DISTINCT name
FROM sets
WHERE item NOT IN (1, 3, 5)
) s1
ON (sets.name = s1.name)
WHERE s1.name IS NULL
GROUP BY sets.name
HAVING COUNT(sets.item) = 3;
The idea here is that subquery s1 selects the keys of all sets that contain items other that the ones we are looking for. Thus, when we left join sets with s1, s1.name is NULL when the set only contains items we are searching for. We then group by set key and filter out any sets having the wrong number of items. We are then left with only sets which contain only items we are searching for and are of the correct length. Since sets can only contain an item once, there can only be one set satisfying that criteria, and that's the one we're looking for.
Edit: It just dawned on me how to do this without the exclusion.
SELECT totals.name
FROM (
SELECT name, COUNT(*) count
FROM sets
GROUP BY name
) totals
INNER JOIN (
SELECT name, COUNT(*) count
FROM sets
WHERE item IN (1, 3, 5)
GROUP BY name
) matches
ON (totals.name = matches.name)
WHERE totals.count = 3 AND matches.count = 3;
The first subquery finds the total count of items in each set and the second one finds out the count of matching items in each set. When matches.count is 3, the set has all the items we're looking for, and if totals.count is also 3, the set doesn't have any extra items.