问题
I have a post/tag database, with the usual post, tag, and tag_post tables. The tag_post table contains tagid and postid fields.
I need to query posts. When I want to fetch posts that have a certain tag, I have to use a join:
... INNER JOIN tag_post ON post.id = tag_post.postid
WHERE tag_post.tagid = {required_tagid}`
When I want to fetch posts that have tagIdA and tagIdB, I have to use two joins (which I kind of came to terms with eventually).
Now, I need to query posts that do not have a certain tag. Without much thought, I just changed the = to !=:
... INNER JOIN tag_post ON post.id = tag_post.postid
WHERE tag_post.tagid != {certain_tagid}`
Boom! Wrong logic!
I did come up with this - just writing the logic here:
... INNER JOIN tag_post ON post.id = tag_post.postid
WHERE tag_post.postid NOT IN
(SELECT postid from tag_post where tagid = {certain_tagid})
I know this will work, but due to the way I've been brought up, I feel guilty (justified or not) whenever I write a query with a subquery.
Suggest a better way to do this?
回答1:
You can think of it as "find all rows in posts that do not have a match in tags (for a specific tag)"
This is the textbook use case for a LEFT JOIN.
LEFT JOIN tag_post ON post.id = tag_post.postid AND tag_post.tagid = {required_tagid}
WHERE tag_post.tag_id IS NULL
Note that you have to have the tag id in the ON clause of the join.
For a reference on join types, see here: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
回答2:
In addition to Gavin Towey's good answer, you can use a not exists subquery:
where not exists
(
select *
from tag_post
where post.id = tag_post.postid
and tag_post.tagid = {required_tagid}
)
The database typically executes both variants in the same way. I personally find the not exists approach easier to read.
回答3:
When I want to fetch posts that have tagIdA and tagIdB, I have to use two joins (which I kind of came to terms with eventually).
There are other ways.
One can obtain all the
idof all posts that are tagged with bothtagid123 and 456 by grouping filteringtag_postfor only those tags, grouping by post and then dropping any groups that contain fewer tags than expected; then one can use the result to filter thepoststable:SELECT * FROM posts WHERE id IN ( SELECT postid FROM tag_post WHERE tagid IN (123,456) GROUP BY postid HAVING COUNT(*) = 2 )If a post can be tagged with the same
tagidmultiple times, you will need to replaceCOUNT(*)with the less performantCOUNT(DISTINCT tagid).Now, I need to query posts that do not have a certain tag.
This is known as an anti-join. The easiest way is to replace
INfrom the query above withNOT IN, as you proposed. I wouldn't feel too guilty about it. The alternative is to use an outer join, as proposed in @GavinTowey's answer.
来源:https://stackoverflow.com/questions/12439820/select-a-post-that-does-not-have-a-particular-tag