How to exclude rows when using a LEFT JOIN (MySQL)

Submitted by 心已入冬 on 2021-02-10 08:06:46

Question


I have users with many posts. I want to build an SQL query that does the following in one query (no subquery), preferably without unions. I know I can do this with a union, but I want to learn whether it can be done using only joins.

I want to get a list of distinct active users who either:

  1. have no posts, or
  2. have no approved posts

Here's what I have so far:

SELECT DISTINCT u.*
FROM users u
  LEFT JOIN posts p
    ON p.user_id = u.id
  LEFT JOIN posts p2
    ON p2.user_id = u.id
WHERE u.status = 'active'
  AND (p.status IS NULL
  OR p2.status != 'approved');

The problem is when a user has multiple posts and one of them is approved. The query still returns that user, which I do not want. If a user has an approved post, they should be removed from the result set. Any ideas?

Here's what the data looks like:

mysql> select * from users;
+----+---------+
| id | status  |
+----+---------+
|  1 | active  |
|  2 | pending |
|  3 | pending |
|  4 | active  |
|  5 | active  |
+----+---------+
5 rows in set (0.00 sec)

mysql> select * from posts;
+----+---------+----------+
| id | user_id | status   |
+----+---------+----------+
|  1 |       1 | approved |
|  2 |       1 | pending  |
|  3 |       4 | pending  |
+----+---------+----------+
3 rows in set (0.00 sec)

The answer here should be only users 4 and 5. 4 doesn't have an approved post and 5 doesn't have a post. It should not include 1, which has an approved post.
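
To try the queries on this page, the sample data above can be recreated with a script along these lines. The column types and the primary keys are assumptions; the question shows only the rows, not the table definitions.

```sql
-- Assumed schema: only the column names and the row values come from the question.
CREATE TABLE users (
  id     INT PRIMARY KEY,
  status VARCHAR(20) NOT NULL
);

CREATE TABLE posts (
  id      INT PRIMARY KEY,
  user_id INT NOT NULL,
  status  VARCHAR(20) NOT NULL
);

INSERT INTO users (id, status) VALUES
  (1, 'active'), (2, 'pending'), (3, 'pending'), (4, 'active'), (5, 'active');

INSERT INTO posts (id, user_id, status) VALUES
  (1, 1, 'approved'), (2, 1, 'pending'), (3, 4, 'pending');
```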


Answer 1:


Taking your requirements and translating them literally to SQL, I get this:

SELECT users.id,
       COUNT(posts.id) as posts_count,
       COUNT(approved_posts.id) as approved_posts_count
FROM users
LEFT JOIN posts ON posts.user_id = users.id
LEFT JOIN posts approved_posts
  ON approved_posts.status = 'approved'
  AND approved_posts.user_id = users.id
WHERE users.status = 'active'
GROUP BY users.id
HAVING (posts_count = 0 OR approved_posts_count = 0);

For your test data above, this returns:

4|1|0
5|0|0

i.e. users with ids 4 and 5, the first of which has 1 post but no approved posts and the second of which has no posts.

However, it seems to me that this can be simplified: any user that has no posts also has no approved posts, so checking both conditions with OR is unnecessary. Checking for no approved posts covers both cases.

In that case, the SQL is simply:

SELECT users.id,
       COUNT(approved_posts.id) as approved_posts_count
FROM users
LEFT JOIN posts approved_posts
  ON approved_posts.status = 'approved'
  AND approved_posts.user_id = users.id
WHERE users.status = 'active'
GROUP BY users.id
HAVING approved_posts_count = 0;

This also returns the same two users. Am I missing something?




Answer 2:


Using NOT EXISTS:

SELECT u.*
FROM users u
WHERE u.status = 'active'
  AND NOT EXISTS (
   SELECT 1
   FROM posts p
   WHERE p.user_id = u.id AND p.status = 'approved');

Or the equivalent LEFT JOIN:

SELECT u.*
FROM users u
LEFT JOIN posts p
   ON p.user_id = u.id AND p.status = 'approved'
WHERE u.status = 'active'
  AND p.user_id IS NULL;



Answer 3:


Please explain why you don't want subqueries or UNIONs. If it is because of performance, then consider the following:

CREATE TABLE t ( PRIMARY KEY(user_id) )
    -- MIN() yields 'approved' whenever the user has an approved post,
    -- because 'approved' sorts before 'pending' (the only other status in the sample data)
    SELECT user_id, MIN(status) AS z
    FROM posts
    GROUP BY user_id;

SELECT  u.id AS user_id,
        IFNULL(t.z, 'no_posts') AS post_status
    FROM users u
    LEFT JOIN t ON t.user_id = u.id
    WHERE u.status = 'active'
    HAVING post_status != 'approved';

It will make only one pass over each table, thereby being reasonably efficient (considering the complexity of the query).
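
The MIN(status) trick depends on 'approved' sorting before every other status value. A possible variant, sketched here with a hypothetical summary table named post_summary and a hypothetical column has_approved (neither is from the answer), records the flag explicitly instead:

```sql
-- Hypothetical names: post_summary, has_approved.
CREATE TABLE post_summary ( PRIMARY KEY(user_id) )
    SELECT user_id,
           MAX(status = 'approved') AS has_approved  -- 1 if any post is approved, else 0
    FROM posts
    GROUP BY user_id;

SELECT u.id
    FROM users u
    LEFT JOIN post_summary s ON s.user_id = u.id
    WHERE u.status = 'active'
      -- users with no posts have no summary row; treat that as "none approved"
      AND COALESCE(s.has_approved, 0) = 0;
```

This relies on MySQL evaluating the comparison status = 'approved' to 1 or 0, so it is not portable SQL as written.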




Answer 4:


This one may help:

SELECT DISTINCT u.*
FROM users u
LEFT JOIN posts p ON 1=1
  -- matches only this user's posts
  AND p.user_id = u.id
  -- matches only approved posts
  AND p.status = 'approved'
WHERE 1=1
  -- matches only active users
  AND u.status = 'active'
  -- matches only users with no matches on the LEFT JOIN
  AND p.status IS NULL
;



Answer 5:


I think this should be easy.

SELECT u.`id`, u.`status` FROM `users` u
LEFT OUTER JOIN `posts` p ON p.`user_id` = u.`id` AND p.`status` = 'approved'
WHERE u.`status` = 'active' AND p.`id` IS NULL;

Gives a result of 4 & 5.

[Edit] Just wanted to add why this works:

u.status = 'active'

This excludes all users that are not active.

p.status = 'approved'

Because this condition is in the ON clause, only approved posts are joined, so a user with no approved post gets a single row with NULL in the p columns.

Hence, with these two conditions plus p.id IS NULL in the WHERE clause, only active users without an approved post remain.
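
The placement of the p.status = 'approved' test is what makes this work. A small sketch against the question's sample data (table names users and posts as in the question), showing the joined rows before the IS NULL filter and what happens if the test is moved into the WHERE clause:

```sql
-- Condition in ON: every active user keeps a row; users without an
-- approved post get NULLs in the p columns.
SELECT u.id AS user_id, p.id AS post_id, p.status AS post_status
FROM users u
LEFT JOIN posts p
  ON p.user_id = u.id AND p.status = 'approved'
WHERE u.status = 'active';
-- user_id | post_id | post_status
--       1 |       1 | approved
--       4 |    NULL | NULL
--       5 |    NULL | NULL

-- Condition in WHERE: the NULL-extended rows fail the test, so the
-- LEFT JOIN behaves like an INNER JOIN and only user 1 is left.
SELECT u.id AS user_id, p.id AS post_id, p.status AS post_status
FROM users u
LEFT JOIN posts p
  ON p.user_id = u.id
WHERE u.status = 'active'
  AND p.status = 'approved';
-- user_id | post_id | post_status
--       1 |       1 | approved
```

Adding AND p.id IS NULL to the first query keeps exactly the rows for users 4 and 5.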

[Edit 2]

If you also need to know how many pending and how many approved posts each user has, here is an updated version:

SELECT u.`id`, u.`status`,
       SUM(IF(p.`status` = 'approved', 1, 0)) AS `Approved_Posts`,
       SUM(IF(p.`status` = 'pending', 1, 0)) AS `Pending_Posts`
FROM `users` u
LEFT OUTER JOIN `posts` p ON p.`user_id` = u.`id`
WHERE u.`status` = 'active'
GROUP BY u.`id`
HAVING `Approved_Posts` = 0;



Answer 6:


Try this:

SELECT DISTINCT u.*
FROM users u LEFT JOIN posts p
    ON p.user_id = u.id
   AND p.status = 'approved'
WHERE u.status = 'active'
  AND p.id IS NULL;



Answer 7:


Can you try with the below query:

 SELECT DISTINCT u.*
    FROM users u
      LEFT JOIN posts p
        ON p.user_id = u.id
    WHERE 
        u.status = 'active' AND (
        p.user_id IS NULL
        OR p.status != 'approved');

EDIT

As per the updated question, the above query will include user 1. To prevent that without using an inner query, we can use MySQL's GROUP_CONCAT function to collect all the distinct post statuses per user and check whether the list contains 'approved'. The query below should give the desired output:

SELECT u.id, group_concat(distinct p.status) as statuses
    FROM users u
      LEFT JOIN posts p
        ON u.id = p.user_id
    WHERE 
        u.status = 'active'
group by u.id
having (statuses is null or statuses not like '%approved%');
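
One caveat with the LIKE test: it is a substring match, so a hypothetical status such as 'unapproved' would also count as approved. A sketch of an exact-match variant using MySQL's FIND_IN_SET, which searches a comma-separated list such as the one GROUP_CONCAT produces with its default separator:

```sql
SELECT u.id, GROUP_CONCAT(DISTINCT p.status) AS statuses
FROM users u
LEFT JOIN posts p
  ON u.id = p.user_id
WHERE u.status = 'active'
GROUP BY u.id
-- FIND_IN_SET returns NULL when the list is NULL (user has no posts),
-- so COALESCE maps that case to 0, meaning "not found".
HAVING COALESCE(FIND_IN_SET('approved', statuses), 0) = 0;
```

Both versions assume the concatenated list fits within group_concat_max_len (1024 bytes by default); a longer list is truncated, which could hide an 'approved' entry.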


Source: https://stackoverflow.com/questions/39588060/how-to-exclude-rows-when-using-a-left-join-mysql
