Left and Inner Join difference… once forever

Submitted by 折月煮酒 on 2019-12-22 08:26:21

Question


I know that many threads have been created here and elsewhere on the internet about this topic, but I still can't grasp the difference between the two statements. By trial and error I can get all the results I need from my queries, but I really don't have full control of the knife!

I consider myself a very good programmer and a very good SQL-ista, and I feel a little ashamed about this...

Here's an example:

  • I have a table with the pages of a website ("web_page")
  • a table with the categories ("category").
  • a category can contain one or more pages, but not vice versa
  • a category may contain NO pages at all
  • a page can be visible or not in the website

So if I want to show all the categories and their pages, I mean both categories with pages and without, I have to do something like this:

FROM category
LEFT JOIN web_page ON ( web_page.category_id = category.category_id AND web_page.active = "Y" )

So if a category has no pages, I'll see web_page_id NULL on the record of that category.

But if I do:

FROM category
LEFT JOIN web_page ON ( web_page.category_id = category.category_id )
...
WHERE web_page.active = "Y"...

I'll select only the categories that have at least one web_page... But WHY?

This was just an example... I'd like to understand this difference once and for all!

Thank you.


Answer 1:


To make your query work as you intended, put the condition into the ON clause:

FROM category
LEFT JOIN web_page ON web_page.category_id = category.category_id
   AND web_page.active = "Y"

The reason this works is that the WHERE clause filters the rows after they are joined. If a category has no web pages, the LEFT JOIN still produces a row for it, with every web_page column set to NULL. Comparing NULL with a value (like "Y") never evaluates to true (the result is unknown), so the WHERE clause filters those non-joining rows out.

However, when the condition is in the ON clause, it is evaluated while the join is being made, so only rows with active = "Y" are joined; if a category has no such rows, you still get that category once, with NULL web page columns.

This version of the query is really saying: "give me all categories and their active web pages (if any)"

Note that this is standard SQL behavior rather than a quirk of one product: MySQL, for example, also drops the NULL-extended rows when the condition is in the WHERE clause, so the condition has to go in the ON clause there too.
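To see the two placements side by side, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the question; the sample rows are hypothetical, and string literals use single quotes, which is the portable SQL spelling.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
# Category 1 has one active and one inactive page, category 2 has only an
# inactive page, category 3 has no pages at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (category_id INTEGER PRIMARY KEY);
    CREATE TABLE web_page (web_page_id INTEGER PRIMARY KEY,
                           category_id INTEGER, active TEXT);
    INSERT INTO category VALUES (1), (2), (3);
    INSERT INTO web_page VALUES (10, 1, 'Y'), (11, 1, 'N'), (12, 2, 'N');
""")

# Condition in the ON clause: every category survives.
on_rows = conn.execute("""
    SELECT c.category_id, w.web_page_id
    FROM category c
    LEFT JOIN web_page w
           ON w.category_id = c.category_id AND w.active = 'Y'
    ORDER BY c.category_id, w.web_page_id
""").fetchall()

# Condition in the WHERE clause: NULL-extended rows are filtered out.
where_rows = conn.execute("""
    SELECT c.category_id, w.web_page_id
    FROM category c
    LEFT JOIN web_page w ON w.category_id = c.category_id
    WHERE w.active = 'Y'
    ORDER BY c.category_id, w.web_page_id
""").fetchall()

print(on_rows)     # [(1, 10), (2, None), (3, None)]
print(where_rows)  # [(1, 10)]
```

With the condition in ON, categories 2 and 3 are kept with a NULL web_page_id; with the condition in WHERE, only the category that has an active page remains.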




Answer 2:


This happens because SQL is logically processed in stages:

  1. FROM clause (all joins are here);
  2. WHERE clause;
  3. GROUP BY clause;
  4. Window functions (not available in MySQL before version 8.0);
  5. ORDER BY clause.

So it really matters whether you filter on web_page.active='Y' or join with that same condition. In the former case, the join is done first and you then filter its results, which converts your OUTER join into an INNER one. In the latter case, you achieve the desired result, as non-matching rows produce NULL values for the corresponding columns.
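The first two stages can be made visible with a small sketch, again using Python's sqlite3 module and hypothetical sample rows (only the table and column names are taken from the question). It prints the rows produced by the FROM stage, then shows that filtering them in WHERE gives the same result as an INNER JOIN.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (category_id INTEGER PRIMARY KEY);
    CREATE TABLE web_page (web_page_id INTEGER PRIMARY KEY,
                           category_id INTEGER, active TEXT);
    INSERT INTO category VALUES (1), (2), (3);
    INSERT INTO web_page VALUES (10, 1, 'Y'), (11, 1, 'N'), (12, 2, 'N');
""")

COLUMNS = "c.category_id, w.web_page_id, w.active"
ORDER = "ORDER BY c.category_id, w.web_page_id"

# Stage 1 only: the rows the FROM clause hands to the WHERE clause.
joined = conn.execute(f"""
    SELECT {COLUMNS} FROM category c
    LEFT JOIN web_page w ON w.category_id = c.category_id
    {ORDER}
""").fetchall()

# Stages 1 and 2: the WHERE clause filters the joined rows.
left_then_where = conn.execute(f"""
    SELECT {COLUMNS} FROM category c
    LEFT JOIN web_page w ON w.category_id = c.category_id
    WHERE w.active = 'Y'
    {ORDER}
""").fetchall()

# The same query written with INNER JOIN, for comparison.
inner = conn.execute(f"""
    SELECT {COLUMNS} FROM category c
    INNER JOIN web_page w ON w.category_id = c.category_id
    WHERE w.active = 'Y'
    {ORDER}
""").fetchall()

print(joined)                    # includes (3, None, None) for the empty category
print(left_then_where == inner)  # True
```

The NULL-extended row for category 3 exists after the FROM stage; it is the WHERE stage that removes it, which is why the result matches the INNER JOIN.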




Answer 3:


Let me try to help with the understanding part.

Consider it a facet of the SQL language: when a criterion is specified within the LEFT JOIN, it applies while finding the records to match against. When it is specified in the WHERE clause at the bottom, it applies to all of the records, after the join has occurred. This has the unintended side effect of changing the LEFT JOIN into an INNER JOIN, as you have seen.

You can get around this by doing a WHERE clause like this:

WHERE COALESCE(web_page.active,"Y") = "Y"

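But that isn't actually guaranteed to give the same results: a category whose pages are all inactive still drops out, and a page whose active column is itself NULL would be treated as active. So the proper way to do it is to keep that criterion in the ON clause of the JOIN.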
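A minimal sketch of where the COALESCE workaround diverges, using Python's sqlite3 module. The sample rows are hypothetical (only the table and column names come from the question); category 2 has a single inactive page, which is the case that exposes the difference.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
# Category 2 has only an inactive page; category 3 has no pages.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (category_id INTEGER PRIMARY KEY);
    CREATE TABLE web_page (web_page_id INTEGER PRIMARY KEY,
                           category_id INTEGER, active TEXT);
    INSERT INTO category VALUES (1), (2), (3);
    INSERT INTO web_page VALUES (10, 1, 'Y'), (11, 1, 'N'), (12, 2, 'N');
""")

# Workaround: let NULL-extended rows pass the WHERE filter.
coalesce_rows = conn.execute("""
    SELECT c.category_id, w.web_page_id
    FROM category c
    LEFT JOIN web_page w ON w.category_id = c.category_id
    WHERE COALESCE(w.active, 'Y') = 'Y'
    ORDER BY c.category_id, w.web_page_id
""").fetchall()

# Recommended form: the condition lives in the ON clause.
on_rows = conn.execute("""
    SELECT c.category_id, w.web_page_id
    FROM category c
    LEFT JOIN web_page w
           ON w.category_id = c.category_id AND w.active = 'Y'
    ORDER BY c.category_id, w.web_page_id
""").fetchall()

print(coalesce_rows)  # [(1, 10), (3, None)]
print(on_rows)        # [(1, 10), (2, None), (3, None)]
```

Category 2 joins to its inactive page, so the row is not NULL-extended and the COALESCE filter removes it; the ON version keeps the category with a NULL web_page_id.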



Source: https://stackoverflow.com/questions/11329683/left-and-inner-join-difference-once-forever
