Question
Using the SQL query
select u.name,count(u.name) as 'followers'
from user u,follow f
where u.type = 'c' AND f.followee = u.email
group by u.name
gets me the correct value for all users in my database. However, the exact same query without the group by line only gives me the first value. I am learning SQL for the first time and am having a hard time figuring out why this is.
Answer 1:
When you use COUNT without GROUP BY, it counts all the matching records and returns a single row. When you use COUNT with GROUP BY, it groups the users by name and returns the count for each group.
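The difference can be reproduced with a small script. This is only a sketch for MySQL: the table definitions, the follower column, and the sample rows are assumptions inferred from the columns used in the question, not the asker's real schema.

```sql
-- Assumed schema; only name, type, email and followee appear in the question.
DROP TABLE IF EXISTS follow;
DROP TABLE IF EXISTS user;
CREATE TABLE user (
    email VARCHAR(100) PRIMARY KEY,
    name  VARCHAR(50) NOT NULL,
    type  CHAR(1)     NOT NULL
);
CREATE TABLE follow (
    follower VARCHAR(100) NOT NULL,  -- hypothetical column
    followee VARCHAR(100) NOT NULL
);
INSERT INTO user (email, name, type) VALUES
    ('ann@example.com', 'Ann', 'c'),
    ('bob@example.com', 'Bob', 'c'),
    ('cat@example.com', 'Cat', 'p');
INSERT INTO follow (follower, followee) VALUES
    ('bob@example.com', 'ann@example.com'),
    ('cat@example.com', 'ann@example.com'),
    ('cat@example.com', 'bob@example.com');

-- Without GROUP BY: all matching rows form one group, so one row comes back.
SELECT COUNT(u.name) AS followers
FROM user u, follow f
WHERE u.type = 'c' AND f.followee = u.email;
-- followers = 3

-- With GROUP BY: one row per distinct name.
SELECT u.name, COUNT(u.name) AS followers
FROM user u, follow f
WHERE u.type = 'c' AND f.followee = u.email
GROUP BY u.name;
-- Rows (order not guaranteed): ('Ann', 2), ('Bob', 1)
```

The first SELECT leaves u.name out on purpose, so it is valid in any SQL mode.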
Answer 2:
"the exact same query without the group by line only gives me the first value."
Not quite.
The query without GROUP BY looks like this:
select u.name, count(u.name) as 'followers'
from user u, follow f
where u.type = 'c' AND f.followee = u.email
The query uses COUNT(), which is an aggregate function. Aggregate functions are normally used together with a GROUP BY clause. When the clause is missing, SQL does not reject the aggregate itself; it creates a single group from all the rows filtered by the WHERE clause, so the result has exactly one row.
Your query without the GROUP BY clause is nevertheless invalid, because it also selects u.name, which is neither aggregated nor grouped.
This is how GROUP BY queries work:
1. The rows filtered by the WHERE clause are grouped; all the rows in a group have the same value for the first expression in the GROUP BY clause.
2. If the GROUP BY clause contains two or more expressions, each group created in the first step is split into sub-groups using the second expression from the GROUP BY clause.
3. Step 2 is repeated for each subsequent expression in the GROUP BY clause, creating nested sub-groups.
4. A single row is computed from each group created in the previous step; the values of this row are computed using only the values of the rows contained in the group.
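The grouping steps above can be seen with a GROUP BY on two expressions. This is only a sketch for MySQL: the table definitions, the follower column, and the sample rows are assumptions inferred from the columns used in the question, not the asker's real schema.

```sql
-- Assumed schema; only name, type, email and followee appear in the question.
DROP TABLE IF EXISTS follow;
DROP TABLE IF EXISTS user;
CREATE TABLE user (
    email VARCHAR(100) PRIMARY KEY,
    name  VARCHAR(50) NOT NULL,
    type  CHAR(1)     NOT NULL
);
CREATE TABLE follow (
    follower VARCHAR(100) NOT NULL,  -- hypothetical column
    followee VARCHAR(100) NOT NULL
);
INSERT INTO user (email, name, type) VALUES
    ('ann@example.com', 'Ann', 'c'),
    ('bob@example.com', 'Bob', 'c'),
    ('cat@example.com', 'Cat', 'p');
INSERT INTO follow (follower, followee) VALUES
    ('bob@example.com', 'ann@example.com'),
    ('cat@example.com', 'ann@example.com'),
    ('cat@example.com', 'bob@example.com'),
    ('ann@example.com', 'cat@example.com');

-- The joined rows are first grouped by u.type ('c' and 'p'),
-- each of those groups is then split by u.name,
-- and finally one output row is computed per sub-group.
SELECT u.type, u.name, COUNT(*) AS followers
FROM user u, follow f
WHERE f.followee = u.email
GROUP BY u.type, u.name;
-- Rows (order not guaranteed): ('c', 'Ann', 2), ('c', 'Bob', 1), ('p', 'Cat', 1)
```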
If a column or an expression in the SELECT clause does not use an aggregate function and is not present in the GROUP BY clause, then some groups may contain rows with different values for that column or expression. There is then no single value to return for the group; this is an error.
To prevent this, the SQL standard allows in the SELECT clause only expressions that satisfy one of these conditions:
1. the expression also appears in the GROUP BY clause;
2. the expression is computed using an aggregate function;
3. all the columns used by the expression are functionally dependent on the columns that appear in the GROUP BY clause.
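The third condition is the least obvious one, so here is a sketch of it. It assumes that email is the primary key of the user table (the question does not say so) and that the server is MySQL 5.7.5 or later, which detects this kind of functional dependence; the follower column and the sample rows are also hypothetical.

```sql
-- Assumed schema; the PRIMARY KEY on email is what makes name depend on email.
DROP TABLE IF EXISTS follow;
DROP TABLE IF EXISTS user;
CREATE TABLE user (
    email VARCHAR(100) PRIMARY KEY,
    name  VARCHAR(50) NOT NULL,
    type  CHAR(1)     NOT NULL
);
CREATE TABLE follow (
    follower VARCHAR(100) NOT NULL,  -- hypothetical column
    followee VARCHAR(100) NOT NULL
);
INSERT INTO user (email, name, type) VALUES
    ('ann@example.com', 'Ann', 'c'),
    ('bob@example.com', 'Bob', 'c'),
    ('cat@example.com', 'Cat', 'p');
INSERT INTO follow (follower, followee) VALUES
    ('bob@example.com', 'ann@example.com'),
    ('cat@example.com', 'ann@example.com'),
    ('cat@example.com', 'bob@example.com');

-- u.name is not in GROUP BY and not aggregated, but every row of a group
-- shares the same u.email, and the primary key determines exactly one name.
SELECT u.name, COUNT(*) AS followers
FROM user u, follow f
WHERE u.type = 'c' AND f.followee = u.email
GROUP BY u.email;
-- Rows (order not guaranteed): ('Ann', 2), ('Bob', 1)
```

Grouping by the key also keeps two different users who happen to share a name in separate groups, which GROUP BY u.name would merge.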
Let's analyze the expressions in the SELECT clause of your query:
- u.name: in the initial query it satisfies condition #1; in the query without GROUP BY it doesn't satisfy any condition. This makes the query invalid SQL.
- count(u.name): it satisfies condition #2 in both versions of the query; it doesn't cause any problems.
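Even though the version of the query without GROUP BY is not valid SQL, MySQL before version 5.7.5 accepts it by default, but reserves the right to return indeterminate values for the invalid expressions (u.name). From version 5.7.5 on, the ONLY_FULL_GROUP_BY SQL mode is enabled by default and such a query is rejected.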
A quote from the documentation:
"In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause."
In plain English, this means that your query without GROUP BY returns the correct value for followers (the total across all matching rows), but the value returned for name can differ between executions of the same query. You will probably not observe this by simply running the query several times, but it is likely to show up after you add or remove rows, after you back up the table, truncate it and restore it from the backup, or after you recreate it on a different machine or a different version of MySQL.
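A sketch of how this looks on MySQL 5.7.5 or later follows. The schema, the follower column, and the sample rows are assumptions, as is the choice to replace the whole session SQL mode just for the demonstration. ANY_VALUE() is a MySQL function that states explicitly that an arbitrary value from the group is acceptable.

```sql
-- Assumed schema; only name, type, email and followee appear in the question.
DROP TABLE IF EXISTS follow;
DROP TABLE IF EXISTS user;
CREATE TABLE user (
    email VARCHAR(100) PRIMARY KEY,
    name  VARCHAR(50) NOT NULL,
    type  CHAR(1)     NOT NULL
);
CREATE TABLE follow (
    follower VARCHAR(100) NOT NULL,  -- hypothetical column
    followee VARCHAR(100) NOT NULL
);
INSERT INTO user (email, name, type) VALUES
    ('ann@example.com', 'Ann', 'c'),
    ('bob@example.com', 'Bob', 'c'),
    ('cat@example.com', 'Cat', 'p');
INSERT INTO follow (follower, followee) VALUES
    ('bob@example.com', 'ann@example.com'),
    ('cat@example.com', 'ann@example.com'),
    ('cat@example.com', 'bob@example.com');

-- Demonstration only: this replaces every other mode set for the session.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode on, the query from the question without GROUP BY is rejected.
-- It is left commented out so the script runs to the end:
-- SELECT u.name, COUNT(u.name) AS followers
-- FROM user u, follow f
-- WHERE u.type = 'c' AND f.followee = u.email;

-- Accepted: the count is well defined, the name is explicitly arbitrary.
SELECT ANY_VALUE(u.name) AS some_name, COUNT(u.name) AS followers
FROM user u, follow f
WHERE u.type = 'c' AND f.followee = u.email;
-- followers = 3; some_name is 'Ann' or 'Bob', whichever the server picks
```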
Source: https://stackoverflow.com/questions/40503132/mysql-count-only-returning-one-result-unless-using-group-by