MySQL GROUP BY behavior

倖福魔咒の 提交于 2019-12-17 06:43:31

问题


Given the following table 'foo'

ID | First Name | Last Name
----------------------------
67   John        Smith
----------------------------
67   Bill        Jacobs

What first_name and last_name will the following query return and why?

SELECT * FROM foo WHERE ID = 67 GROUP BY ID

回答1:


MySQL chooses a row arbitrarily. In practice, commonly used MySQL storage engines return the values from the first row in the group, with respect to the physical storage.

create table foo (id serial primary key, category varchar(10));

insert into foo (category) values 
  ('foo'), ('foo'), ('foo'), ('bar'), ('bar'), ('bar');

select * from foo group by category;

+----+----------+
| id | category |
+----+----------+
|  4 | bar      |
|  1 | foo      |
+----+----------+

Other folks are correct that MySQL allows you to run this query even though it has arbitrary and potentially misleading results. The SQL standard, and most other RDBMS vendors, disallow this kind of ambiguous GROUP BY query. This is called the Single-Value Rule: all columns in the select-list must be explicitly part of the GROUP BY criteria, or else inside an aggregate function, e.g. COUNT(), MAX(), etc.

MySQL supports a SQL mode ONLY_FULL_GROUP_BY that makes MySQL return an error if you try to run a query that violates SQL standard semantics.

AFAIK, SQLite is the only other RDBMS that allows ambiguous columns in a grouped query. SQLite returns values from the last row in the group:

select * from foo group by category;

6|bar
3|foo

We can imagine queries that would not be ambiguous, yet still violate the SQL standard semantics.

SELECT foo.*, parent_of_foo.* 
FROM foo JOIN parent_of_foo 
  ON (foo.parent_id = parent_of_foo.parent_id) 
GROUP BY foo_id;

There's no logical way this could produce ambiguous results. Each row in foo gets its own group, if we GROUP BY the primary key of foo. So any column from foo can have only one value in the group. Even joining to another table referenced by a foreign key in foo can have only one value per group, if the groups are defined by the primary key of foo.

MySQL and SQLite trust you to design logically unambiguous queries. Formally, every column in the select-list must be a functional dependency of the columns in the GROUP BY criteria. If you don't adhere to this, it's your fault. :-)

Standard SQL is more strict and disallows some queries that could be unambiguous--probably because it would be too complex for the RDBMS to be sure in general.




回答2:


MySQLs group by is not consistent with the standard SQL behaviour , MySQL makes it easy to get other columns BUT at the same time u can never be sure which one u will get.

Update: refer to this page: http://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html

When using this feature, all rows in each group should have the same values for the columns that are ommitted from the GROUP BY part. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.




回答3:


It is undefined, which result you are going to get.

I was always wondering why this behaviour was even allowed. Really, I wish such code would just generate an error (preferrably, a decipherable one, none of that usual MySQL's "your statement has a problem, but I don't know where" stuff).




回答4:


It's very likely that the second (last) row's first name and last name will be picked.

You can add an ORDER BY clause to give hints on how you'd like the grouped rows to be sorted.




回答5:


In standard SQL, This SQL should fail, with a server processor error something like

"firstname, and lastname cannot be included in the select clause unless they are also in the group By, or are part of an aggregate function."

Does MySql actually return data for this ?




回答6:


MySQLs group by is not consistent with the standard SQL behaviour , MySQL makes it easy to get other columns BUT at the same time u can never be sure which one u will get.

True. Actually it corresponds more to SELECT DISTINCT ON mode in postgres for instance, except this allows you to specify the order of the rows prior to distinctization (?) and therefore which row you will get (ie most recent, oldest, whatever).

Note MySQL in "sql compliant" mode will reject GROUP BY with nonspecified columns as in your example.



来源:https://stackoverflow.com/questions/1645921/mysql-group-by-behavior

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!