问题
I have a table like this:
Column | Type | Modifiers
---------+------+-----------
country | text |
food_id | int |
eaten | date |
And for each country, I want to get the food that is eaten most often. The best I can think of (I'm using postgres) is:
CREATE TEMP TABLE counts AS
SELECT country, food_id, count(*) as count FROM munch GROUP BY country, food_id;
CREATE TEMP TABLE max_counts AS
SELECT country, max(count) as max_count FROM counts GROUP BY country;
SELECT country, max(food_id) FROM counts
WHERE (country, count) IN (SELECT * from max_counts) GROUP BY country;
In that last statement, the GROUP BY and max() are needed to break ties, where two different foods have the same count.
This seems like a lot of work for something conceptually simple. Is there a more straight forward way to do it?
回答1:
PostgreSQL introduced support for window functions in 8.4, the year after this question was asked. It's worth noting that it might be solved today as follows:
SELECT country, food_id
FROM (SELECT country, food_id, ROW_NUMBER() OVER (PARTITION BY country ORDER BY freq DESC) AS rn
FROM ( SELECT country, food_id, COUNT('x') AS freq
FROM country_foods
GROUP BY 1, 2) food_freq) ranked_food_req
WHERE rn = 1;
The above will break ties. If you don't want to break ties, you could use DENSE_RANK() instead.
回答2:
It is now even simpler: PostgreSQL 9.4 introduced the mode()
function:
select mode() within group (order by food_id)
from munch
group by country
returns (like user2247323's example):
country | mode
--------------
GB | 3
US | 1
See documentation here: https://wiki.postgresql.org/wiki/Aggregate_Mode
https://www.postgresql.org/docs/current/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE
回答3:
SELECT DISTINCT
"F1"."food",
"F1"."country"
FROM "foo" "F1"
WHERE
"F1"."food" =
(SELECT "food" FROM
(
SELECT "food", COUNT(*) AS "count"
FROM "foo" "F2"
WHERE "F2"."country" = "F1"."country"
GROUP BY "F2"."food"
ORDER BY "count" DESC
) AS "F5"
LIMIT 1
)
Well, I wrote this in a hurry and didn't check it really well. The sub-select might be pretty slow, but this is shortest and most simple SQL statement that I could think of. I'll probably tell more when I'm less drunk.
PS: Oh well, "foo" is the name of my table, "food" contains the name of the food and "country" the name of the country. Sample output:
food | country
-----------+------------
Bratwurst | Germany
Fisch | Frankreich
回答4:
try this:
Select Country, Food_id
From Munch T1
Where Food_id=
(Select Food_id
from Munch T2
where T1.Country= T2.Country
group by Food_id
order by count(Food_id) desc
limit 1)
group by Country, Food_id
回答5:
Try something like this
select country, food_id, count(*) cnt
into #tempTbl
from mytable
group by country, food_id
select country, food_id
from #tempTbl as x
where cnt =
(select max(cnt)
from mytable
where country=x.country
and food_id=x.food_id)
This could be put all into a single select, but I don't have time to muck around with it right now.
Good luck.
回答6:
Here's how to do it without any temp tables:
Edit: simplified
select nf.country, nf.food_id as most_frequent_food_id
from national_foods nf
group by country, food_id
having
(country,count(*)) in (
select country, max(cnt)
from
(
select country, food_id, count(*) as cnt
from national_foods nf1
group by country, food_id
)
group by country
having country = nf.country
)
回答7:
SELECT country, MAX( food_id )
FROM( SELECT m1.country, m1.food_id
FROM munch m1
INNER JOIN ( SELECT country
, food_id
, COUNT(*) as food_counts
FROM munch m2
GROUP BY country, food_id ) as m3
ON m1.country = m3.country
GROUP BY m1.country, m1.food_id
HAVING COUNT(*) / COUNT(DISTINCT m3.food_id) = MAX(food_counts) ) AS max_foods
GROUP BY country
I don't like the MAX(.) GROUP BY to break ties... There's gotta be a way to incorporate eaten date into the JOIN in some way to arbitrarily select the most recent one...
I'm interested on the query plan for this thing if you run it on your live data!
回答8:
select country,food_id, count(*) ne
from food f1
group by country,food_id
having count(*) = (select max(count(*))
from food f2
where country = f1.country
group by food_id)
回答9:
Here is a statement which I believe gives you what you want and is simple and concise:
select distinct on (country) country, food_id
from munch
group by country, food_id
order by country, count(*) desc
Please let me know what you think.
BTW, the distinct on feature is only available in Postgres.
Example, source data:
country | food_id | eaten
US 1 2017-1-1
US 1 2017-1-1
US 2 2017-1-1
US 3 2017-1-1
GB 3 2017-1-1
GB 3 2017-1-1
GB 2 2017-1-1
output:
country | food_id
US 1
GB 3
来源:https://stackoverflow.com/questions/344665/get-most-common-value-for-each-value-of-another-column-in-sql