Get most common value for each value of another column in SQL

I have a table like this:

 Column  | Type | Modifiers 
---------+------+-----------
 country | text | 
 food_id | int  | 
 eaten   | date |

9 Answers

萌比男神i

2020-11-30 02:56

Try this:

SELECT country, food_id
FROM munch t1
WHERE food_id =
    (SELECT food_id
     FROM munch t2
     WHERE t1.country = t2.country
     GROUP BY food_id
     ORDER BY COUNT(food_id) DESC
     LIMIT 1)
GROUP BY country, food_id


粉色の甜心

2020-11-30 02:59

Here is a statement which I believe gives you what you want and is simple and concise:

select distinct on (country) country, food_id
from munch
group by country, food_id
order by country, count(*) desc

Please let me know what you think.

BTW, DISTINCT ON is a PostgreSQL extension, not standard SQL.

Example, source data:

 country | food_id | eaten
---------+---------+----------
 US      |       1 | 2017-1-1
 US      |       1 | 2017-1-1
 US      |       2 | 2017-1-1
 US      |       3 | 2017-1-1
 GB      |       3 | 2017-1-1
 GB      |       3 | 2017-1-1
 GB      |       2 | 2017-1-1

output:

 country | food_id
---------+---------
 US      |       1
 GB      |       3


悲哀的现实

2020-11-30 03:02

SELECT country, MAX( food_id )
  FROM( SELECT m1.country, m1.food_id
          FROM munch m1
         INNER JOIN ( SELECT country
                           , food_id
                           , COUNT(*) as food_counts
                        FROM munch m2
                    GROUP BY country, food_id ) as m3
                 ON m1.country = m3.country
         GROUP BY m1.country, m1.food_id 
        HAVING COUNT(*) / COUNT(DISTINCT m3.food_id) = MAX(food_counts) ) AS max_foods
  GROUP BY country

I don't like the MAX(.) GROUP BY to break ties... There's gotta be a way to incorporate eaten date into the JOIN in some way to arbitrarily select the most recent one...

I'm interested in the query plan for this thing if you run it on your live data!
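
One way to fold the eaten date into the tie-break, as wished for above, is to rank each country's foods by count and then by most recent date. This is only a sketch: it wraps the SQL in a small Python `sqlite3` harness so it can be run without a server, the sample rows are made up for illustration, and the window function needs SQLite 3.25 or later. The SELECT itself uses nothing SQLite-specific, so it should also run on PostgreSQL 8.4+ with a real `date` column.

```python
import sqlite3

# Hypothetical rows (not from the thread): in the US, foods 1 and 2 are tied
# at two rows each, and food 2 was eaten most recently.
rows = [
    ("US", 1, "2017-01-01"),
    ("US", 1, "2017-01-02"),
    ("US", 2, "2017-01-01"),
    ("US", 2, "2017-01-05"),
    ("GB", 3, "2017-01-01"),
    ("GB", 3, "2017-01-02"),
    ("GB", 2, "2017-01-03"),
]

conn = sqlite3.connect(":memory:")
# Dates are stored as ISO-8601 text so that MAX() and ORDER BY sort them correctly.
conn.execute("CREATE TABLE munch (country TEXT, food_id INTEGER, eaten TEXT)")
conn.executemany("INSERT INTO munch VALUES (?, ?, ?)", rows)

query = """
SELECT country, food_id
  FROM (SELECT country, food_id,
               ROW_NUMBER() OVER (
                   PARTITION BY country
                   ORDER BY freq DESC, last_eaten DESC  -- ties go to the latest date
               ) AS rn
          FROM (SELECT country, food_id,
                       COUNT(*)   AS freq,
                       MAX(eaten) AS last_eaten
                  FROM munch
                 GROUP BY country, food_id) AS food_freq) AS ranked
 WHERE rn = 1
 ORDER BY country
"""
result = conn.execute(query).fetchall()
print(result)  # [('GB', 3), ('US', 2)]
```

If two foods share both the count and the latest date, the pick is still arbitrary; adding `food_id` as a last ORDER BY key would make it deterministic.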


再見小時候

2020-11-30 03:05

Try something like this

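-- T-SQL (SQL Server): SELECT ... INTO creates the temp table #tempTbl
select country, food_id, count(*) cnt 
into #tempTbl 
from mytable 
group by country, food_id

-- keep the rows whose count equals the highest count within the same country
select country, food_id
from  #tempTbl as x
where cnt = 
  (select max(cnt) 
  from #tempTbl 
  where country=x.country)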

This could all be put into a single select, but I don't have time to muck around with it right now.

Good luck.
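
For what it's worth, the single-select version mentioned above could look roughly like the following, with a common table expression standing in for the temp table. This is a sketch, not the answerer's own query: the CTE name `counts` and the alias `y` are made up here, the rows are the example data posted earlier in this thread (dates written zero-padded), and the Python `sqlite3` wrapper is only there so the SQL can be run without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (country TEXT, food_id INTEGER, eaten TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("US", 1, "2017-01-01"),
        ("US", 1, "2017-01-01"),
        ("US", 2, "2017-01-01"),
        ("US", 3, "2017-01-01"),
        ("GB", 3, "2017-01-01"),
        ("GB", 3, "2017-01-01"),
        ("GB", 2, "2017-01-01"),
    ],
)

single_select = """
WITH counts AS (              -- per-(country, food) row counts
    SELECT country, food_id, COUNT(*) AS cnt
      FROM mytable
     GROUP BY country, food_id
)
SELECT x.country, x.food_id
  FROM counts AS x
 WHERE x.cnt = (SELECT MAX(y.cnt)      -- highest count within the same country
                  FROM counts AS y
                 WHERE y.country = x.country)
 ORDER BY x.country, x.food_id
"""
result = conn.execute(single_select).fetchall()
print(result)  # [('GB', 3), ('US', 1)]
```

Note that this form returns every food tied for the top count in a country, so a country can appear more than once.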


故里飘歌

2020-11-30 03:07
PostgreSQL introduced support for window functions in 8.4, the year after this question was asked. It's worth noting that it might be solved today as follows:
```
SELECT country, food_id
  FROM (SELECT country, food_id, ROW_NUMBER() OVER (PARTITION BY country ORDER BY freq DESC) AS rn
          FROM (  SELECT country, food_id, COUNT('x') AS freq
                    FROM country_foods
                GROUP BY 1, 2) food_freq) ranked_food_req
 WHERE rn = 1;
```
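The above breaks ties arbitrarily: ROW_NUMBER() returns exactly one food per country even when several share the top count. If you want every tied food returned, use DENSE_RANK() (or RANK()) instead.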
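
To make the tie behaviour concrete, here is a small sketch that computes both rankings side by side. Assumptions: the rows are invented so that the US has two foods tied for first place, the table has only the two columns the query needs, and the SQL is run through Python's `sqlite3` module (SQLite 3.25+ for window functions) purely for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country_foods (country TEXT, food_id INTEGER)")
conn.executemany(
    "INSERT INTO country_foods VALUES (?, ?)",
    [  # hypothetical data: US foods 1 and 2 are tied at two rows each
        ("US", 1), ("US", 1), ("US", 2), ("US", 2), ("US", 3),
        ("GB", 3), ("GB", 3), ("GB", 2),
    ],
)

query = """
SELECT country, food_id,
       ROW_NUMBER() OVER (PARTITION BY country ORDER BY freq DESC) AS rn,
       DENSE_RANK() OVER (PARTITION BY country ORDER BY freq DESC) AS dr
  FROM (SELECT country, food_id, COUNT(*) AS freq
          FROM country_foods
         GROUP BY country, food_id) AS food_freq
"""
ranked = conn.execute(query).fetchall()

# rn = 1 keeps one row per country; which tied US food wins is not specified.
row_number_winners = [(c, f) for c, f, rn, dr in ranked if rn == 1]
# dr = 1 keeps every food that shares the top count.
dense_rank_winners = sorted((c, f) for c, f, rn, dr in ranked if dr == 1)

print(len(row_number_winners))  # 2
print(dense_rank_winners)       # [('GB', 3), ('US', 1), ('US', 2)]
```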
北海茫月

2020-11-30 03:10
It is now even simpler: PostgreSQL 9.4 introduced the mode() function:
```
select country, mode() within group (order by food_id)
from munch
group by country
```
returns (like user2247323's example):
```
 country | mode
---------+------
 GB      |    3
 US      |    1
```
See documentation here: https://wiki.postgresql.org/wiki/Aggregate_Mode

https://www.postgresql.org/docs/current/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE