Get most common value for each value of another column in SQL

生来不讨喜 2020-11-30 02:29

I have a table like this:

 Column  | Type | Modifiers 
---------+------+-----------
 country | text | 
 food_id | int  | 
 eaten   | date | 
9 answers
  • 2020-11-30 02:56

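    Try this:

    -- For each country, keep the food_id that the correlated subquery
    -- ranks first by row count (ties are broken arbitrarily by LIMIT 1).
    SELECT country, food_id
    FROM munch t1
    WHERE food_id =
        (SELECT food_id
         FROM munch t2
         WHERE t1.country = t2.country
         GROUP BY food_id
         ORDER BY COUNT(food_id) DESC
         LIMIT 1)
    GROUP BY country, food_id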
    
  • 2020-11-30 02:59

    Here is a statement which I believe gives you what you want and is simple and concise:

    select distinct on (country) country, food_id
    from munch
    group by country, food_id
    order by country, count(*) desc
    

    Please let me know what you think.

    BTW, the distinct on feature is only available in Postgres.

    Example, source data:

    country | food_id | eaten
    --------+---------+----------
    US      | 1       | 2017-1-1
    US      | 1       | 2017-1-1
    US      | 2       | 2017-1-1
    US      | 3       | 2017-1-1
    GB      | 3       | 2017-1-1
    GB      | 3       | 2017-1-1
    GB      | 2       | 2017-1-1
    

    output:

    country | food_id
    --------+--------
    GB      | 3
    US      | 1
    
  • 2020-11-30 03:02
    SELECT country, MAX( food_id )
      FROM( SELECT m1.country, m1.food_id
              FROM munch m1
             INNER JOIN ( SELECT country
                               , food_id
                               , COUNT(*) as food_counts
                            FROM munch m2
                        GROUP BY country, food_id ) as m3
                     ON m1.country = m3.country
             GROUP BY m1.country, m1.food_id 
            HAVING COUNT(*) / COUNT(DISTINCT m3.food_id) = MAX(food_counts) ) AS max_foods
      GROUP BY country
    

    I don't like using MAX(food_id) with GROUP BY to break ties. There has to be a way to bring the eaten date into the JOIN so that a tie goes to the most recently eaten food instead.

    I'm interested in the query plan for this thing if you run it on your live data!
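    One possible way to let the eaten date decide ties, sketched here with window functions rather than the JOIN above. This is only an illustration: it runs on SQLite through Python's sqlite3 module (which needs SQLite 3.25 or later for ROW_NUMBER), the FR rows are hypothetical data added to force a tie, and the dates are written as ISO text so that they sort chronologically.

```python
import sqlite3

# Sample rows from this thread plus a hypothetical FR tie (foods 4 and 5, two rows each).
ROWS = [
    ("US", 1, "2017-01-01"), ("US", 1, "2017-01-01"),
    ("US", 2, "2017-01-01"), ("US", 3, "2017-01-01"),
    ("GB", 3, "2017-01-01"), ("GB", 3, "2017-01-01"), ("GB", 2, "2017-01-01"),
    ("FR", 4, "2017-01-01"), ("FR", 4, "2017-01-02"),
    ("FR", 5, "2017-01-01"), ("FR", 5, "2017-01-03"),
]


def most_common_food_latest_wins(rows):
    """Return one (country, food_id) per country; count ties go to the latest eaten date."""
    conn = sqlite3.connect(":memory:")
    # Assumption: dates are stored as ISO-8601 text, which sorts chronologically.
    conn.execute("CREATE TABLE munch (country TEXT, food_id INTEGER, eaten TEXT)")
    conn.executemany("INSERT INTO munch VALUES (?, ?, ?)", rows)
    query = """
        SELECT country, food_id
          FROM (SELECT country, food_id,
                       ROW_NUMBER() OVER (PARTITION BY country
                                          ORDER BY freq DESC, last_eaten DESC) AS rn
                  FROM (SELECT country, food_id,
                               COUNT(*)   AS freq,
                               MAX(eaten) AS last_eaten
                          FROM munch
                         GROUP BY country, food_id) AS food_freq) AS ranked
         WHERE rn = 1
         ORDER BY country
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(most_common_food_latest_wins(ROWS))
```

    If two foods tie on both the count and the latest date, the pick is still arbitrary, so a further ORDER BY key would be needed there.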

  • 2020-11-30 03:05

    Try something like this:

    -- Count rows per (country, food_id) into a temp table (T-SQL style).
    select country, food_id, count(*) cnt 
    into #tempTbl 
    from mytable 
    group by country, food_id
    
    -- Keep the rows whose count equals the highest count for that country.
    select country, food_id
    from  #tempTbl as x
    where cnt = 
      (select max(cnt) 
      from #tempTbl 
      where country=x.country)
    

    This could all be put into a single select, but I don't have time to muck around with it right now.

    Good luck.
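    For what it's worth, here is a rough sketch of what a single-statement version might look like, with a CTE standing in for the temp table. It is an assumption-laden illustration, not the answerer's own query: it runs on SQLite through Python's sqlite3 module instead of SQL Server, the CTE name counts is made up, and the FR rows are hypothetical data added to show that tied foods are all returned.

```python
import sqlite3

# Sample rows from this thread plus a hypothetical FR tie (foods 4 and 5, two rows each).
ROWS = [
    ("US", 1, "2017-01-01"), ("US", 1, "2017-01-01"),
    ("US", 2, "2017-01-01"), ("US", 3, "2017-01-01"),
    ("GB", 3, "2017-01-01"), ("GB", 3, "2017-01-01"), ("GB", 2, "2017-01-01"),
    ("FR", 4, "2017-01-01"), ("FR", 4, "2017-01-01"),
    ("FR", 5, "2017-01-01"), ("FR", 5, "2017-01-01"),
]


def most_common_foods(rows):
    """Return every (country, food_id) whose count is the highest for its country."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (country TEXT, food_id INTEGER, eaten TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        WITH counts AS (
            -- Plays the role of the temp table: one row per (country, food_id).
            SELECT country, food_id, COUNT(*) AS cnt
              FROM mytable
             GROUP BY country, food_id
        )
        SELECT x.country, x.food_id
          FROM counts AS x
         WHERE x.cnt = (SELECT MAX(c.cnt)
                          FROM counts AS c
                         WHERE c.country = x.country)
         ORDER BY x.country, x.food_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(most_common_foods(ROWS))
```

    Unlike the LIMIT 1 and DISTINCT ON answers, this form keeps every food that shares the top count, so FR comes back twice.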

  • 2020-11-30 03:07

    PostgreSQL introduced support for window functions in 8.4, the year after this question was asked. It's worth noting that it might be solved today as follows:

    SELECT country, food_id
      FROM (SELECT country, food_id, ROW_NUMBER() OVER (PARTITION BY country ORDER BY freq DESC) AS rn
              FROM (  SELECT country, food_id, COUNT('x') AS freq
                        FROM country_foods
                    GROUP BY 1, 2) food_freq) ranked_food_req
     WHERE rn = 1;
    

    The above breaks ties arbitrarily, returning exactly one row per country. If you don't want to break ties and would rather get every tied food back, you could use DENSE_RANK() instead.
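    To make the difference concrete, here is a small sketch that runs the same query shape with either ranking function. It is only an illustration under a few assumptions: it uses SQLite through Python's sqlite3 module (SQLite 3.25 or later) rather than PostgreSQL, it drops the eaten column for brevity, the helper name top_foods is made up, and the FR rows are hypothetical data added to force a tie.

```python
import sqlite3

# Sample rows from this thread plus a hypothetical FR tie (foods 4 and 5, two rows each).
ROWS = [
    ("US", 1), ("US", 1), ("US", 2), ("US", 3),
    ("GB", 3), ("GB", 3), ("GB", 2),
    ("FR", 4), ("FR", 4), ("FR", 5), ("FR", 5),
]


def top_foods(rows, rank_function):
    """Run the ranking query with ROW_NUMBER or DENSE_RANK and return the rn = 1 rows."""
    # Only two fixed names are accepted, so formatting them into the SQL text is safe.
    if rank_function not in ("ROW_NUMBER", "DENSE_RANK"):
        raise ValueError("rank_function must be ROW_NUMBER or DENSE_RANK")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country_foods (country TEXT, food_id INTEGER)")
    conn.executemany("INSERT INTO country_foods VALUES (?, ?)", rows)
    query = f"""
        SELECT country, food_id
          FROM (SELECT country, food_id,
                       {rank_function}() OVER (PARTITION BY country
                                               ORDER BY freq DESC) AS rn
                  FROM (SELECT country, food_id, COUNT(*) AS freq
                          FROM country_foods
                         GROUP BY country, food_id) AS food_freq) AS ranked
         WHERE rn = 1
         ORDER BY country, food_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(top_foods(ROWS, "ROW_NUMBER"))  # one row per country; FR is either food 4 or 5
print(top_foods(ROWS, "DENSE_RANK"))  # both tied FR foods are kept
```

    With ROW_NUMBER() the FR row can be either food, since nothing in the ORDER BY separates them.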

  • 2020-11-30 03:10

    It is now even simpler: PostgreSQL 9.4 introduced the mode() function:

    select country, mode() within group (order by food_id)
    from munch
    group by country
    

    returns (like user2247323's example):

    country | mode
    --------------
    GB      | 3
    US      | 1
    

    See documentation here: https://wiki.postgresql.org/wiki/Aggregate_Mode

    https://www.postgresql.org/docs/current/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE
