MYSQL, Max,Group by and Max

问题

I am having the following two table.

1.Movie Detail (Movie-ID,Movie_Name,Rating,Votes,Year)

2.Movie Genre (Movie-ID,Genre)

I am using the following query to perform join and get the movie with highest rating in each genre.

select Movie_Name, 
    max(Rating) as Rating,
    Genre from movie_test 
inner join movie_genre 
where movie_test.Movie_ID = movie_genre.Movie_ID 
group by Genre

In the output Rating and Genre are correct but the Movie_Name is incorrect.

can anyone suggest what changes I should make to get the correct movie name along with rating and genre.

回答1:

SELECT  g.*, d.*
FROM    MovieGenre g
        INNER JOIN MovieDetail d
            ON g.MovieID = d.MovieID
        INNER JOIN
        (
            SELECT  a.Genre, MAX(b.Rating) maxRating
            FROM    MovieGenre a
                    INNER JOIN MovieDetail b
                        ON a.MovieID = b.MovieID
            GROUP   BY a.Genre
        ) sub ON    g.Genre = sub.Genre AND
                    d.rating = sub.maxRating

There is something wrong with your schema design. If a Movie can have many Genre as well as Genre can be contain on many Movie, it should be a three table design.

MovieDetails Table

MovieID (PK)
MovieName
MovieRating

Genre Table

GenreID (PK)
GenreName

Movie_Genre Table

MovieID (FK) -- compound primary key with GenreID
GenreID (FK)

回答2:

This is a common MySQL problem - specifying non-aggregate/non-aggregated-by columns in an aggregate query. Other flavours of SQL do not let you do this and will warn you.

When you do a query like yours, you are selecting non-aggregate columns in an aggregated group. Since many rows share the same genre, when you select Movie_Name it picks one row at random from each group and displays that one, because there is no general algorithm to guess the row you want and return the values of that.

You might ask 'why does it pick randomly? It could pick the one that max(Rating) belongs to?' but what about other aggregate columns, like avg(Rating)? What row does it pick there? What if two rows have the same max, anyway? Therefore it cannot have an algorithm to pick a row.

To solve a problem like this, you have to restructure your query, something like:

select Movie_Name, 
    Rating,
    Genre from movie_test mt
inner join movie_genre 
where movie_test.Movie_ID = movie_genre.Movie_ID 
and Rating = (select max(Rating) from movie_test mt2 where mt.Genre = mt2.Genre
group by Genre
limit 1

This will select the row with the rating being the same as the maximum rating for that genre, using a subquery.

回答3:

Query:

SELECT t.Movie_Name,
       t.Rating,
       g.Genre
FROM movie_test t
INNER JOIN movie_genre g ON t.Movie_ID = g.Movie_ID
WHERE t.Movie_ID = (SELECT t1.Movie_ID 
                    FROM movie_test t1
                    INNER JOIN movie_genre g1 ON t1.Movie_ID = g1.Movie_ID
                    WHERE g1.Genre = g.Genre
                    ORDER BY t1.Rating DESC
                    LIMIT 1)

来源：https://stackoverflow.com/questions/15679857/mysql-max-group-by-and-max

标签

mysql

sql

greatest-n-per-group