Question
I'm not much of a database guru, so I need some help with a query I'm working on. In my photo community project I want to visualize tags richly: besides the tag name and a counter (the number of images in the tag), I also want to show a thumbnail of the most popular image in the tag (the one with the most karma).
The table setup is as follows:
- Image table holds basic image metadata; the important field is karma
- Imagefile table holds multiple entries per image, one for each format
- Tag table holds tag definitions
- Tag_map table maps tags to images
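To make the setup above concrete, here is a minimal sketch of the four tables, created in an in-memory SQLite database through Python's sqlite3 module. The column names are the ones the queries in this thread reference; the column types, the constraints, and the use of SQLite are assumptions for illustration only, not the actual schema.

```python
import sqlite3

# Hypothetical minimal schema: only columns referenced by the queries in this
# thread are included, and all types and constraints are assumptions.
SCHEMA = """
CREATE TABLE image     (id INTEGER PRIMARY KEY, karma INTEGER NOT NULL);
CREATE TABLE imagefile (image_id INTEGER NOT NULL REFERENCES image(id),
                        type TEXT NOT NULL,      -- e.g. 'smallthumb'
                        filename TEXT NOT NULL);
CREATE TABLE tag       (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tag_map   (tag_id INTEGER NOT NULL REFERENCES tag(id),
                        image_id INTEGER NOT NULL REFERENCES image(id));
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created.
tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
print(tables)  # ['image', 'imagefile', 'tag', 'tag_map']
```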
In my usual trial and error query authoring I have come this far:
SELECT * FROM
    (SELECT tag.name, tag.id, COUNT(tag_map.tag_id) AS cnt
     FROM tag INNER JOIN tag_map ON (tag.id = tag_map.tag_id)
     INNER JOIN image ON tag_map.image_id = image.id
     INNER JOIN imagefile ON image.id = imagefile.image_id
     WHERE imagefile.type = 'smallthumb'
     GROUP BY tag.name
     ORDER BY cnt DESC)
AS T1 WHERE cnt > 0 ORDER BY cnt DESC
[column clause of inner query snipped for the sake of simplicity]
This query gives me roughly what I need. The outer query makes sure that only tags with at least one image are returned. The inner query returns the tag details, such as its name, count (number of images) and the thumb. In addition, I can sort the inner query as I want (by most images, alphabetically, most recent, etc.).
So far so good. The problem, however, is that this query does not match the most popular image (most karma) of the tag; it seems to always take the most recent one in the tag.
How can I make sure that the most popular image is matched with the tag?
Answer 1:
This should be pretty close:
SELECT
    tag.id,
    tag.name,
    tag_group.cnt,
    tag_group.max_karma,
    image.id AS image_id,
    imagefile.filename
    /* ... */
FROM
    tag
    /* join against a list of max karma values (per tag) */
    INNER JOIN (
        SELECT MAX(image.karma) AS max_karma, COUNT(*) AS cnt, tag_map.tag_id
        FROM image
        INNER JOIN tag_map ON tag_map.image_id = image.id
        GROUP BY tag_map.tag_id
    ) AS tag_group ON tag_group.tag_id = tag.id
    /* join against a list of image ids (per max karma value and tag) */
    INNER JOIN (
        SELECT MAX(image.id) AS id, tag_map.tag_id, image.karma
        FROM image
        INNER JOIN tag_map ON tag_map.image_id = image.id
        GROUP BY tag_map.tag_id, image.karma /* collapse >1 imgs with same karma */
    ) AS pop_img ON pop_img.tag_id = tag.id AND pop_img.karma = tag_group.max_karma
    /* join against actual base data (per popular image id) */
    INNER JOIN
        image ON image.id = pop_img.id
    INNER JOIN
        imagefile ON imagefile.image_id = pop_img.id AND imagefile.type = 'smallthumb'
Basically, this is the ever-recurring "max-per-group" problem: How can I select the record that corresponds to the maximum/minimum value of a group?
The general answer is always along the lines of: select your group (tag_id, MAX(image.karma)) and then join your base data against these characteristics. Some DBMSs also support a different approach, for example window functions such as ROW_NUMBER() with PARTITION BY. However, these are not available everywhere and may leave you scratching your head when working with a DBMS that does not support them.
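As a rough illustration of both approaches, the sketch below runs the group-then-join pattern and a ROW_NUMBER() variant against a tiny in-memory SQLite database from Python. The sample rows, the aliases g, p and ranked, and the choice of SQLite are assumptions made for the example. The window-function query needs SQLite 3.25 or newer, and the imagefile join is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image   (id INTEGER PRIMARY KEY, karma INTEGER);
CREATE TABLE tag     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tag_map (tag_id INTEGER, image_id INTEGER);
-- Made-up sample data: images 2 and 3 tie on karma within 'cats'.
INSERT INTO image VALUES (1, 5), (2, 9), (3, 9), (4, 2);
INSERT INTO tag VALUES (1, 'cats'), (2, 'dogs');
INSERT INTO tag_map VALUES (1, 1), (1, 2), (1, 3), (2, 4), (2, 1);
""")

# Approach 1: compute the per-tag maximum, then join back to find the image.
join_rows = conn.execute("""
    SELECT tag.name, g.cnt, g.max_karma, p.id AS image_id
    FROM tag
    INNER JOIN (
        SELECT tag_map.tag_id, MAX(image.karma) AS max_karma, COUNT(*) AS cnt
        FROM image INNER JOIN tag_map ON tag_map.image_id = image.id
        GROUP BY tag_map.tag_id
    ) AS g ON g.tag_id = tag.id
    INNER JOIN (
        -- one row per (tag, karma); ties collapse to the highest image id
        SELECT tag_map.tag_id, image.karma, MAX(image.id) AS id
        FROM image INNER JOIN tag_map ON tag_map.image_id = image.id
        GROUP BY tag_map.tag_id, image.karma
    ) AS p ON p.tag_id = tag.id AND p.karma = g.max_karma
    ORDER BY tag.name
""").fetchall()

# Approach 2: rank images inside each tag and keep rank 1 (window functions).
window_rows = conn.execute("""
    SELECT name, cnt, karma, image_id FROM (
        SELECT tag.name AS name, image.karma AS karma, image.id AS image_id,
               COUNT(*) OVER (PARTITION BY tag.id) AS cnt,
               ROW_NUMBER() OVER (PARTITION BY tag.id
                                  ORDER BY image.karma DESC, image.id DESC) AS rn
        FROM tag
        INNER JOIN tag_map ON tag_map.tag_id = tag.id
        INNER JOIN image ON image.id = tag_map.image_id
    ) AS ranked
    WHERE rn = 1
    ORDER BY name
""").fetchall()

print(join_rows)    # [('cats', 3, 9, 3), ('dogs', 2, 5, 1)]
print(window_rows)  # same rows as join_rows
```

Both queries break karma ties by taking the highest image id, so they return the same image per tag on this data. A different tie-break (for example the most recent upload) would only change the MAX(image.id) expression or the ORDER BY inside the window.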
Answer 2:
You are looking for the GROUP BY ... HAVING clause, not nested selects!
SELECT tag.name, tag.id, COUNT(tag_map.tag_id) AS cnt
FROM tag
    INNER JOIN tag_map
        ON (tag.id = tag_map.tag_id)
    INNER JOIN image
        ON tag_map.image_id = image.id
    INNER JOIN imagefile
        ON image.id = imagefile.image_id
WHERE imagefile.type = 'smallthumb'
GROUP BY tag.id, tag.name HAVING COUNT(tag_map.tag_id) > 0
ORDER BY cnt DESC
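A small sketch of the equivalence this answer relies on: filtering on the aggregate with HAVING returns the same rows as wrapping the grouped query in an outer SELECT with WHERE. It uses an in-memory SQLite database with made-up sample rows, and the image and imagefile joins are left out for brevity. All of that is an assumption for illustration, and the sketch covers only the count filter, not the choice of thumbnail by karma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tag_map (tag_id INTEGER, image_id INTEGER);
-- Made-up sample data: the 'empty' tag has no images mapped to it.
INSERT INTO tag VALUES (1, 'cats'), (2, 'dogs'), (3, 'empty');
INSERT INTO tag_map VALUES (1, 10), (1, 11), (2, 10);
""")

# Nested form: aggregate in a subquery, filter on cnt in the outer query.
nested = conn.execute("""
    SELECT * FROM (
        SELECT tag.name, tag.id, COUNT(tag_map.tag_id) AS cnt
        FROM tag INNER JOIN tag_map ON tag.id = tag_map.tag_id
        GROUP BY tag.id, tag.name
    ) AS t1
    WHERE cnt > 0
    ORDER BY cnt DESC
""").fetchall()

# HAVING form: the same filter applied directly to the grouped rows.
having = conn.execute("""
    SELECT tag.name, tag.id, COUNT(tag_map.tag_id) AS cnt
    FROM tag INNER JOIN tag_map ON tag.id = tag_map.tag_id
    GROUP BY tag.id, tag.name
    HAVING COUNT(tag_map.tag_id) > 0
    ORDER BY cnt DESC
""").fetchall()

print(nested)  # [('cats', 1, 2), ('dogs', 2, 1)]
print(having)  # same rows as nested
```

In this data the 'empty' tag has no tag_map rows, so the INNER JOIN already excludes it before either filter runs.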
Source: https://stackoverflow.com/questions/2890991/can-i-join-two-tables-whereby-the-joined-table-is-sorted-by-a-certain-column