Get Common Rows Within The Same Table

杀马特。学长 韩版系。学妹 提交于 2019-12-23 17:24:59

问题


I've had a bit of a search, but didn't find anything quite like what I'm trying to achieve.

Basically, I'm trying to find a similarity between two users' voting habits.

I have a table storing each individual vote made, which stores:

voteID
itemID     (the item the vote is attached to)
userID     (the user who voted)
direction  (whether the user voted the post up, or down)

I'm aiming to calculate the similarity between, say, users A and B, by finding out two things:

  1. The number of votes they have in common. That is, the number of times they've both voted on the same post (the direction does not matter at this point).
  2. The number of times they've voted in the same direction, on common votes.

(Then simply to calculate #2 as a percentage of #1, to achieve a crude similarity rating).

My question is, how do I find the intersection between the two users' sets of votes? (i.e. how do I calculate point #1 adequately, without looping over every vote in a highly inefficient way.) If they were in different tables, an INNER JOIN would suffice, I'd imagine... but that obviously won't work on the same table (or will it?).

Any ideas would be greatly appreciated.


回答1:


Something like this:

SELECT COUNT(*)
FROM votes v1
INNER JOIN votes v2 ON (v1.item_id = v2.item_id)
WHERE v1.userID = 'userA'
AND v2.userUD = 'userB'



回答2:


Here's an example that should get you closer:

SELECT COUNT(*)
FROM (
      SELECT u1.userID
      FROM vote u1, vote u2
      WHERE u1.itemID = u2.itemID
      AND u1.userID = user1
      AND u2.userID = user2)



回答3:


Assuming userID 1 being compared to userID 2

For finding how many votes they have in common:

SELECT COUNT(*)
FROM Votes AS v1
INNER JOIN Votes AS v2 ON (v2.userID = 2
                            AND v2.itemID = v1.itemID)
WHERE v1.userID = 1;

For finding when they also voted the same:

SELECT COUNT(*)
FROM Votes AS v1
INNER JOIN Votes AS v2 ON (v2.userID = 2
                            AND v2.itemID = v1.itemID
                            AND v2.direction = v1.direction)
WHERE v1.userID = 1;



回答4:


A self join is in order. Here it is with all you asked:

SELECT v1.userID user1, v2.userID user2,
  count(*) n_votes_in_common,
  sum(case when v1.direction = v2.direction then 1 else 0 end) n_votes_same_direction,
  (n_votes_same_direction * 100.0 / n_votes_in_common) crude_similarity_percent 
FROM votes v1
INNER JOIN votes v2
ON v1.item_id = v2.item_id



回答5:


In case you want to do this for a single user (rather than knowing both users at the start) to find to whom they are the closest match:

SELECT
     v2.userID,
     COUNT(*) AS matching_items,
     SUM(CASE WHEN v2.direction = v1.direction THEN 1 ELSE 0 END) AS matching_votes
FROM
     Votes v1
INNER JOIN Votes v2 ON
     v2.userID <> v1.userID AND
     v2.itemID = v1.itemID
WHERE
     v1.userID = @userID
GROUP BY
     v2.userID

You can then limit that however you see fit (return the top 10, top 20, all, etc.)

I haven't tested this yet, so let me know if it doesn't act as expected.




回答6:


You most certainly can join a table to itself. In fact, that's what you're going to have to do. You must use aliasing when joining a table to itself. If your table doesn't have a PK or FK, you'll have to use Union instead. Union will remove duplicates and Union All will not.



来源:https://stackoverflow.com/questions/513098/get-common-rows-within-the-same-table

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!