How to implement Twitter retweet action in my database

Posted by 余生颓废 on 2019-12-03 07:46:41

IMO option #1 would be better. The query to join the tweet and retweet tables is not complex and can use either a left or an inner join, depending on whether you want to show all tweets or only tweets which were retweeted. The join should also perform well: the retweet table is narrow, the columns being joined are ints, and each of them will be indexed because of the FK constraints.

Another recommendation is not to prefix all your columns with tweet or retweet; that can be inferred from the table in which the data is stored. For example:

tweet
    id
    user_id
    text
    created_at

retweet
    tweet_id
    user_id
    created_at

And sample joins:

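# Return all tweets which have been retweeted, with their retweet counts
SELECT
    count(*) AS retweet_count,
    t.id
FROM
    tweet AS t
INNER JOIN retweet AS rt ON rt.tweet_id = t.id
GROUP BY
    t.id;

# Return tweet and possible retweet data for a specific tweet
# (the rt columns are NULL when the tweet has no retweets)
SELECT
    t.id,
    rt.user_id AS retweeted_by,
    rt.created_at AS retweeted_at
FROM
    tweet AS t
LEFT OUTER JOIN retweet AS rt ON rt.tweet_id = t.id
WHERE
    t.id = :tweetId;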
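
For the "show all tweets" case mentioned earlier, here is a rough sketch of the left-join form of the count query. It assumes the tweet/retweet layout listed above; the retweet_count alias is my own naming. The point it illustrates is that the count has to be taken over a retweet column so that tweets without retweets report 0:

```sql
# List every tweet with its retweet count, including tweets never retweeted.
# COUNT(rt.tweet_id) skips the NULLs that the LEFT JOIN produces for
# unmatched tweets; COUNT(*) would report 1 for those rows instead of 0.
SELECT
    t.id,
    COUNT(rt.tweet_id) AS retweet_count
FROM
    tweet AS t
LEFT OUTER JOIN retweet AS rt ON rt.tweet_id = t.id
GROUP BY
    t.id
ORDER BY
    retweet_count DESC,
    t.id;
```

Swapping LEFT OUTER JOIN back to INNER JOIN gives you only the retweeted tweets again.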

-- Update per request --

The following is for demonstration only, to show why I would opt for option #1. There are no foreign keys or indices here; you will have to add those yourself. But the results should demonstrate that the joins won't be too painful.

CREATE TABLE `tweet` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `user_id` int(10) unsigned NOT NULL,
    `value` varchar(255) NOT NULL,
    `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;

CREATE TABLE `retweet` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `tweet_id` int(10) unsigned NOT NULL,
    `user_id` int(10) unsigned NOT NULL,
    `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
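
Since the tables above come without foreign keys or indices, here is one possible way to add them. Treat it as a sketch under a few assumptions: the constraint and index names are made up, the engine is switched to InnoDB because MyISAM parses FOREIGN KEY clauses but does not enforce them, and there is no FK on user_id because no user table is shown here.

```sql
# MyISAM would silently ignore the foreign key, so convert both tables first.
ALTER TABLE tweet ENGINE=InnoDB;
ALTER TABLE retweet ENGINE=InnoDB;

# uq_retweet_tweet_user: at most one retweet per user per tweet; its leading
#   column (tweet_id) also serves the join and the foreign key.
# idx_retweet_user_created: supports "retweets by user, newest first".
ALTER TABLE retweet
    ADD UNIQUE KEY uq_retweet_tweet_user (tweet_id, user_id),
    ADD KEY idx_retweet_user_created (user_id, created_at),
    ADD CONSTRAINT fk_retweet_tweet
        FOREIGN KEY (tweet_id) REFERENCES tweet (id)
        ON DELETE CASCADE;

# Supports "tweets by user, newest first".
ALTER TABLE tweet
    ADD KEY idx_tweet_user_created (user_id, created_at);
```

ON DELETE CASCADE is a design choice (deleting a tweet removes its retweets); drop it if you would rather have the delete rejected.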

# Sample Rows

mysql> select * from tweet;
+----+---------+----------------+---------------------+
| id | user_id | value          | created_at          |
+----+---------+----------------+---------------------+
|  1 |       1 | User1 | Tweet1 | 2012-07-27 00:04:30 |
|  2 |       1 | User1 | Tweet2 | 2012-07-27 00:04:35 |
|  3 |       2 | User2 | Tweet1 | 2012-07-27 00:04:47 |
|  4 |       3 | User3 | Tweet1 | 2012-07-27 00:04:58 |
|  5 |       1 | User1 | Tweet3 | 2012-07-27 00:06:47 |
|  6 |       1 | User1 | Tweet4 | 2012-07-27 00:06:50 |
|  7 |       1 | User1 | Tweet5 | 2012-07-27 00:06:54 |
+----+---------+----------------+---------------------+

mysql> select * from retweet;
+----+----------+---------+---------------------+
| id | tweet_id | user_id | created_at          |
+----+----------+---------+---------------------+
|  1 |        4 |       1 | 2012-07-27 00:06:37 |
|  2 |        3 |       1 | 2012-07-27 00:07:11 |
+----+----------+---------+---------------------+

# Query to pull all tweets for user_id = 1, including retweets, ordered from newest to oldest

select * from (
    select t.* from tweet as t where user_id = 1
    union
    select t.* from tweet as t where t.id in (select tweet_id from retweet where user_id = 1))
a order by created_at desc;

mysql> select * from (select t.* from tweet as t where user_id = 1 union select t.* from tweet as t where t.id in (select tweet_id from retweet where user_id = 1)) a order by created_at desc;
+----+---------+----------------+---------------------+
| id | user_id | value          | created_at          |
+----+---------+----------------+---------------------+
|  7 |       1 | User1 | Tweet5 | 2012-07-27 00:06:54 |
|  6 |       1 | User1 | Tweet4 | 2012-07-27 00:06:50 |
|  5 |       1 | User1 | Tweet3 | 2012-07-27 00:06:47 |
|  4 |       3 | User3 | Tweet1 | 2012-07-27 00:04:58 |
|  3 |       2 | User2 | Tweet1 | 2012-07-27 00:04:47 |
|  2 |       1 | User1 | Tweet2 | 2012-07-27 00:04:35 |
|  1 |       1 | User1 | Tweet1 | 2012-07-27 00:04:30 |
+----+---------+----------------+---------------------+

Notice in the last set of results that we were also able to include the retweets, and that the retweet of #4 is displayed before the retweet of #3 (the rows are ordered by each tweet's own created_at).

-- Update --

You can accomplish what you are asking for by changing the query a bit:

select * from (
    select t.id, t.value, t.created_at from tweet as t where user_id = 1
    union
    select t.id, t.value, rt.created_at from tweet as t inner join retweet as rt on rt.tweet_id = t.id where rt.user_id = 1)
a order by created_at desc;

mysql> select * from (select t.id, t.value, t.created_at from tweet as t where user_id = 1 union select t.id, t.value, rt.created_at from tweet as t inner join retweet as rt on rt.tweet_id = t.id where rt.user_id = 1) a order by created_at desc;
+----+----------------+---------------------+
| id | value          | created_at          |
+----+----------------+---------------------+
|  3 | User2 | Tweet1 | 2012-07-27 00:07:11 |
|  7 | User1 | Tweet5 | 2012-07-27 00:06:54 |
|  6 | User1 | Tweet4 | 2012-07-27 00:06:50 |
|  5 | User1 | Tweet3 | 2012-07-27 00:06:47 |
|  4 | User3 | Tweet1 | 2012-07-27 00:06:37 |
|  2 | User1 | Tweet2 | 2012-07-27 00:04:35 |
|  1 | User1 | Tweet1 | 2012-07-27 00:04:30 |
+----+----------------+---------------------+
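
If you also need to tell retweets apart from the user's own tweets in that timeline, one option is to carry a flag and the original author through both halves of the union. This is only a sketch on top of the query above; the author_id and is_retweet aliases are names I made up for illustration:

```sql
# Timeline for user_id = 1, newest first, with retweets marked.
# UNION ALL is enough here: the is_retweet flag already makes rows from the
# two branches distinct, so the duplicate elimination of UNION adds nothing.
SELECT *
FROM (
    SELECT t.id, t.user_id AS author_id, t.value, t.created_at,
           0 AS is_retweet
    FROM tweet AS t
    WHERE t.user_id = 1
    UNION ALL
    SELECT t.id, t.user_id AS author_id, t.value, rt.created_at,
           1 AS is_retweet
    FROM tweet AS t
    INNER JOIN retweet AS rt ON rt.tweet_id = t.id
    WHERE rt.user_id = 1
) AS a
ORDER BY a.created_at DESC;
```

For retweet rows, created_at is the time of the retweet and author_id is the original poster, so the application can render "retweeted from user N".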

I would choose option 2 with a slight modification: the parent_id column in the tweets table should point to the row itself when the tweet is not a retweet. Querying then becomes very easy:

SELECT tm.Id, tm.UserId, tc.Text, tm.Created,
    CASE WHEN tm.Id <> tc.Id THEN tc.UserId ELSE NULL END AS OriginalAsker
FROM tweet tm
LEFT JOIN tweet tc ON tm.ParentId = tc.Id
ORDER BY tm.Created DESC;

(tc is the parent row, the one with the content: it has the tweet's text, the original poster's Id, etc.)

The reason for the rule that a non-retweet points to itself is that it makes it easy to add more joins to the original tweet: you just join a table with tc and don't care whether the row is a retweet or not.

Not only is the query easy, it will also perform much better than option 1, because sorting is done on a single physical column, which can be indexed.

The only drawback is that the DB will be a little bit larger.
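
To make the self-pointing rule concrete, here is a minimal sketch of what the single table and its inserts could look like. Everything beyond the column names used in the query above is an assumption on my part: the column types, the index and constraint names, a nullable ParentId (needed because the auto-increment Id is not known until after the INSERT), and a fresh database so the table name does not clash with the option 1 schema.

```sql
# Hypothetical single-table layout for option 2.
CREATE TABLE tweet (
    Id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
    UserId   INT UNSIGNED NOT NULL,
    ParentId INT UNSIGNED NULL,        # NULL only between INSERT and UPDATE
    `Text`   VARCHAR(255) NULL,        # left NULL on retweet rows
    Created  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (Id),
    KEY idx_tweet_created (Created),   # the single sort column
    KEY idx_tweet_parent (ParentId),
    CONSTRAINT fk_tweet_parent FOREIGN KEY (ParentId) REFERENCES tweet (Id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

# Original tweet: insert it, then point the row at itself.
START TRANSACTION;
INSERT INTO tweet (UserId, `Text`) VALUES (1, 'User1 | Tweet1');
SET @original_id = LAST_INSERT_ID();
UPDATE tweet SET ParentId = Id WHERE Id = @original_id;
COMMIT;

# Retweet by user 2: no text of its own, ParentId names the original.
INSERT INTO tweet (UserId, ParentId) VALUES (2, @original_id);
```

The two-step insert for originals is the price of the self-pointing rule; the retweet insert stays a single statement.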
