ranking

How do I preserve continuous (1,2,3,…n) ranking notation when ranking in R?

|▌冷眼眸甩不掉的悲伤 submitted on 2019-11-29 12:12:53
If I want to rank a set of numbers using the minimum rank for shared cases (aka ties):
    dat <- c(13,13,14,15,15,15,15,15,15,16,17,22,45,46,112)
    rank(dat, ties = 'min')
I get the results:
    1 1 3 4 4 4 4 4 4 10 11 12 13 14 15
However, I want the rank to be a continuous series consisting of 1, 2, 3, ... n, where n is the number of unique ranks. Is there a way to make rank (or a similar function) rank a series of numbers by assigning ties to the lowest rank as above but, instead of skipping subsequent rank values by the number of previous ties, continue ranking from the previous rank? For
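What the question describes is usually called dense ranking. As a minimal sketch of the idea, written in Python rather than R, the following maps each distinct value to its position among the sorted distinct values. The function name `dense_rank` is hypothetical and not taken from the original thread.

```python
def dense_rank(values):
    """Rank values so that ties share a rank and no rank number is skipped."""
    # Map each distinct value to its 1-based position among the sorted distinct values.
    position = {v: i for i, v in enumerate(sorted(set(values)), start=1)}
    # Look up every original value, preserving the input order.
    return [position[v] for v in values]


dat = [13, 13, 14, 15, 15, 15, 15, 15, 15, 16, 17, 22, 45, 46, 112]
print(dense_rank(dat))
```

The largest rank returned equals the number of distinct values, which is the "n" the question asks for.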

MySQL Rank with ties

孤街浪徒 submitted on 2019-11-29 07:01:44
I am new to SQL and have never used variables or conditions in MySQL, but I know them from other programming languages. For a few days I have been trying to find a way to rank a user score. I read a lot of articles, as well as questions asked on Stack Overflow, and finally found a solution that does nearly what I want:
    SELECT score_users.uid, score_users.score,
           @prev := @curr,
           @curr := score,
           @rank := IF(@prev = @curr, @rank, @rank + 1) AS rank
    FROM score_users, (SELECT @curr := null, @prev := null, @rank := 0) tmp_tbl
    WHERE score_users.matchday = 1
    ORDER BY score_users.score DESC
But my problems
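To make the `@prev` / `@curr` / `@rank` bookkeeping in the query easier to follow, here is a rough Python sketch of the same walk over scores in descending order. The function name `rank_descending` and the `skip_after_ties` flag are assumptions added for illustration; the excerpt is cut off before it says which tie behaviour the asker actually wants.

```python
def rank_descending(scores, skip_after_ties=False):
    """Walk scores from high to low, giving equal scores the same rank.

    With skip_after_ties=False the rank only grows by one when the score
    changes (1, 2, 2, 3), mirroring the IF(@prev = @curr, ...) logic.
    With skip_after_ties=True the rank jumps to the row position after a
    tie (1, 2, 2, 4).
    """
    ranked = []
    prev = None  # plays the role of @prev
    rank = 0     # plays the role of @rank
    for position, score in enumerate(sorted(scores, reverse=True), start=1):
        if score != prev:
            rank = position if skip_after_ties else rank + 1
        ranked.append((score, rank))
        prev = score
    return ranked


print(rank_descending([40, 50, 30, 40]))
print(rank_descending([40, 50, 30, 40], skip_after_ties=True))
```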

mysql: group by ID, get highest priority per each ID

我与影子孤独终老i submitted on 2019-11-29 02:37:40
I have the following MySQL table called "pics", with the following fields and sample data:
    id  vehicle_id  filename  priority
    1   45          a.jpg     4
    2   45          b.jpg     1
    3   56          f.jpg     4
    4   67          cc.jpg    4
    5   45          kt.jpg    3
    6   67          gg.jpg    1
Is it possible, in a single query, to get one row for each vehicle_id, with that row being the one with the highest priority? The result I'm looking for:
    array (
      [0] => array( [id] => '2', [vehicle_id] => '45', [filename] => 'b.jpg', [priority] => '1' ),
      [1] => array( [id] => '3', [vehicle_id] => '56', [filename] => 'f.jpg', [priority] => '4' ),
      [2] => array( [id] => '6', [vehicle_id] => '67', [filename] => 'gg
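Judging by the expected result, "highest priority" here means the smallest priority number (vehicle 45 resolves to b.jpg with priority 1). As a sketch of that selection rule only, not of the SQL the asker needs, the Python below keeps the row with the lowest priority value per vehicle_id. The function name `top_priority_per_vehicle` is hypothetical.

```python
def top_priority_per_vehicle(rows):
    """Keep, for each vehicle_id, the row whose priority number is smallest."""
    best = {}
    for row in rows:
        current = best.get(row["vehicle_id"])
        # Assumption: a lower number means a higher priority, as in the sample result.
        if current is None or row["priority"] < current["priority"]:
            best[row["vehicle_id"]] = row
    # Return the winners ordered by vehicle_id for a stable result.
    return [best[vehicle_id] for vehicle_id in sorted(best)]


pics = [
    {"id": 1, "vehicle_id": 45, "filename": "a.jpg", "priority": 4},
    {"id": 2, "vehicle_id": 45, "filename": "b.jpg", "priority": 1},
    {"id": 3, "vehicle_id": 56, "filename": "f.jpg", "priority": 4},
    {"id": 4, "vehicle_id": 67, "filename": "cc.jpg", "priority": 4},
    {"id": 5, "vehicle_id": 45, "filename": "kt.jpg", "priority": 3},
    {"id": 6, "vehicle_id": 67, "filename": "gg.jpg", "priority": 1},
]
for row in top_priority_per_vehicle(pics):
    print(row)
```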

Simplest way to plot changes in ranking between two ordered lists in R?

ぃ、小莉子 submitted on 2019-11-29 02:04:05
I'm wondering if there is an easy way to plot the changes in position of elements between two lists in the form of a directed bipartite graph in R. For example, list 1 and list 2 are vectors of character strings, not necessarily containing the same elements:
    list.1 <- c("a","b","c","d","e","f","g")
    list.2 <- c("b","x","e","c","z","d","a")
I would like to generate something similar to: (image not included)
I've had a slight bash at using the igraph package, but couldn't easily construct what I would like, which I imagine and hope shouldn't be too hard. Cheers.
Here is a simple function to do what you want. Essentially it
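The answer's function is cut off in this excerpt, so the following is not a reconstruction of it. It is only a small Python sketch of the data such a plot needs: for every element present in both lists, its position in the first list and its position in the second, which together define one edge of the bipartite graph. The name `rank_changes` is hypothetical.

```python
def rank_changes(first, second):
    """Return (element, position in first, position in second) for shared elements."""
    # 1-based position of every element in the second list.
    position_in_second = {item: i for i, item in enumerate(second, start=1)}
    return [
        (item, i, position_in_second[item])
        for i, item in enumerate(first, start=1)
        if item in position_in_second  # elements missing from the second list get no edge
    ]


list_1 = ["a", "b", "c", "d", "e", "f", "g"]
list_2 = ["b", "x", "e", "c", "z", "d", "a"]
for element, old, new in rank_changes(list_1, list_2):
    print(element, old, "->", new)
```

Each tuple can then be drawn as a line from height `old` in the left column to height `new` in the right column with whatever plotting tool is at hand.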

How to remove duplicate search result in elasticsearch?

我是研究僧i submitted on 2019-11-29 02:02:50
First, create some example data (e1, e2, e3 are types and test is the index name):
    PUT test/e1/1 { "id": 1, "subject": "subject 1" }
    PUT test/e2/1 { "id": 1, "subject": "subject 2" }
    PUT test/e3/2 { "id": 2, "subject": "subject 3" }
Now my question is: how can I get just these two documents, removing duplicates with the same id from the curl -XGET _search result?
    test/e1/1 { "id": 1, "subject": "subject 1" }
    test/e3/2 { "id": 2, "subject": "subject 3" }
Francois Combet
First you will need to search across multiple indices. Then, on the result, remove the duplicate IDs.
    POST http://myElastic.com/test/e1,e2,e3/
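The second step of the answer, removing duplicate IDs from the result, can be done on the client side. A minimal Python sketch follows, assuming the hits have already been fetched and keep the usual `_source` wrapper of a search response. The function name `dedupe_hits` and the sample hit list are made up for illustration; no Elasticsearch client is involved.

```python
def dedupe_hits(hits, key="id"):
    """Keep only the first hit seen for each value of _source[key]."""
    seen = set()
    unique = []
    for hit in hits:
        value = hit["_source"][key]
        if value not in seen:
            seen.add(value)
            unique.append(hit)
    return unique


# Hand-written stand-in for the "hits" array of a search response.
hits = [
    {"_type": "e1", "_id": "1", "_source": {"id": 1, "subject": "subject 1"}},
    {"_type": "e2", "_id": "1", "_source": {"id": 1, "subject": "subject 2"}},
    {"_type": "e3", "_id": "2", "_source": {"id": 2, "subject": "subject 3"}},
]
for hit in dedupe_hits(hits):
    print(hit["_type"], hit["_source"])
```

Which duplicate survives depends on the order of the hits, so the search should be sorted in whatever order makes the preferred document come first.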

How to balance number of ratings versus the ratings themselves?

百般思念 submitted on 2019-11-28 22:24:16
Question: For a school project, we'll have to implement a ranking system. However, we figured that a dumb rank average would suck: something that one user ranked 5 stars would have a better average than something 188 users ranked 4 stars, and that's just stupid. So I'm wondering if any of you have an example algorithm for "smart" ranking. It only needs to take into account the rankings given and the number of rankings. Thanks!
Answer 1: You can use a method inspired by Bayesian probability. The gist of the
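The answer is cut off before it gives a formula, so what follows is only one common form of a Bayesian-style average, not necessarily the one the answerer had in mind. It pretends every item starts with `prior_weight` imaginary votes at `prior_mean` stars, so items with few real ratings stay close to the prior. Both parameter values and the name `bayesian_average` are assumptions chosen for illustration.

```python
def bayesian_average(ratings, prior_mean=3.0, prior_weight=10):
    """Average the ratings together with prior_weight pseudo-votes at prior_mean."""
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + len(ratings))


one_five_star = bayesian_average([5])
many_four_star = bayesian_average([4] * 188)
print(round(one_five_star, 3), round(many_four_star, 3))
```

With these assumed settings the item rated 4 stars by 188 users scores higher than the item with a single 5-star rating, which is the behaviour the question asks for.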

Percentage rank of matches using Levenshtein Distance matching

只愿长相守 submitted on 2019-11-28 20:20:21
I am trying to match a single search term against a dictionary of possible matches using a Levenshtein distance algorithm. The algorithm returns a distance expressed as the number of operations required to convert the search string into the matched string. I want to present the results in a ranked percentage list of the top "N" (say 10) matches. Since the search string can be longer or shorter than the individual dictionary strings, what would be an appropriate logic for expressing the distance as a percentage, which would qualitatively reflect how close "as a percentage" each result is to the query
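One common normalisation, offered here as a sketch and not as the accepted answer, divides the distance by the length of the longer string, since that length is the largest distance the two strings can have. The Python below implements the standard dynamic-programming distance and that percentage. The names `similarity_percent` and `top_matches` are hypothetical.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions and substitutions from a to b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similarity_percent(query, candidate):
    """100 for identical strings, 0 when every position has to change."""
    longest = max(len(query), len(candidate))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(query, candidate) / longest)


def top_matches(query, dictionary, n=10):
    """Return the n best (percentage, word) pairs, best first."""
    scored = [(similarity_percent(query, word), word) for word in dictionary]
    return sorted(scored, key=lambda pair: -pair[0])[:n]


print(top_matches("kitten", ["sitting", "kitten", "mitten", "dog"], n=3))
```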

Ranking with millions of entries

|▌冷眼眸甩不掉的悲伤 submitted on 2019-11-28 16:00:22
I'm working on a server for an online game which should be able to handle millions of players. Now the game needs leaderboards and wants to be able to show a player's current position and possibly other players near the current player's position, as well as the positions of the player's friends. Now I've done this stuff before in MySQL and I know how it's technically possible; however, I figured that since this is a common practice for a lot of online games, there must be existing libraries or databases particularly for this purpose? Can anyone advise me what database is the best for these types of
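The excerpt names no particular database, so the following only illustrates the two queries the question describes: a player's position and the players around it. It is a toy in-memory Python sketch over a sorted list, with a hypothetical `Leaderboard` class. Insertion into a plain list is linear, so this is not a recommendation for millions of players.

```python
import bisect


class Leaderboard:
    """Toy leaderboard kept as a sorted list of (-score, player) pairs."""

    def __init__(self):
        self._entries = []  # negated scores keep the highest score first

    def add(self, player, score):
        bisect.insort(self._entries, (-score, player))

    def rank(self, player, score):
        # 1-based position, found by binary search.
        return bisect.bisect_left(self._entries, (-score, player)) + 1

    def neighbours(self, player, score, radius=1):
        # The player together with up to `radius` entries above and below.
        index = bisect.bisect_left(self._entries, (-score, player))
        start = max(0, index - radius)
        return [(name, -negated) for negated, name in self._entries[start:index + radius + 1]]


board = Leaderboard()
for name, points in [("alice", 300), ("bob", 250), ("carol", 400), ("dave", 100)]:
    board.add(name, points)
print(board.rank("alice", 300))
print(board.neighbours("bob", 250))
```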

Hot content algorithm / score with time decay

守給你的承諾、 submitted on 2019-11-28 15:15:15
I have been reading and researching algorithms and formulas to work out a score for my user-submitted content, so that currently hot / trending items are displayed higher up the list; however, I'll admit I'm a little over my head here. I'll give some background on what I'm after... users upload audio to my site, and audios have several actions: Played, Downloaded, Liked, Favorited. Ideally I want an algorithm where I can update an audio's score each time a new activity is logged (played, downloaded, etc.); also, a download action is worth more than a play, a like more than a download, and a favourite more than a like.
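One way to meet both requirements, incremental updates and weighted actions, is an exponentially decayed running score. The Python below is a sketch under stated assumptions: the weights, the 24-hour half-life and the name `updated_score` are all invented for illustration, and only the ordering play < download < like < favorite comes from the question.

```python
# Assumed weights; only their ordering is taken from the question.
WEIGHTS = {"play": 1.0, "download": 2.0, "like": 3.0, "favorite": 4.0}
HALF_LIFE_SECONDS = 24 * 3600  # assumed: a score halves after one quiet day


def updated_score(score, last_update, action, now):
    """Decay the stored score to `now`, then add the weight of the new action.

    `last_update` and `now` are timestamps in seconds.
    """
    decay = 0.5 ** ((now - last_update) / HALF_LIFE_SECONDS)
    return score * decay + WEIGHTS[action]


# An audio with score 4.0 that receives one play exactly one half-life later.
print(updated_score(4.0, last_update=0, action="play", now=HALF_LIFE_SECONDS))
```

Only the score and the time of the last update need to be stored per audio. When comparing items, each stored score should first be decayed to the same moment, otherwise recently updated items look unfairly fresh.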

Finding number of elements in one vector that are less than an element in another vector

Deadly submitted on 2019-11-28 13:53:39
Say we have a couple of vectors:
    a <- c(1, 2, 2, 4, 7)
    b <- c(1, 2, 3, 5, 7)
For each element b[i] in b I want to find the number of elements in a that are less than b[i], or, equivalently, I want to know the rank of b[i] in c(b[i], a). There are a couple of naive ways I can think of, e.g. doing either of the following length(b) times:
    min_rank(c(b[i], a))
    sum(a < b[i])
What's the best way to do this if length(a) = length(b) = N where N is large? EDIT: To clarify, I'm wondering if there's a more computationally efficient way to do this, i.e. if I can do better than quadratic time in this case.
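A standard way to beat quadratic time is to sort `a` once and then binary-search it for each element of `b`, which costs O(N log N) overall. The sketch below shows the idea in Python rather than R; the function name `count_less` is hypothetical.

```python
from bisect import bisect_left


def count_less(a, b):
    """For each x in b, count the elements of a that are strictly less than x."""
    sorted_a = sorted(a)  # O(N log N), done once
    # bisect_left returns the index of the first element >= x,
    # which equals the number of elements strictly below x.
    return [bisect_left(sorted_a, x) for x in b]


a = [1, 2, 2, 4, 7]
b = [1, 2, 3, 5, 7]
print(count_less(a, b))
```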