ranking | 易学教程

How do I preserve continuous (1,2,3,…n) ranking notation when ranking in R?

If I want to rank a set of numbers using the minimum rank for shared cases (aka ties):

    dat <- c(13,13,14,15,15,15,15,15,15,16,17,22,45,46,112)
    rank(dat, ties = 'min')

I get the results:

    1 1 3 4 4 4 4 4 4 10 11 12 13 14 15

However, I want the rank to be a continuous series consisting of 1, 2, 3, ..., n, where n is the number of unique ranks. Is there a way to make rank (or a similar function) rank a series of numbers by assigning ties to the lowest rank as above, but instead of skipping subsequent rank values by the number of previous ties, continue ranking from the previous rank? For
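What the question describes is usually called dense ranking. A minimal sketch of the idea in Python, where the function name `dense_rank` is made up for illustration and is not an R or library function:

```python
def dense_rank(values):
    """Rank values so ties share the lowest rank and no rank number is skipped."""
    # Map each distinct value to its 1-based position among the sorted distinct values.
    rank_of = {v: i for i, v in enumerate(sorted(set(values)), start=1)}
    return [rank_of[v] for v in values]


dat = [13, 13, 14, 15, 15, 15, 15, 15, 15, 16, 17, 22, 45, 46, 112]
print(dense_rank(dat))
# [1, 1, 2, 3, 3, 3, 3, 3, 3, 4, 5, 6, 7, 8, 9]
```

The highest rank equals the number of distinct values, which is the "n" the question asks for.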

MySQL Rank with ties

I am new to SQL and have never used variables or conditions in MySQL, but I know them from other programming languages. For a few days I have been trying to find a way to rank a user score. I read a lot of articles, as well as questions asked on Stack Overflow, and finally found a solution that nearly does what I want:

    SELECT score_users.uid, score_users.score,
           @prev := @curr, @curr := score,
           @rank := IF(@prev = @curr, @rank, @rank +1) AS rank
    FROM score_users, (SELECT @curr := null, @prev := null, @rank := 0) tmp_tbl
    WHERE score_users.matchday = 1
    ORDER BY score_users.score DESC

But my Problems
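To make the variable logic in that query easier to follow, here is a rough Python sketch of what `@prev`, `@curr` and `@rank` appear to compute: rows sorted by score descending, with the rank increasing only when the score changes. The sample rows and the function name `rank_scores` are invented for illustration.

```python
def rank_scores(rows):
    """Rank rows by score, highest first; equal scores share a rank, none skipped."""
    ranked = []
    prev = curr = None  # play the role of @prev and @curr
    rank = 0            # plays the role of @rank
    for row in sorted(rows, key=lambda r: r["score"], reverse=True):
        prev, curr = curr, row["score"]
        if prev != curr:  # score changed (or first row): move to the next rank
            rank += 1
        ranked.append({"uid": row["uid"], "score": row["score"], "rank": rank})
    return ranked


# Hypothetical sample data, not taken from the question.
rows = [
    {"uid": 1, "score": 10},
    {"uid": 2, "score": 7},
    {"uid": 3, "score": 10},
    {"uid": 4, "score": 3},
]
for r in rank_scores(rows):
    print(r)
```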

mysql: group by ID, get highest priority per each ID

I have the following MySQL table called "pics", with the following fields and sample data:

    id  vehicle_id  filename  priority
    1   45          a.jpg     4
    2   45          b.jpg     1
    3   56          f.jpg     4
    4   67          cc.jpg    4
    5   45          kt.jpg    3
    6   67          gg.jpg    1

Is it possible, in a single query, to get one row for each vehicle_id, where that row is the one with the highest priority? The result I'm looking for:

    array (
      [0] => array( [id] => '2', [vehicle_id] => '45', [filename] => 'b.jpg', [priority] => '1' ),
      [1] => array( [id] => '3', [vehicle_id] => '56', [filename] => 'f.jpg', [priority] => '4' ),
      [2] => array( [id] => '6', [vehicle_id] => '67', [filename] => 'gg
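Judging by the expected result, "highest priority" means the lowest priority number. The selection itself can be sketched in Python as one pass that keeps the best row per vehicle_id; this only illustrates the logic and is not the SQL query the question asks for. The function name is hypothetical.

```python
def top_priority_per_vehicle(rows):
    """Keep, for each vehicle_id, the row with the smallest priority number."""
    best = {}
    for row in rows:
        current = best.get(row["vehicle_id"])
        if current is None or row["priority"] < current["priority"]:
            best[row["vehicle_id"]] = row
    # Return rows ordered by vehicle_id for a stable result.
    return [best[vid] for vid in sorted(best)]


pics = [
    {"id": 1, "vehicle_id": 45, "filename": "a.jpg", "priority": 4},
    {"id": 2, "vehicle_id": 45, "filename": "b.jpg", "priority": 1},
    {"id": 3, "vehicle_id": 56, "filename": "f.jpg", "priority": 4},
    {"id": 4, "vehicle_id": 67, "filename": "cc.jpg", "priority": 4},
    {"id": 5, "vehicle_id": 45, "filename": "kt.jpg", "priority": 3},
    {"id": 6, "vehicle_id": 67, "filename": "gg.jpg", "priority": 1},
]
for row in top_priority_per_vehicle(pics):
    print(row)
```

With the sample data this selects the rows with ids 2, 3 and 6.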

Simplest way to plot changes in ranking between two ordered lists in R?

I'm wondering if there is an easy way to plot the changes in position of elements between 2 lists in the form of a directed bipartite graph in R. For example, list 1 and 2 are vectors of character strings, not necessarily containing the same elements: list.1 <- c("a","b","c","d","e","f","g") list.2 <- c("b","x","e","c","z","d","a") I would like to generate something similar to: I've had a slight bash at using the igraph package, but couldn't easily construct what I would like, which I imagine and hope shouldn't be too hard. Cheers. Here is a simple function to do what you want. Essentially it
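Before any plotting, the edges of such a bipartite graph can be derived by pairing each element's position in the first list with its position in the second. A small sketch in Python, leaving the drawing step out; the function name `position_changes` is made up here and this is not the function from the answer above.

```python
def position_changes(first, second):
    """Return (item, position in first, position in second or None) for each item in first."""
    pos_in_second = {item: i for i, item in enumerate(second, start=1)}
    return [(item, i, pos_in_second.get(item)) for i, item in enumerate(first, start=1)]


list_1 = ["a", "b", "c", "d", "e", "f", "g"]
list_2 = ["b", "x", "e", "c", "z", "d", "a"]
for item, old, new in position_changes(list_1, list_2):
    print(item, old, "->", new)
```

Items that appear only in the first list get `None` as their new position, so they can be drawn without an outgoing edge.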

How to remove duplicate search result in elasticsearch?

First, create some example data (e1, e2, e3 are types and test is the index name):

    PUT test/e1/1
    { "id": 1, "subject": "subject 1" }
    PUT test/e2/1
    { "id": 1, "subject": "subject 2" }
    PUT test/e3/2
    { "id": 2, "subject": "subject 3" }

Now my question is: how can I get just these two documents, removing duplicates with the same id from the curl -XGET _search result?

    test/e1/1
    { "id": 1, "subject": "subject 1" }
    test/e3/2
    { "id": 2, "subject": "subject 3" }

Francois Combet: First you will need to search across the multiple types. Then, on the result, remove the duplicate IDs. POST http://myElastic.com/test/e1,e2,e3/
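The "remove the duplicate ID on the result" step can be done on the client side. A minimal sketch in Python, assuming the hits have already been fetched and each one carries the document under `_source`; the sample hits below are hand-written to mirror the example data, not real search output.

```python
def dedupe_hits(hits):
    """Keep only the first hit seen for each value of _source.id."""
    seen = set()
    unique = []
    for hit in hits:
        doc_id = hit["_source"]["id"]
        if doc_id not in seen:
            seen.add(doc_id)
            unique.append(hit)
    return unique


# Hand-written stand-in for the hits of a search over types e1, e2 and e3.
hits = [
    {"_type": "e1", "_source": {"id": 1, "subject": "subject 1"}},
    {"_type": "e2", "_source": {"id": 1, "subject": "subject 2"}},
    {"_type": "e3", "_source": {"id": 2, "subject": "subject 3"}},
]
for hit in dedupe_hits(hits):
    print(hit["_type"], hit["_source"])
```

Which duplicate survives depends on the order of the hits, so the search would need a sort that puts the preferred document first.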

How to balance number of ratings versus the ratings themselves?

Question: For a school project, we'll have to implement a ranking system. However, we figured that a dumb rank average would suck: something that one user ranked 5 stars would have a better average than something 188 users ranked 4 stars, and that's just stupid. So I'm wondering if any of you have an example algorithm of "smart" ranking. It only needs to take into account the rankings given and the number of rankings. Thanks! Answer 1: You can use a method inspired by Bayesian probability. The gist of the
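The answer is cut off here, so the following is only one common way to apply the Bayesian idea, not necessarily the one the answerer had in mind: pull each item's average toward a prior mean, with the pull weakening as the number of ratings grows. The prior mean of 3 and prior weight of 5 are arbitrary assumptions.

```python
def bayesian_average(ratings, prior_mean=3.0, prior_weight=5.0):
    """Average of the ratings, shrunk toward prior_mean as if prior_weight extra votes existed."""
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + len(ratings))


one_five_star = bayesian_average([5])
many_four_star = bayesian_average([4] * 188)
print(round(one_five_star, 3), round(many_four_star, 3))
# 3.333 3.974
```

With these assumed parameters, the item rated 4 stars by 188 users outranks the item rated 5 stars by a single user.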

Percentage rank of matches using Levenshtein Distance matching

I am trying to match a single search term against a dictionary of possible matches using a Levenshtein distance algorithm. The algorithm returns a distance expressed as the number of operations required to convert the search string into the matched string. I want to present the results as a ranked percentage list of the top "N" (say 10) matches. Since the search string can be longer or shorter than the individual dictionary strings, what would be an appropriate logic for expressing the distance as a percentage, which would qualitatively reflect how close "as a percentage" each result is to the query
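One plausible normalization, offered as an assumption rather than the accepted answer, is to divide the distance by the length of the longer string, since the distance can never exceed that length. A sketch in Python with a plain dynamic-programming Levenshtein implementation; all function names are illustrative.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity_percent(query, candidate):
    """100 for identical strings, 0 when every position must change."""
    longest = max(len(query), len(candidate))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(query, candidate) / longest)


def top_matches(query, dictionary, n=10):
    """Return the n best (word, percent) pairs, highest percentage first."""
    scored = [(word, similarity_percent(query, word)) for word in dictionary]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


print(top_matches("kitten", ["sitting", "kitten", "mitten", "dog"], n=3))
```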

Ranking with millions of entries

I'm working on a server for an online game which should be able to handle millions of players. Now the game needs leaderboards and wants to be able to show a player's current position and possibly other players near the current player's position, as well as the positions of the player's friends. Now I've done this stuff before in MySQL and I know how it's technically possible; however, I figured that since this is a common practice for a lot of online games, there must be existing libraries or databases particularly for this purpose? Can anyone advise me what database is the best for these types of
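Whatever store ends up being used, the core lookup is "how many scores are higher than mine". A toy in-memory sketch in Python using a sorted list and binary search; the `Leaderboard` class is hypothetical, and a plain list makes each update linear in the number of players, so this shows the idea rather than something sized for millions of entries.

```python
import bisect


class Leaderboard:
    """Toy leaderboard: scores kept sorted ascending, rank found by binary search."""

    def __init__(self):
        self._scores = []     # all current scores, ascending
        self._by_player = {}  # player -> current score

    def set_score(self, player, score):
        old = self._by_player.get(player)
        if old is not None:
            # Remove one occurrence of the player's previous score.
            del self._scores[bisect.bisect_left(self._scores, old)]
        bisect.insort(self._scores, score)
        self._by_player[player] = score

    def rank(self, player):
        """1-based position; players with equal scores share a position."""
        score = self._by_player[player]
        higher = len(self._scores) - bisect.bisect_right(self._scores, score)
        return higher + 1


board = Leaderboard()
for name, score in [("ann", 100), ("bob", 200), ("cat", 200), ("dan", 50)]:
    board.set_score(name, score)
print([(name, board.rank(name)) for name in ["ann", "bob", "cat", "dan"]])
```

Friends' positions then come from calling `rank` once per friend.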

Hot content algorithm / score with time decay

I have been reading and researching algorithms and formulas to work out a score for my user-submitted content, so that currently hot / trending items display higher up the list; however, I'll admit I'm a little over my head here. I'll give some background on what I'm after... Users upload audio to my site, and audios have several actions: Played, Downloaded, Liked, Favorited. Ideally I want an algorithm where I can update an audio's score each time a new activity is logged (played, downloaded, etc.). Also, a download action is worth more than a play, a like more than a download, and a favourite more than a like.
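One way to get an incrementally updatable score is exponential decay: on every new action, shrink the stored score according to the time since the last update, then add the action's weight. A sketch in Python; the weights and the 24-hour half-life are made-up values that only respect the ordering play < download < like < favorite.

```python
# Hypothetical weights and half-life, chosen only for illustration.
WEIGHTS = {"play": 1.0, "download": 2.0, "like": 3.0, "favorite": 4.0}
HALF_LIFE_SECONDS = 24 * 3600


def decayed(score, elapsed_seconds):
    """Score after elapsed_seconds of decay: halves every HALF_LIFE_SECONDS."""
    return score * 0.5 ** (elapsed_seconds / HALF_LIFE_SECONDS)


def update_score(score, last_update, action, now):
    """Decay the stored score to 'now', add the action weight, return (score, now)."""
    return decayed(score, now - last_update) + WEIGHTS[action], now


score, stamp = update_score(0.0, 0, "play", 0)
score, stamp = update_score(score, stamp, "like", HALF_LIFE_SECONDS)
print(score)
# 3.5
```

When sorting items for display, each stored score would first be decayed to the same current time, since items were last updated at different moments.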

Finding number of elements in one vector that are less than an element in another vector

Say we have a couple of vectors:

    a <- c(1, 2, 2, 4, 7)
    b <- c(1, 2, 3, 5, 7)

For each element b[i] in b I want to find the number of elements in a that are less than b[i], or, equivalently, I want to know the rank of b[i] in c(b[i], a). There are a couple of naive ways I can think of, e.g. doing either of the following length(b) times:

    min_rank(c(b[i], a))
    sum(a < b[i])

What's the best way to do this if length(a) = length(b) = N where N is large? EDIT: To clarify, I'm wondering if there's a more computationally efficient way to do this, i.e. if I can do better than quadratic time in this case.
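A standard way to beat quadratic time is to sort a once and then binary-search each element of b, for O(N log N) overall. A sketch of that idea in Python (not R), with `count_less` as an illustrative name:

```python
from bisect import bisect_left


def count_less(a, b):
    """For each x in b, count the elements of a that are strictly less than x."""
    sorted_a = sorted(a)  # O(N log N), done once
    # bisect_left returns the index of the first element >= x,
    # which equals the number of elements < x.
    return [bisect_left(sorted_a, x) for x in b]


a = [1, 2, 2, 4, 7]
b = [1, 2, 3, 5, 7]
print(count_less(a, b))
# [0, 1, 3, 4, 4]
```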