similarity

Calculate similarity between list of words

走远了吗. submitted on 2019-12-06 16:24:08
Question: I want to calculate the similarity between two lists of words. For example, ['email','user','this','email','address','customer'] is similar to this list: ['email','mail','address','netmail']. I want it to get a higher similarity percentage than another list, for example ['address','ip','network'], even though 'address' appears in that list. Answer 1: Since you haven't really shown the exact output you expect, here is my best shot: list_A = ['email','user','this','email','address','customer'] list
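The answer above is cut off, so the following is only a minimal sketch of one way to score the lists from the question, not the answerer's method. It assumes that character-level fuzzy matching (Python's standard difflib.SequenceMatcher) is an acceptable stand-in for "similar words"; the function names word_similarity and list_similarity and the list names list_a, list_b, list_c are hypothetical.

```python
from difflib import SequenceMatcher


def word_similarity(a, b):
    """Character-level similarity ratio between two words, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def list_similarity(words_a, words_b):
    """Average, over the unique words of words_a, of the best match found in words_b."""
    unique_a = set(words_a)
    if not unique_a or not words_b:
        return 0.0
    best = [max(word_similarity(a, b) for b in words_b) for a in unique_a]
    return sum(best) / len(best)


# Lists taken from the question.
list_a = ['email', 'user', 'this', 'email', 'address', 'customer']
list_b = ['email', 'mail', 'address', 'netmail']
list_c = ['address', 'ip', 'network']

print(list_similarity(list_a, list_b))  # expected to be the higher score
print(list_similarity(list_a, list_c))
```

Character overlap does not capture meaning, so this only rewards spelling similarity such as 'email' vs. 'mail'. Synonym-aware matching would need a different word_similarity function.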

Matlab calculate 3D similarity transformation. fitgeotrans for 3D

雨燕双飞 submitted on 2019-12-06 15:27:51
How can I calculate a similarity transformation between 4 points in 3D in MATLAB? I can calculate the transform matrix from T*X = Xp , but that gives me an affine matrix because of small errors in the point coordinates. How can I fit that matrix to a similarity one? I need something like fitgeotrans , but in 3D. Thanks. The answer by @rayryeng is correct, given that you have a set of up to 3 points in a 3-dimensional space. If you need to transform m points in n-dimensional space ( m>n ), then you first need to add m-n coordinates to these m points such that they exist in m-dimensional space (i.e. the a matrix

Best way to rank sentences based on similarity from a set of Documents

此生再无相见时 submitted on 2019-12-06 14:56:17
Question: I want to know the best way to rank sentences based on similarity from a set of documents. For example, let's say: 1. There are 5 documents. 2. Each document contains many sentences. 3. Let's take Document 1 as primary, i.e. the output will contain sentences from this document. 4. The output should be a list of sentences ranked so that the sentence with the FIRST rank is the most similar sentence across all 5 documents, then the 2nd, then the 3rd... Thanks in advance. Answer 1: I'll cover the basics of textual document
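The answer is truncated here, so what follows is only a rough sketch of one common baseline, not the answerer's approach. It assumes each sentence is represented as a bag-of-words count vector and scores a primary sentence by its best cosine match in each other document, averaged over those documents. The function names (tokenize, cosine, rank_sentences) and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(c1, c2):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(c1[t] * c2[t] for t in c1.keys() & c2.keys())
    norm1 = math.sqrt(sum(v * v for v in c1.values()))
    norm2 = math.sqrt(sum(v * v for v in c2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


def rank_sentences(primary, others):
    """Rank sentences of the primary document by similarity to the other documents.

    primary: list of sentences; others: list of documents, each a list of sentences.
    Returns (score, sentence) pairs, highest score first.
    """
    other_docs = [[Counter(tokenize(s)) for s in doc] for doc in others]
    scored = []
    for sentence in primary:
        vec = Counter(tokenize(sentence))
        # Best match inside each other document, then averaged across documents.
        best_per_doc = [max((cosine(vec, o) for o in doc), default=0.0)
                        for doc in other_docs]
        score = sum(best_per_doc) / len(best_per_doc) if best_per_doc else 0.0
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up sample data for illustration only.
primary = ["The cat sat on the mat.", "Stock prices fell sharply."]
others = [["A cat sat on a mat.", "The dog slept."], ["The cat is on the mat."]]
for score, sentence in rank_sentences(primary, others):
    print(round(score, 3), sentence)
```

Raw counts give common words like "the" a lot of weight. TF-IDF weighting or stop-word removal would be a natural refinement.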

Detecting similarity in two video files

丶灬走出姿态 submitted on 2019-12-06 14:38:35
I am working on detecting similarity between 2 videos in Java. The user will supply two videos, and the software has to detect whether they are similar by checking the file content. I read that it is possible to compare each frame of the 2 videos. Can anyone please share any suitable algorithms (or code or methods) that can be implemented in Java? Answer (thkala): There is a huge variety of algorithms for determining similarity in images. A search for image similarity algorithm and video similarity algorithm in Google Scholar will produce a large number of related papers - there are also a few questions (e

Appropriate similarity metrics for multiple sets of 2D coordinates

删除回忆录丶 submitted on 2019-12-06 13:22:25
I have a collection of 2D coordinate sets (on the order of 100K-500K points in each set) and I am looking for the most efficient way to measure the similarity of one set to another. I know of the usual ones: Cosine, Jaccard/Tanimoto, etc. However, I am hoping for suggestions on fast, efficient measures of similarity, especially ones that can be used to cluster by similarity. Edit 1: The image shows what I need to do. I need to cluster all the reds, blues and greens by their shape, orientation, etc. (image: http://img402.imageshack.us/img402/8121/curves.png) It seems that the first step of any

MySQL Query to find most similar numerical row

好久不见. submitted on 2019-12-06 12:40:10
Question: In a MySQL database, I am attempting to find the most similar row across a number of numerical attributes. This problem is similar to this question but includes a flexible number of comparisons and a join table.
Database
The database consists of two tables. The first table, users, is what I'm trying to compare.
id | self_ranking
----------------------------------
1  | 9
2  | 3
3  | 2
The second table is a series of scores which the user gave to particular items.
id | user_id | item_id | score --

How does Stack Overflow display similar questions when you type in a new question?

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-06 09:54:58
This is one of the things that Stack Overflow and the rest of the sites that run on this platform do very well. As soon as you try to create a new question, a small window appears listing other similar questions. How is this done? What technology can be used to achieve this? Lucene, Sphinx, ...? Answer (kprobst): Stack Overflow (and Stack Exchange in general) uses Lucene.net for full-text search. Might want to read this as well. Source: https://stackoverflow.com/questions/5208130/how-does-stack-overflow-display-similar-questions-when-you-type-in-a-new-questi

How I can write SPARQL query that uses similarity measures in Java Code

北城余情 submitted on 2019-12-06 08:48:05
I would like to know a simple method to write this SPARQL query in Java code:
select ?input ?string (strlen(?match)/strlen(?string) as ?percent)
where {
  values ?string { "London" "Londn" "London Fog" "Lando" "Land Ho!" "concatenate" "catnap" "hat" "cat" "chat" "chart" "port" "part" }
  values (?input ?pattern ?replacement) {
    ("cat" "^x[^cat]*([c]?)[^at]*([a]?)[^t]*([t]?).*$" "$1$2$3")
    ("Londn" "^x[^Londn]*([L]?)[^ondn]*([o]?)[^ndn]*([n]?)[^dn]*([d]?)[^n]*([n]?).*$" "$1$2$3$4$5")
  }
  bind( replace( concat('x',?string), ?pattern, ?replacement) as ?match )
}
order by ?pattern desc(?percent)
This code

compare arrays of two different lengths

北战南征 submitted on 2019-12-06 06:46:14
I am developing a program on Android that will compare the similarity of Gestures using Gesture Points. I have two arrays like this: gest_1 = [120,333,453,564,234,531] gest_2 = [222,432,11,234,223,344,534,523,432,234] I know there is no way to dynamically resize either one of the arrays, so is there any way for me to compare both these gestures using these arrays and return the similarity? Note that the data in the arrays are just randomly typed out. You could try something like this: List similarities = new ArrayList(); for(int i = 0; i < Math.max(gest_1.length, gest_2.length); i++){ if (gest
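The Java answer above is cut off, so the sketch below is not a completion of it. It shows a different, commonly used technique for comparing sequences of unequal length, dynamic time warping (DTW), written in Python for brevity. The function name dtw_distance is hypothetical, and treating each gesture as a one-dimensional sequence of numbers is an assumption based on the arrays in the question.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Lower values mean more similar sequences; 0 means one sequence can be
    stretched onto the other with no pointwise difference.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Arrays taken from the question.
gest_1 = [120, 333, 453, 564, 234, 531]
gest_2 = [222, 432, 11, 234, 223, 344, 534, 523, 432, 234]
print(dtw_distance(gest_1, gest_2))
```

DTW returns a distance rather than a percentage. Turning it into a similarity score requires a normalization choice (for example, dividing by the path length), which is left open here.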

Similarity function for Mahout boolean user-based recommender

孤人 submitted on 2019-12-06 06:43:30
I am using Mahout to build a user-based recommendation system which operates on boolean data. I use GenericBooleanPrefUserBasedRecommender and NearestNUserNeighborhood , and am now trying to decide on the most suitable user similarity function. It was suggested to use either LogLikelihoodSimilarity or TanimotoCoefficientSimilarity . I tried both and am getting [subjectively evaluated] meaningful results in both cases. However, the RMSE for the same data set is better with LogLikelihoodSimilarity . The number of "no recommendation" results is similar in both cases. Can anyone recommend which of these similarity
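For readers unfamiliar with the Tanimoto coefficient named above, here is a minimal illustration of the formula on boolean preference data: the size of the intersection divided by the size of the union of two users' item sets. This is a plain-Python sketch of the math only, not Mahout's implementation; the function name tanimoto and the sample item IDs are hypothetical.

```python
def tanimoto(items_a, items_b):
    """Tanimoto (Jaccard) coefficient between two sets of item IDs.

    Returns |A ∩ B| / |A ∪ B|, or 0.0 when both sets are empty.
    """
    set_a, set_b = set(items_a), set(items_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Made-up boolean preferences: the items each user has interacted with.
user_1 = {101, 102, 103}
user_2 = {102, 103, 104}
print(tanimoto(user_1, user_2))  # 2 shared items out of 4 distinct items -> 0.5
```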