similarity | 易学教程

Getting most similar rows in MySQL table and order them by similarity

Question: I have a database table that holds users' vehicles (cars, motorcycles). I want to get the most similar vehicles out of that table. Let's say the table holds the following columns (with some context to get the idea): table vehicles: vehicle_id (pk, auto-increment), model_id (BMW 3er, Honda Accord), fuel_type (gasoline, diesel), body_style (sedan, coupe), year, engine_size (2.0L), engine_power (150hp). So in short I want to select N (usually 3) rows that have the same make_id (at least) and rank them
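The excerpt stops before any ranking rule is given, so the following is only a minimal sketch of one possible approach: compute a weighted score in SQL and sort by it. It assumes SQLite (via Python's `sqlite3`) rather than MySQL, filters on `model_id` because that is the column the excerpt lists, and uses weights that are purely hypothetical.

```python
import sqlite3

# Hypothetical scoring: matching categorical columns add points,
# numeric differences subtract points. The weights are illustrative only.
RANK_SQL = """
SELECT v.vehicle_id,
       (v.fuel_type  = t.fuel_type)  * 3
     + (v.body_style = t.body_style) * 2
     - ABS(v.year - t.year) * 0.5
     - ABS(v.engine_power - t.engine_power) * 0.01 AS score
FROM vehicles AS v
JOIN vehicles AS t ON t.vehicle_id = :target
WHERE v.model_id = t.model_id          -- mandatory match
  AND v.vehicle_id <> t.vehicle_id     -- skip the vehicle itself
ORDER BY score DESC
LIMIT :n
"""


def most_similar(conn, target_id, n=3):
    """Return the ids of the n vehicles most similar to target_id."""
    rows = conn.execute(RANK_SQL, {"target": target_id, "n": n}).fetchall()
    return [row[0] for row in rows]


# Made-up sample data for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE vehicles (
           vehicle_id INTEGER PRIMARY KEY, model_id INTEGER, fuel_type TEXT,
           body_style TEXT, year INTEGER, engine_size REAL, engine_power INTEGER)"""
)
conn.executemany(
    "INSERT INTO vehicles VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "gasoline", "sedan", 2015, 2.0, 150),
        (2, 10, "gasoline", "sedan", 2016, 2.0, 150),
        (3, 10, "diesel", "sedan", 2015, 2.0, 140),
        (4, 10, "diesel", "coupe", 2010, 3.0, 250),
        (5, 20, "gasoline", "sedan", 2015, 2.0, 150),
        (6, 10, "gasoline", "coupe", 2015, 2.0, 150),
    ],
)

print(most_similar(conn, 1))
```

The comparison expressions evaluate to 0 or 1, which also holds in MySQL, so the same scoring idea should carry over with MySQL's parameter syntax.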

Detecting similarity in two video files

Question: I am working on detecting similarity between two videos in Java. The user will supply two videos, and the software has to detect whether they are similar by checking the file content. I read that it is possible to compare each frame of the two videos. Can anyone please share any suitable algorithms (or code or methods) that can be implemented in Java? Answer 1: There is a huge variety of algorithms for determining similarity in images. A search for image similarity algorithm and video similarity
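The answer is cut off before it names a specific algorithm. As one illustration of the frame-by-frame idea mentioned in the question, the sketch below compares grey-level histograms of frames at the same position. It is written in Python rather than Java, and it assumes frames have already been decoded into rows of 0-255 grayscale values; decoding, frame alignment, and the bin count are all left as assumptions.

```python
def histogram(frame, bins=8):
    """Normalised grey-level histogram of a frame (rows of 0-255 ints)."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for pixel in row:
            counts[pixel * bins // 256] += 1
            total += 1
    return [c / total for c in counts]


def frame_similarity(frame_a, frame_b, bins=8):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    hist_a = histogram(frame_a, bins)
    hist_b = histogram(frame_b, bins)
    return sum(min(x, y) for x, y in zip(hist_a, hist_b))


def video_similarity(frames_a, frames_b, bins=8):
    """Average similarity over frame pairs taken at the same position."""
    pairs = list(zip(frames_a, frames_b))
    if not pairs:
        return 0.0
    return sum(frame_similarity(a, b, bins) for a, b in pairs) / len(pairs)


# Tiny made-up 2x2 frames for demonstration.
dark = [[10, 20], [30, 5]]
bright = [[250, 240], [255, 245]]
print(video_similarity([dark, dark], [dark, bright]))
```

Histograms ignore where pixels are located, so this measure tolerates small shifts but cannot tell apart frames that merely share a brightness distribution.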

Matlab calculate 3D similarity transformation. fitgeotrans for 3D

Question: How can I calculate in MATLAB a similarity transformation between 4 points in 3D? I can calculate the transform matrix from T*X = Xp, but it will give me an affine matrix due to small errors in the point coordinates. How can I fit that matrix to a similarity one? I need something like fitgeotrans, but in 3D. Thanks. Answer 1: The answer by @rayryeng is correct, given that you have a set of up to 3 points in a 3-dimensional space. If you need to transform m points in n-dimensional space (m>n), then you first

compare arrays of two different lengths

Question: I am developing a program on Android that will compare the similarity of gestures using gesture points. I have two arrays like this: gest_1 = [120,333,453,564,234,531] gest_2 = [222,432,11,234,223,344,534,523,432,234] I know there is no way to dynamically resize either one of the arrays, so is there any way for me to compare both these gestures using these arrays and return the similarity? Note that the data in the arrays is just randomly typed out. Answer 1: You could try something like this:
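The answer's own code is cut off in this excerpt. As a separate illustration, and not the answer's method, dynamic time warping (DTW) is one common way to compare sequences of different lengths without resizing them. The sketch below is in Python rather than Android Java and uses the absolute difference as the per-point cost, which is an assumption.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Smaller values mean more similar; 0 means one sequence can be
    stretched to match the other exactly.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also aligned with an earlier point of b
                cost[i][j - 1],      # b[j-1] also aligned with an earlier point of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


gest_1 = [120, 333, 453, 564, 234, 531]
gest_2 = [222, 432, 11, 234, 223, 344, 534, 523, 432, 234]
print(dtw_distance(gest_1, gest_2))
```

The result is a distance, not a bounded similarity score; turning it into a percentage would need a normalisation that the excerpt does not specify.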

Similarity between two data sets or arrays

Question: Let's say I have a dataset that looks like this: {A:1, B:3, C:6, D:6} I also have a list of other sets to compare my specific set against: {A:1, B:3, C:6, D:6}, {A:2, B:3, C:6, D:6}, {A:99, B:3, C:6, D:6}, {A:5, B:1, C:6, D:9}, {A:4, B:2, C:2, D:6} My entries could be visualized as a table (with four columns: A, B, C, and D). How can I find the set with the most similarity? For this example, row 1 is a perfect match and row 2 is a close second, while row 3 is quite far away. I am thinking of
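The excerpt ends before the asker names a measure. One measure consistent with the stated expectations (row 1 perfect, row 2 close, row 3 far away) is plain Euclidean distance, sketched below using the values from the question. The choice of Euclidean distance is an assumption, not something the source confirms.

```python
import math


def distance(a, b):
    """Euclidean distance between two records that share the same keys."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))


target = {"A": 1, "B": 3, "C": 6, "D": 6}
candidates = [
    {"A": 1, "B": 3, "C": 6, "D": 6},
    {"A": 2, "B": 3, "C": 6, "D": 6},
    {"A": 99, "B": 3, "C": 6, "D": 6},
    {"A": 5, "B": 1, "C": 6, "D": 9},
    {"A": 4, "B": 2, "C": 2, "D": 6},
]

# Zero-based positions of the candidates, closest first.
ranked = sorted(range(len(candidates)), key=lambda i: distance(target, candidates[i]))
print(ranked)
```

If the columns have very different scales, each one would usually be normalised first so that a single column (such as A in row 3) does not dominate the result.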

Algorithm for finding similar images using an index

Question: There are some surprisingly good image comparison tools that find similar images even if they are not exactly the same (e.g. changes in size, wallpaper, brightness/contrast). I have some example applications here: Unique Filer 1.4 (shareware): https://web.archive.org/web/20010309014927/http://uniquefiler.com/ Fast Duplicate File Finder (Freeware): http://www.mindgems.com/products/Fast-Duplicate-File-Finder/Fast-Duplicate-File-Finder-About.htm Visual similarity duplicate image finder (payware): http:/
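How the listed tools work internally is not stated in the excerpt. A widely used technique for this kind of tolerant matching is a perceptual "average hash" compared by Hamming distance; the sketch below shows the idea under the assumption that each image has already been downscaled to a small grayscale grid. A linear scan over a dictionary stands in for a real index here.

```python
def average_hash(pixels):
    """Hash a small grayscale grid: one bit per pixel, set if above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(hash_a, hash_b):
    """Number of differing bits between two hashes."""
    return bin(hash_a ^ hash_b).count("1")


def find_similar(index, query_hash, max_distance=5):
    """index maps image name -> hash; returns names within max_distance, closest first."""
    hits = [(hamming(h, query_hash), name) for name, h in index.items()]
    return [name for dist, name in sorted(hits) if dist <= max_distance]


# Made-up 4x4 "images": a gradient, a brighter copy, and an inverted copy.
gradient = [[(r * 4 + c) * 16 for c in range(4)] for r in range(4)]
brighter = [[min(255, p + 10) for p in row] for row in gradient]
inverted = [[255 - p for p in row] for row in gradient]

index = {"brighter": average_hash(brighter), "inverted": average_hash(inverted)}
print(find_similar(index, average_hash(gradient)))
```

Because every pixel is compared with the image's own mean, a uniform brightness shift leaves the hash unchanged, which is the property the question describes.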

How do I group similar strings in R? [closed]

Question: I have a database with ~5,000 locality names, most of which are repetitions with typos, permutations, abbreviations, etc. I would like to group them by similarity, to speed up further processing. The best would be to convert each variation into a "platonic form", and put two columns side by side, with the
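As a rough illustration of grouping by similarity, the sketch below uses Python's `difflib.SequenceMatcher` rather than R, and a simple greedy pass: each name joins the first group whose representative (its first member) is similar enough, otherwise it starts a new group. The threshold of 0.8 and the sample names are assumptions for demonstration.

```python
from difflib import SequenceMatcher


def group_similar(names, threshold=0.8):
    """Greedily group names whose similarity to a group's first member
    reaches the threshold (case-insensitive comparison)."""
    groups = []
    for name in names:
        for group in groups:
            representative = group[0]
            ratio = SequenceMatcher(None, representative.lower(), name.lower()).ratio()
            if ratio >= threshold:
                group.append(name)
                break
        else:
            # No existing group was close enough: start a new one.
            groups.append([name])
    return groups


localities = ["New York", "Boston", "New Yrok", "Bostn", "Nwe York"]
for group in group_similar(localities):
    print(group[0], "<-", group[1:])
```

The first member of each group plays the role of the "platonic form" here; the result depends on input order, and comparing every name with every group is quadratic in the worst case, which is still manageable for about 5,000 names.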

Is there an alternative to `difflib.get_close_matches()` that returns indexes (list positions) instead of a str list?

Question: I want to use something like difflib.get_close_matches, but instead of the most similar strings I would like to obtain their indexes (i.e. positions in the list). Indexes are more flexible because one can relate an index to other data structures (related to the matched string). For example, instead of:
    >>> words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']
    >>> difflib.get_close_matches('Hello', words)
    ['hello', 'hallo', 'Hallo']
I would
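A minimal sketch of such a function can be built on `difflib.SequenceMatcher`, using the same three ratio checks that `get_close_matches` uses. The function name `get_close_match_indexes` is hypothetical, and ties are broken by list position here, which may order equal-scoring items differently from `get_close_matches` itself.

```python
from difflib import SequenceMatcher


def get_close_match_indexes(word, possibilities, n=3, cutoff=0.6):
    """Return up to n positions in possibilities whose entries best match word."""
    if n <= 0:
        raise ValueError("n must be > 0: %r" % (n,))
    if not 0.0 <= cutoff <= 1.0:
        raise ValueError("cutoff must be in [0.0, 1.0]: %r" % (cutoff,))

    matcher = SequenceMatcher()
    matcher.set_seq2(word)
    scored = []
    for index, candidate in enumerate(possibilities):
        matcher.set_seq1(candidate)
        # Cheap upper bounds first, the full ratio only when they pass.
        if matcher.real_quick_ratio() >= cutoff and matcher.quick_ratio() >= cutoff:
            score = matcher.ratio()
            if score >= cutoff:
                scored.append((score, index))

    # Best score first; equal scores keep their original list order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:n]]


words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']
matches = get_close_match_indexes('Hello', words)
print(matches, [words[i] for i in matches])
```

The returned positions can then be used to look up related data in parallel lists or other structures.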

Systematic threshold for cosine similarity with TF-IDF weights

Question: I am running an analysis of several thousand (e.g., 10,000) text documents. I have computed TF-IDF weights and have a matrix with pairwise cosine similarities. I want to treat the documents as a graph to analyze various properties (e.g., the path length separating groups of documents) and to visualize the connections as a network. The problem is that there are too many similarities. Most are too small to be meaningful. I see many people dealing with this problem by dropping all similarities
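To make the setup concrete, the sketch below computes cosine similarity between sparse weight vectors and keeps only the pairs at or above a cutoff as graph edges. It does not answer how to choose the cutoff systematically; the value 0.6, the toy weights, and the dictionary representation are all assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_edges(docs, cutoff):
    """Return (doc_a, doc_b, similarity) for every pair at or above the cutoff."""
    edges = []
    ids = sorted(docs)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            score = cosine(docs[a], docs[b])
            if score >= cutoff:
                edges.append((a, b, score))
    return edges


# Toy weights standing in for real TF-IDF values.
docs = {
    "d1": {"cat": 1.0, "dog": 1.0},
    "d2": {"cat": 1.0, "dog": 1.0},
    "d3": {"fish": 1.0},
    "d4": {"cat": 1.0, "fish": 1.0},
}
for a, b, score in similarity_edges(docs, cutoff=0.6):
    print(a, b, round(score, 3))
```

The resulting edge list can be handed to a graph library for path-length analysis or visualization; how sparse the graph becomes depends entirely on the cutoff.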

Similar images - how to compare them

Question: I have over 1.3 million images that I have to compare with each other, and a few hundred more are added per day. My company takes an image and creates a version that can be used by our vendors. The files are often very similar to each other; for example, two different companies can send us two different images, a JPG and a GIF, both with the McDonald's logo, with months between the submissions. What happens is that in the end we find ourselves creating the same logo twice when we