similarity

Getting most similar rows in MySQL table and order them by similarity

浪尽此生 submitted on 2019-12-24 00:49:20
Question: I have a database table that holds users' vehicles (cars, motorcycles). I want to get the most similar vehicles out of that table. Let's say the table holds the following columns (with some context to get the idea): table: vehicles; columns: vehicle_id (pk, auto-increment), model_id (BMW 3er, Honda Accord), fuel_type (gasoline, diesel), body_style (sedan, coupe), year, engine_size (2.0L), engine_power (150hp). So in short I want to select N (usually 3) rows that have the same make_id (at least) and rank them
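
One way to picture the ranking the question asks for is a weighted score computed in the `ORDER BY` clause. The sketch below is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for MySQL, filters on `model_id` because that is the column the excerpt lists, and the weights (3.0, 2.0, 0.2, 1.0, 0.02) are hypothetical values chosen for the demo, not anything from the original post.

```python
import sqlite3

def demo_connection():
    """Build an in-memory table with a few made-up vehicles (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vehicles (vehicle_id INTEGER PRIMARY KEY, model_id INTEGER, "
        "fuel_type TEXT, body_style TEXT, year INTEGER, engine_size REAL, engine_power INTEGER)"
    )
    conn.executemany(
        "INSERT INTO vehicles VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 10, "gasoline", "sedan", 2010, 2.0, 150),  # reference vehicle
            (2, 10, "gasoline", "sedan", 2011, 2.0, 150),
            (3, 10, "diesel", "sedan", 2010, 2.0, 143),
            (4, 10, "gasoline", "coupe", 2005, 3.0, 230),
            (5, 20, "gasoline", "sedan", 2010, 2.0, 150),  # different model, filtered out
        ],
    )
    return conn

def most_similar(conn, vehicle_id, n=3):
    """Return (vehicle_id, score) for the n rows most similar to vehicle_id."""
    ref = conn.execute(
        "SELECT model_id, fuel_type, body_style, year, engine_size, engine_power "
        "FROM vehicles WHERE vehicle_id = ?",
        (vehicle_id,),
    ).fetchone()
    model_id, fuel, body, year, size, power = ref
    # Equality tests evaluate to 0 or 1, so they act as bonus points;
    # numeric differences are subtracted as penalties. Weights are hypothetical.
    query = """
        SELECT vehicle_id,
               3.0 * (fuel_type = ?) + 2.0 * (body_style = ?)
               - 0.2 * ABS(year - ?)
               - 1.0 * ABS(engine_size - ?)
               - 0.02 * ABS(engine_power - ?) AS score
        FROM vehicles
        WHERE model_id = ? AND vehicle_id <> ?
        ORDER BY score DESC, vehicle_id
        LIMIT ?
    """
    params = (fuel, body, year, size, power, model_id, vehicle_id, n)
    return conn.execute(query, params).fetchall()
```

The same `ORDER BY` expression should carry over to MySQL, where comparisons also evaluate to 0 or 1, but the weights would need tuning against real data.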

Detecting similarity in two video files

时光毁灭记忆、已成空白 submitted on 2019-12-23 03:29:18
Question: I am working on detecting similarity between two videos in Java. The user will supply two videos, and the software has to detect whether they are similar by checking the file content. I read that it is possible to compare each frame of the two videos. Can anyone please share any suitable algorithms (or code or methods) that can be implemented in Java? Answer 1: There is a huge variety of algorithms for determining similarity in images. A search for image similarity algorithm and video similarity

Matlab calculate 3D similarity transformation. fitgeotrans for 3D

时光毁灭记忆、已成空白 submitted on 2019-12-23 01:19:14
Question: How can I calculate a similarity transformation between 4 points in 3D in MATLAB? I can calculate the transform matrix from T*X = Xp, but it will give me an affine matrix due to small errors in the point coordinates. How can I fit that matrix to a similarity one? I need something like fitgeotrans, but in 3D. Thanks. Answer 1: The answer by @rayryeng is correct, given that you have a set of up to 3 points in a 3-dimensional space. If you need to transform m points in n-dimensional space (m > n), then you first
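
For readers who want a concrete picture of fitting a similarity (scale, rotation, translation) rather than a general affine matrix, here is a minimal sketch of one common least-squares approach, usually attributed to Umeyama. It is written in Python with NumPy rather than MATLAB, the function name `fit_similarity` is hypothetical, and it is not necessarily the method the truncated answer goes on to describe.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares fit of dst ≈ s * R @ src + t.

    src, dst: arrays of shape (n_points, dim) with corresponding rows.
    Returns scale s (float), rotation R (dim x dim), translation t (dim,).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    S = np.eye(src.shape[1])
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[-1, -1] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t
```

Because the rotation is constrained to be orthonormal and the scale is a single number, small coordinate errors cannot leak into shear or anisotropic scaling the way they do when solving T*X = Xp directly.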

compare arrays of two different lengths

倖福魔咒の submitted on 2019-12-22 18:38:12
Question: I am developing a program on Android that will compare the similarity of gestures using gesture points. I have two arrays like this: gest_1 = [120,333,453,564,234,531] gest_2 = [222,432,11,234,223,344,534,523,432,234] I know there is no way to dynamically resize either one of the arrays, so is there any way for me to compare both these gestures using these arrays and return the similarity? Note that the data in the arrays are just randomly typed out. Answer 1: You could try something like this:
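
The answer's code is cut off in this excerpt, so the following is not a reconstruction of it. It is a small sketch of one possible approach, assuming the two arrays sample the same kind of signal: resample both to a common length with linear interpolation, then average the absolute differences. The function names and the default of 32 samples are hypothetical.

```python
def resample(values, n):
    """Linearly interpolate values onto n evenly spaced positions."""
    if n == 1 or len(values) == 1:
        return [float(values[0])] * n
    step = (len(values) - 1) / (n - 1)
    out = []
    for i in range(n):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out

def gesture_distance(a, b, n=32):
    """Mean absolute difference after resampling both arrays to n points.

    0 means the resampled curves coincide; larger means less similar.
    """
    ra, rb = resample(a, n), resample(b, n)
    return sum(abs(x - y) for x, y in zip(ra, rb)) / n

gest_1 = [120, 333, 453, 564, 234, 531]
gest_2 = [222, 432, 11, 234, 223, 344, 534, 523, 432, 234]
print(gesture_distance(gest_1, gest_2))
```

If the gestures can be drawn at different speeds, a technique such as dynamic time warping may be a better fit than uniform resampling.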

Similarity between two data sets or arrays

不打扰是莪最后的温柔 submitted on 2019-12-22 05:23:11
Question: Let's say I have a dataset that looks like this: {A:1, B:3, C:6, D:6} I also have a list of other sets to compare against my specific set: {A:1, B:3, C:6, D:6}, {A:2, B:3, C:6, D:6}, {A:99, B:3, C:6, D:6}, {A:5, B:1, C:6, D:9}, {A:4, B:2, C:2, D:6} My entries could be visualized as a table (with four columns: A, B, C, and D). How can I find the set with the most similarity? For this example, row 1 is a perfect match and row 2 is a close second, while row 3 is quite far away. I am thinking of
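
The excerpt stops before the asker names an approach, so as an illustration only, here is a sketch that ranks the candidate rows by Euclidean distance to the target. Euclidean distance is an assumption on my part; if the columns have very different ranges they would likely need scaling first. The function names are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two rows that share the same keys."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

def rank_by_similarity(target, candidates):
    """Return candidate indexes (0-based) ordered from most to least similar."""
    return sorted(range(len(candidates)), key=lambda i: distance(target, candidates[i]))

target = {"A": 1, "B": 3, "C": 6, "D": 6}
candidates = [
    {"A": 1, "B": 3, "C": 6, "D": 6},
    {"A": 2, "B": 3, "C": 6, "D": 6},
    {"A": 99, "B": 3, "C": 6, "D": 6},
    {"A": 5, "B": 1, "C": 6, "D": 9},
    {"A": 4, "B": 2, "C": 2, "D": 6},
]
print(rank_by_similarity(target, candidates))
```

On the question's data this puts the first row at distance 0 and the second at distance 1, with the `A:99` row last, which matches the ordering the asker describes.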

Algorithm for finding similar images using an index

我的梦境 submitted on 2019-12-22 04:01:54
Question: There are some surprisingly good image comparison tools that find similar images even if they are not exactly the same (e.g. changes in size, wallpaper, brightness/contrast). I have some example applications here:
Unique Filer 1.4 (shareware): https://web.archive.org/web/20010309014927/http://uniquefiler.com/
Fast Duplicate File Finder (freeware): http://www.mindgems.com/products/Fast-Duplicate-File-Finder/Fast-Duplicate-File-Finder-About.htm
Visual similarity duplicate image finder (payware): http:/

How do I group similar strings in R? [closed]

送分小仙女□ submitted on 2019-12-21 23:56:16
Question: Closed. This question is off-topic. It is not currently accepting answers. Closed 6 years ago. I have a database with ~5,000 locality names, most of which are repetitions with typos, permutations, abbreviations, etc. I would like to group them by similarity, to speed up further processing. The best would be to convert each variation into a "platonic form", and put two columns side by side, with the

Is there an alternative to `difflib.get_close_matches()` that returns indexes (list positions) instead of a str list?

爱⌒轻易说出口 submitted on 2019-12-21 19:48:43
Question: I want to use something like difflib.get_close_matches, but instead of the most similar strings, I would like to obtain the indexes (i.e. positions in the list). The indexes of the list are more flexible because one can relate the index to other data structures (related to the matched string). For example, instead of:
>>> words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']
>>> difflib.get_close_matches('Hello', words)
['hello', 'hallo', 'Hallo']
I would
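
As a rough sketch of what the question describes, the function below scores each candidate with `difflib.SequenceMatcher` (the class `get_close_matches` is built on) and returns list positions instead of strings. The name `get_close_match_indexes` is hypothetical, and breaking ties by lower index is a choice made here for determinism, so the order may differ from what `get_close_matches` returns.

```python
import difflib

def get_close_match_indexes(word, possibilities, n=3, cutoff=0.6):
    """Return up to n indexes into possibilities whose strings resemble word.

    Results are ordered by decreasing similarity ratio; ties go to the
    lower index.
    """
    matcher = difflib.SequenceMatcher()
    matcher.set_seq2(word)
    scored = []
    for idx, candidate in enumerate(possibilities):
        matcher.set_seq1(candidate)
        # The two quick ratios are cheap upper bounds on ratio(),
        # so they let us skip hopeless candidates early.
        if (matcher.real_quick_ratio() >= cutoff
                and matcher.quick_ratio() >= cutoff
                and matcher.ratio() >= cutoff):
            scored.append((matcher.ratio(), idx))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:n]]

words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']
print(get_close_match_indexes('Hello', words))
```

The returned positions can then be used to look up related records, for example `other_data[i] for i in get_close_match_indexes(...)`.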

Systematic threshold for cosine similarity with TF-IDF weights

ぃ、小莉子 submitted on 2019-12-21 17:18:17
Question: I am running an analysis of several thousand (e.g., 10,000) text documents. I have computed TF-IDF weights and have a matrix with pairwise cosine similarities. I want to treat the documents as a graph to analyze various properties (e.g., the path length separating groups of documents) and to visualize the connections as a network. The problem is that there are too many similarities. Most are too small to be meaningful. I see many people dealing with this problem by dropping all similarities
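
To make the thresholding idea concrete, here is a small sketch that turns a symmetric similarity matrix into an edge list by keeping only pairs at or above a cutoff, plus one simple way to pick the cutoff: keep a fixed fraction of the strongest pairs. The fraction-based rule is an assumption used for illustration, not a recommendation from the original post, and both function names are hypothetical.

```python
def edges_above(sim, threshold):
    """Edges (i, j, weight) for pairs i < j whose similarity is >= threshold."""
    n = len(sim)
    return [
        (i, j, sim[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if sim[i][j] >= threshold
    ]

def quantile_threshold(sim, keep_fraction):
    """Cutoff that keeps roughly keep_fraction of the strongest pairs."""
    n = len(sim)
    values = sorted(
        (sim[i][j] for i in range(n) for j in range(i + 1, n)), reverse=True
    )
    k = max(1, round(keep_fraction * len(values)))
    return values[k - 1]

# Toy 4-document similarity matrix (made-up numbers).
sim = [
    [1.0, 0.9, 0.1, 0.05],
    [0.9, 1.0, 0.4, 0.02],
    [0.1, 0.4, 1.0, 0.7],
    [0.05, 0.02, 0.7, 1.0],
]
cutoff = quantile_threshold(sim, 0.5)
print(cutoff, edges_above(sim, cutoff))
```

With around 10,000 documents the full pairwise list has roughly 50 million entries, so in practice a sparse or array-based representation would be preferable to nested Python lists.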

Similar images - how to compare them

℡╲_俬逩灬. submitted on 2019-12-21 05:14:21
Question: I have over 1.3 million images that I have to compare with each other, and a few hundred per day are added. My company takes an image and creates a version that can be utilized by our vendors. The files are often very similar to each other; for example, two different companies can send us two different images, a JPG and a GIF, both with the McDonald's logo, with months between the submissions. What is happening is that in the end we find ourselves creating the same logo twice when we