nearest-neighbor

LATERAL JOIN not using trigram index

ぃ、小莉子 submitted on 2019-11-30 04:05:59
Question: I want to do some basic geocoding of addresses using Postgres. I have an address table that has around 1 million raw address strings:

    => \d addresses
      Table "public.addresses"
     Column  | Type | Modifiers
    ---------+------+-----------
     address | text |

I also have a table of location data:

    => \d locations
       Table "public.locations"
       Column   | Type | Modifiers
    ------------+------+-----------
     id         | text |
     country    | text |
     postalcode | text |
     latitude   | text |
     longitude  | text |

Most of the address strings

nearest neighbor - k-d tree - wikipedia proof

橙三吉。 submitted on 2019-11-30 00:40:59
On the Wikipedia entry for k-d trees, an algorithm is presented for doing a nearest neighbor search on a k-d tree. What I don't understand is the explanation of step 3.2. How do you know there isn't a closer point just because the difference between the splitting coordinate of the search point and the current node is greater than the difference between the splitting coordinate of the search point and the current best?

Nearest neighbor search (figure: animation of NN searching with a KD tree in 2D)

The nearest neighbor (NN) algorithm aims to find the point in the tree which is nearest to a given input
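The pruning step the question asks about can be tried out in a small sketch. This is a minimal illustration, not the Wikipedia pseudocode verbatim; the names `Node`, `build`, `sq_dist`, and `nearest` are hypothetical. The key observation is that every point on the far side of a splitting plane differs from the query by at least the plane distance along that one coordinate, and a single-coordinate difference is a lower bound on the Euclidean distance. If that bound already exceeds the distance to the current best, the far subtree cannot hold anything closer.

```python
from collections import namedtuple

# Hypothetical node layout: the splitting point, the axis it splits on, two subtrees.
Node = namedtuple("Node", "point axis left right")

def build(points, depth=0):
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best=None):
    """Return the stored point closest to query."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, query) < sq_dist(best, query):
        best = node.point
    # Signed distance from the query to the splitting plane of this node.
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Any point in the far subtree is at least |diff| away from the query,
    # so it only needs visiting when the plane is closer than the current best.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best
```

Comparing the result against a brute-force `min` over all points is a quick way to check that the pruning never discards the true nearest neighbor.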

Iterative Closest Point (ICP) implementation on python

瘦欲@ submitted on 2019-11-29 22:30:00
I have been searching for an implementation of the ICP algorithm in Python lately, with no result. According to the Wikipedia article http://en.wikipedia.org/wiki/Iterative_closest_point , the algorithm steps are:

1. Associate points by the nearest neighbor criteria (for each point in one point cloud find the closest point in the second point cloud).
2. Estimate transformation parameters (rotation and translation) using a mean square cost function (the transform would align best each point to its match found in the previous step).
3. Transform the points using the estimated parameters.
4. Iterate (re-associate
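The steps listed above can be sketched in plain Python for the 2D case. This is a minimal illustration under stated assumptions, not a reference implementation: it uses brute-force nearest-neighbor matching, the closed-form least-squares rotation that exists in 2D, and a fixed iteration count instead of a convergence test. All function names and the example point sets are hypothetical.

```python
import math

def nearest_point(p, cloud):
    """Step 1: brute-force nearest neighbor of p in cloud."""
    return min(cloud, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def best_rigid_transform(src, dst):
    """Step 2: least-squares rotation angle and translation for paired 2D points."""
    n = len(src)
    csx, csy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    cdx, cdy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - csx, py - csy          # centered source point
        bx, by = qx - cdx, qy - cdy          # centered matched point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)       # angle that maximizes the alignment
    c, s = math.cos(theta), math.sin(theta)
    # The translation moves the rotated source centroid onto the target centroid.
    return theta, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)

def apply_transform(points, theta, tx, ty):
    """Step 3: rotate by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp(source, target, iterations=20):
    """Step 4: repeat association, estimation and transformation."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest_point(p, target) for p in current]
        theta, tx, ty = best_rigid_transform(current, matches)
        current = apply_transform(current, theta, tx, ty)
    return current

# Example data (made up): the source is the target after a small rotation and shift.
target = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 5.0), (2.0, 8.0)]
source = apply_transform(target, 0.05, 0.2, -0.1)
aligned = icp(source, target)
```

ICP only converges to a local optimum, so a sketch like this is only expected to behave well when the initial misalignment is small relative to the spacing of the points, as in the example data.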

HTML5 Canvas Image Scaling Issue

余生颓废 submitted on 2019-11-29 22:29:26
I am trying to make a pixel art themed game in HTML5 canvas, and as part of that I take 10x20 or so sized images and draw them onto the canvas with the following code:

    ctx.drawImage(image, 20, 20, 100, 200);

However the canvas uses bicubic image scaling and hence the pixel art images look terrible at 2x and up. Is there a way to force canvas to use nearest neighbor scaling or possibly use a custom method to scale images? If not does that mean the images have to be scaled beforehand in something like Paint.net?

Choose any one of the following: Via JavaScript: ctx.imageSmoothingEnabled = false;
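For the "custom method" the question mentions, nearest neighbor scaling by an integer factor amounts to repeating every source pixel. The sketch below shows the idea in Python on a plain 2D list of pixel values; it is a language-neutral illustration rather than canvas code, and the function name `scale_nearest` is hypothetical.

```python
def scale_nearest(pixels, factor):
    """Upscale a 2D grid of pixel values by an integer factor.

    Each output pixel copies the source pixel at (y // factor, x // factor),
    so no new colors are blended in and hard edges stay hard.
    """
    return [
        [row[x // factor] for x in range(len(row) * factor)]
        for row in pixels
        for _ in range(factor)
    ]

# A 2x2 image scaled 2x becomes a 4x4 image made of 2x2 blocks.
small = [[1, 2],
         [3, 4]]
big = scale_nearest(small, 2)
```

The same index arithmetic would carry over to a pixel buffer in the browser, though setting `imageSmoothingEnabled` as quoted above is likely the simpler route when it is available.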

Identifying points with the smallest Euclidean distance

北战南征 submitted on 2019-11-29 18:42:51
Question: I have a collection of n-dimensional points and I want to find which 2 are the closest. The best I could come up with for 2 dimensions is:

    from numpy import *
    myArr = array( [[1, 2], [3, 4], [5, 6], [7, 8]] )
    n = myArr.shape[0]
    cross = [[sum( ( myArr[i] - myArr[j] ) ** 2 ), i, j] for i in xrange( n ) for j in xrange( n ) if i != j ]
    print min( cross )

which gives

    [8, 0, 1]

But this is too slow for large arrays. What kind of optimisation can I apply to it?

RELATED: Euclidean distance between points
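As a first, modest optimisation, the brute-force search can visit each unordered pair once instead of twice. The sketch below is a Python 3, standard-library version of that idea; the name `closest_pair` is hypothetical. It is still quadratic in the number of points, so for large arrays a spatial index such as a k-d tree would be the more likely fix.

```python
from itertools import combinations

def closest_pair(points):
    """Return (squared distance, i, j) for the closest pair of points.

    combinations() yields each index pair i < j exactly once, which halves
    the work of comparing every ordered pair.
    """
    return min(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

result = closest_pair([[1, 2], [3, 4], [5, 6], [7, 8]])
print(result)  # (8, 0, 1)
```

Squared distances are compared directly because taking the square root does not change which pair is smallest.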

Search in locality sensitive hashing

二次信任 submitted on 2019-11-29 15:42:53
I'm trying to understand section 5 of this paper about LSH, in particular how to bucket the generated hashes. Quoting the linked paper:

Given bit vectors consisting of d bits each, we choose N = O(n^(1/(1+epsilon))) random permutations of the bits. For each random permutation σ, we maintain a sorted order O_σ of the bit vectors, in lexicographic order of the bits permuted by σ. Given a query bit vector q, we find the approximate nearest neighbor by doing the following: For each permutation σ, we perform a binary search on O_σ to locate the two bit vectors closest to q (in the
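The bucketing described in the quoted passage can be sketched as one sorted list per random permutation, queried by binary search. The code below is a simplified illustration under stated assumptions and not the paper's exact procedure: it inspects a fixed window of neighbors around the binary-search position in every sorted order, and the names `build_index`, `query`, and `window` are hypothetical.

```python
import random
from bisect import bisect_left

def hamming(a, b):
    """Number of positions in which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def build_index(vectors, num_perms, seed=0):
    """For each random permutation, keep the vectors sorted by their permuted bits."""
    rng = random.Random(seed)
    d = len(vectors[0])
    index = []
    for _ in range(num_perms):
        perm = list(range(d))
        rng.shuffle(perm)
        # Each entry is (permuted bits, position of the vector in `vectors`).
        order = sorted((tuple(v[i] for i in perm), k) for k, v in enumerate(vectors))
        index.append((perm, order))
    return index

def query(index, vectors, q, window=2):
    """Return (Hamming distance, vector position) of the best candidate found."""
    best = None
    for perm, order in index:
        key = tuple(q[i] for i in perm)
        # Binary search for where q would sit in this sorted order.
        pos = bisect_left(order, (key, -1))
        # Assumption: examine a fixed number of entries on either side.
        for _, k in order[max(0, pos - window): pos + window]:
            cand = (hamming(vectors[k], q), k)
            if best is None or cand < best:
                best = cand
    return best
```

Vectors that share a long prefix with q under some permutation end up adjacent to the search position in that order, which is why only a few entries per permutation need checking. The result is approximate: a true nearest neighbor that is never adjacent to q in any of the sorted orders is missed.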

Two sets of high dimensional points: Find the nearest neighbour in the other set

怎甘沉沦 submitted on 2019-11-29 12:28:55
I have 2 sets: A and B. Both sets contain the same number of high dimensional points. How do I find the nearest neighbour in Set A for every point in Set B? I thought about using a Voronoi diagram but it seems (according to Wikipedia) that it is not suitable for dimensions higher than 2. Can someone suggest a method to me, please?

gsamaras: FLANN. If your data do really lie in a high dimensional space, then you could use FLANN. It actually builds a number of rotated kd-trees and queries (a bit) every single tree, keeping the best results found. It also rotates the data-set to avoid nasty cases.

How to find the previous and next record using a single query in MySQL?

末鹿安然 submitted on 2019-11-29 06:52:34
I have a database, and I want to find the previous and next record, ordered by ID, using a single query. I tried to do a union but that does not work. :(

    SELECT * FROM table WHERE `id` > 1556 LIMIT 1
    UNION
    SELECT * FROM table WHERE `id` < 1556 ORDER BY `product_id` LIMIT 1

Any ideas? Thanks a lot.

You need to change up your ORDER BY, and wrap each SELECT in parentheses so that its ORDER BY and LIMIT apply to that SELECT rather than to the whole UNION:

    (SELECT * FROM table WHERE `id` > 1556 ORDER BY `id` ASC LIMIT 1)
    UNION
    (SELECT * FROM table WHERE `id` < 1556 ORDER BY `id` DESC LIMIT 1)

This ensures that the id field is in the correct order before taking the top result. You can also use MIN and MAX: SELECT *
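The ORDER BY plus LIMIT 1 pattern from the first answer can be tried end to end without a MySQL server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so the SQL is SQLite syntax (each ordered, limited SELECT sits in a subquery) and not the MySQL statement itself; the table name `items`, its contents, and the function name `neighbours` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")  # hypothetical table
conn.executemany("INSERT INTO items (id) VALUES (?)",
                 [(i,) for i in (1550, 1553, 1556, 1560, 1571)])

def neighbours(conn, current_id):
    """Return the ids immediately below and above current_id in one query."""
    sql = """
        SELECT * FROM (SELECT id FROM items WHERE id < ? ORDER BY id DESC LIMIT 1)
        UNION ALL
        SELECT * FROM (SELECT id FROM items WHERE id > ? ORDER BY id ASC LIMIT 1)
    """
    # Sorting here avoids relying on the row order of the UNION itself.
    return sorted(row[0] for row in conn.execute(sql, (current_id, current_id)))

print(neighbours(conn, 1556))  # [1553, 1560]
```

At either end of the table one of the two subqueries returns no row, so the result simply contains a single id.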

Data structure for fast line queries?

雨燕双飞 submitted on 2019-11-29 05:15:43
I know that I can use a KD-Tree to store points and iterate quickly over a fraction of them that are close to another given point. I'm wondering whether there is something similar for lines. Given a set of lines L in 3D (to be stored in that data structure) and another "query line" q, I'd like to be able to quickly iterate through all lines in L that "are close enough" to q. The distance I'm planning to use is the minimal Euclidean distance between two points u and v where u is some point on the first line and v is some point on the second line. Computing that distance is not a problem (there
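The distance described above has a closed form when the lines are infinite. The sketch below assumes each line is given as a point and a direction vector, which is an assumption about the representation and not something the question states; the function names are hypothetical. For non-parallel lines the shortest connecting segment is perpendicular to both directions, so its length is the projection of the offset between the lines onto the cross product of the directions.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_distance(p1, d1, p2, d2, eps=1e-12):
    """Minimal Euclidean distance between two infinite 3D lines.

    Line k is the set of points pk + t * dk. The eps threshold for treating
    the lines as parallel is an arbitrary choice that suits inputs of order 1.
    """
    n = cross(d1, d2)            # perpendicular to both directions
    w = sub(p2, p1)              # offset from line 1 to line 2
    n_len = math.sqrt(dot(n, n))
    if n_len < eps:
        # Parallel lines: distance from p2 to line 1.
        c = cross(w, d1)
        return math.sqrt(dot(c, c)) / math.sqrt(dot(d1, d1))
    return abs(dot(w, n)) / n_len
```

For example, the x-axis and a line through (0, 0, 2) running along the y-direction are 2 apart. Line segments would need extra clamping of the closest points to the segment ends, which this sketch does not do.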

Binary features and Locality Sensitive Hashing (LSH)

*爱你&永不变心* submitted on 2019-11-29 04:26:00
I am studying FLANN, a library for approximate nearest neighbors search. For the LSH method they represent an object (point in search space), as an array of unsigned int. I am not sure why they do this, and not represent a point simply as a double array (which would represent a point in multi-dimensional vector space). Maybe because LSH is used for binary features? Can someone share more about the possible use of unsigned int in this case? Why unsigned int if you only need a 0 and 1 for each feature? Thanks

deltheil: Please note that I will refer to the latest FLANN release, i.e. flann-1.8.3 at
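One general reason for storing binary features in unsigned integers is bit packing: many 0/1 features fit into a single machine word, and the Hamming distance between two packed descriptors reduces to an XOR followed by a bit count. The sketch below illustrates that idea only. It is not FLANN's code and makes no claim about FLANN's internal layout; the names `pack_bits` and `hamming` and the 32-bit word size are assumptions.

```python
def pack_bits(bits, word_size=32):
    """Pack a list of 0/1 features into integers of word_size bits each."""
    words = []
    for start in range(0, len(bits), word_size):
        word = 0
        for offset, bit in enumerate(bits[start:start + word_size]):
            word |= (bit & 1) << offset   # feature `offset` lands on bit `offset`
        words.append(word)
    return words

def hamming(words_a, words_b):
    """Hamming distance between two packed descriptors of equal length."""
    # XOR leaves a 1 wherever the features differ; count those bits per word.
    return sum(bin(a ^ b).count("1") for a, b in zip(words_a, words_b))

# Two hypothetical 40-feature descriptors that differ in three positions.
a = [1, 0, 1, 1] + [0] * 36
b = [1, 1, 1, 0] + [0] * 35 + [1]
distance = hamming(pack_bits(a), pack_bits(b))
```

Stored this way, the 40 features take two 32-bit words instead of 40 doubles, and each comparison touches two integers.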