问题
I want to create a large set of random point cloud in 2D plane that are non-degenerate (no 3 points in a straight line in the whole set). I have a naive solution which generates a random float pair P_new(x,y) and checks with every pair of points (P1, P2) generated till now if point (P1, P2, P) lie in same line or not. This takes O(n^2) checks for each new point added to the list making the whole complexity O(n^3) which is very slow if I want to generate more than 4000 points (takes more than 40 mins). Is there a faster way to generate these set of non-degenerate points?
回答1:
Instead of checking the possible points collinearity on each cycle iteration, you could compute and compare coefficients of linear equations. This coefficients should be store in container with quick search. I consider using std::set, but unordered_map could fit either and could lead to even better results.
To sum it up, I suggest the following algorithm:
- Generate random point
p
; - Compute coefficients of lines crossing
p
and existing points (I mean usualA
,B
&C
). Here you need to don
computations; - Trying to find newly computed values inside of previously computed set. This step requires
n*log(n^2)
operations at maximum. - In case of negative search result, add new value and add its coefficients to corresponding sets. Its cost is about
O(log(n))
too.
The whole complexity is reduced to O(n^2*log(n))
.
This algorithm requires additional storing of n^2*sizeof(Coefficient)
memory. But this seems to be ok if you are trying to compute 4000 points only.
回答2:
O(n^2 log n) algorithm can be easily constructed in the following way:
For each point P in the set:
Sort other points by polar angle (cross-product as a comparison function, standard idea, see 2D convex hull gift-wrapping algorithm for example). In this step you should consider only points Q that satisfy
Q.x > P.x || Q.y >= P.y
Iterate over sorted list, equal points are lying on the same line.
Sorting is done in O(n log n), step 2. is O(n). This gives O(n^2 log n) for removing degenerate points.
回答3:
Determining whether a set of points is degenerate is a 3SUM-hard problem. (The very first problem listed is determining whether three lines contains a common point; the equivalent problem under projective duality is whether three points belong to a common line.) As such, it's not reasonable to hope that a generate-and-test solution will be significantly faster than n2.
What are your requirements for the distribution?
回答4:
generate random point Q
for previous points P calculate
(dx, dy) = P - Q
and
B = (asb(dx) > abs(dy) ? dy/dx : dx/dy)
sort the list of points P by its B value, so that points that form a line with Q will be in near positions inside the sorted list.
walk over the sorted list checking where Q forms a line with the current P value being considered and some next values that are nearer than a given distance.
Perl implementation:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Math::Vector::Real;
use Math::Vector::Real::Random;
use Sort::Key::Radix qw(nkeysort);
use constant PI => 3.14159265358979323846264338327950288419716939937510;
@ARGV <= 2 or die "Usage:\n $0 [n_points [tolerance]]\n\n";
my $n_points = shift // 4000;
my $tolerance = shift // 0.01;
$tolerance = $tolerance * PI / 180;
my $tolerance_arctan = 3 / 2 * $tolerance;
# I got to that relation using no so basic maths in a hurry.
# it may be wrong!
my $tolerance_sin2 = sin($tolerance) ** 2;
sub cross2d {
my ($p0, $p1) = @_;
$p0->[0] * $p1->[1] - $p1->[0] * $p0->[1];
}
sub line_p {
my ($p0, $p1, $p2) = @_;
my $a0 = $p0->abs2 || return 1;
my $a1 = $p1->abs2 || return 1;
my $a2 = $p2->abs2 || return 1;
my $cr01 = cross2d($p0, $p1);
my $cr12 = cross2d($p1, $p2);
my $cr20 = cross2d($p2, $p0);
$cr01 * $cr01 / ($a0 * $a1) < $tolerance_sin2 or return;
$cr12 * $cr12 / ($a1 * $a2) < $tolerance_sin2 or return;
$cr20 * $cr20 / ($a2 * $a0) < $tolerance_sin2 or return;
return 1;
}
my ($c, $f1, $f2, $f3) = (0, 1, 1, 1);
my @p;
GEN: for (1..$n_points) {
my $q = Math::Vector::Real->random_normal(2);
$c++;
$f1 += @p;
my @B = map {
my ($dx, $dy) = @{$_ - $q};
abs($dy) > abs($dx) ? $dx / $dy : $dy / $dx;
} @p;
my @six = nkeysort { $B[$_] } 0..$#B;
for my $i (0..$#six) {
my $B0 = $B[$six[$i]];
my $pi = $p[$six[$i]];
for my $j ($i + 1..$#six) {
last if $B[$six[$j]] - $B0 > $tolerance_arctan;
$f2++;
my $pj = $p[$six[$j]];
if (line_p($q - $pi, $q - $pj, $pi - $pj)) {
$f3++;
say "BAD: $q $pi-$pj";
redo GEN;
}
}
}
push @p, $q;
say "GOOD: $q";
my $good = @p;
my $ratiogood = $good/$c;
my $ratio12 = $f2/$f1;
my $ratio23 = $f3/$f2;
print STDERR "gen: $c, good: $good, good/gen: $ratiogood, f2/f1: $ratio12, f3/f2: $ratio23 \r";
}
print STDERR "\n";
The tolerance indicates the acceptable error in degrees when considering if three points are in a line as π - max_angle(Q, Pi, Pj)
.
It does not take into account the numerical instabilities that can happen when subtracting vectors (i.e |Pi-Pj|
may be several orders of magnitude smaller than |Pi|
). An easy way to eliminate that problem would be to also require a minimum distance between any two given points.
Setting tolerance to 1e-6, the program just takes a few seconds to generate 4000 points. Translating it to C/C++ would probably make it two orders of magnitude faster.
回答5:
O(n) solution:
- Pick a random number r from 0..1
- The point added to the cloud is then
P(cos(2 × π × r), sin(2 × π × r))
来源:https://stackoverflow.com/questions/6825772/generate-non-degenerate-point-set-in-2d-c