generate 3-d random points with minimum distance between each of them?

て烟熏妆下的殇ゞ 提交于 2021-01-27 06:31:37

问题


there. I am going to generate 10^6 random points in matlab with this particular characters. the points should be inside a sphere with radious 25, the are 3-D so we have x, y, z or r, theta, phi. there is a minimum distance between each points. first, i decided to generate points and then check the distances, then omit points with do not have these condition. but, it may omit many of points. another way is to use RSA (Random Sequential Addition), it means generate points one by one with this minimum distance between them. for example generate first point, then generate second randomly out of the minimum distance from point 1. and go on till achieving 10^6 points. but it takes lots of time and i can not reach 10^6 points, since the speed of searching appropriate position for new points will take long time.

Right now I am using this program:

Nmax=10000; 
R=25; 
P=rand(1,3); 
k=1; 
while k<Nmax 
theta=2*pi*rand(1); 
phi=pi*rand(1); 
r = R*sqrt(rand(1)); 
% convert to cartesian 
x=r.*sin(theta).*cos(phi); 
y=r.*sin(theta).*sin(phi); 
z=r.*cos(theta); 
P1=[x y z]; 
r=sqrt((x-0)^2+(y-0)^2+(z-0)^2); 
D = pdist2(P1,P,'euclidean'); 
% euclidean distance 

    if D>0.146*r^(2/3) 
        P=[P;P1]; 
        k=k+1;
    end 
    i=i+1; 
end 
x=P(:,1);y=P(:,2);z=P(:,3); plot3(x,y,z,'.');

How can I efficiently generate points by these condition? thank you.


回答1:


I took a closer look at your algorithm, and concluded there is NO WAY it will ever work - at least not if you really want to get a million points in that sphere. There is a simple picture that explains why not - this is a plot of the number of points that you need to test (using your technique of RSA) to get one additional "good" point. As you can see, this goes asymptotic at just a few thousand points (I ran a slightly faster algorithm against 200k points to produce this):

graph of points tested

I don't know if you ever tried to compute the theoretical number of points you could fit in your sphere when you have them perfectly arranged, but I'm beginning to suspect the number is a good deal smaller than 1E6.

The complete code I used to investigate this, plus the output it generated, can be found here. I never got as far as the technique I described in my earlier answer... there was just too much else going on in the setup you described.

EDIT: I started to think it might not be possible, even with "perfect" arrangement, to get to 1M points. I made a simple model for myself as follows:

Imagine you start on the "outer shell" (r=25), and try to fit points at equal distances. If you divide the area of the "shell" by the area of one "exclusion disk" (of radius r_sub_crit), you get a (high) estimate of the number of points at that distance:

numpoints = 4*pi*r^2 / (pi*(0.146 * r^(2/3))^2) ~ 188 * r^(2/3)

The next "shell" in should be at a radius that is 0.146*r^(2/3) less - but if you think of the points as being very carefully arranged, you might be able to get a tiny bit closer. Again, let's be generous and say the shells can be just 1/sqrt(3) closer than the criteria. You can then start at the outer shell and work your way in, using a simple python script:

import scipy as sc
r = 25
npts = 0
def rc(r):
  return 0.146*sc.power(r, 2./3.)
while (r > rc(r)):  
  morePts = sc.floor(4/(0.146*0.146)*sc.power(r, 2./3.))
  npts = npts + morePts
  print morePts, ' more points at r = ', r
  r = r - rc(r)/sc.sqrt(3)

print 'total number of points fitted in sphere: ', npts

The output of this is:

1604.0  more points at r =  25
1573.0  more points at r =  24.2793037966
1542.0  more points at r =  23.5725257555
1512.0  more points at r =  22.8795314897
1482.0  more points at r =  22.2001865995
1452.0  more points at r =  21.5343566722
1422.0  more points at r =  20.8819072818
1393.0  more points at r =  20.2427039885
1364.0  more points at r =  19.6166123391
1336.0  more points at r =  19.0034978659
1308.0  more points at r =  18.4032260869
1280.0  more points at r =  17.8156625053
1252.0  more points at r =  17.2406726094
1224.0  more points at r =  16.6781218719
1197.0  more points at r =  16.1278757499
1171.0  more points at r =  15.5897996844
1144.0  more points at r =  15.0637590998
1118.0  more points at r =  14.549619404
1092.0  more points at r =  14.0472459873
1066.0  more points at r =  13.5565042228
1041.0  more points at r =  13.0772594652
1016.0  more points at r =  12.6093770509
991.0  more points at r =  12.1527222975
967.0  more points at r =  11.707160503
943.0  more points at r =  11.2725569457
919.0  more points at r =  10.8487768835
896.0  more points at r =  10.4356855535
872.0  more points at r =  10.0331481711
850.0  more points at r =  9.64102993012
827.0  more points at r =  9.25919600154
805.0  more points at r =  8.88751153329
783.0  more points at r =  8.52584164948
761.0  more points at r =  8.17405144976
740.0  more points at r =  7.83200600865
718.0  more points at r =  7.49957037478
698.0  more points at r =  7.17660957023
677.0  more points at r =  6.86298858965
657.0  more points at r =  6.55857239952
637.0  more points at r =  6.26322593726
618.0  more points at r =  5.97681411037
598.0  more points at r =  5.69920179546
579.0  more points at r =  5.43025383729
561.0  more points at r =  5.16983504778
542.0  more points at r =  4.91781020487
524.0  more points at r =  4.67404405146
506.0  more points at r =  4.43840129415
489.0  more points at r =  4.21074660206
472.0  more points at r =  3.9909446055
455.0  more points at r =  3.77885989456
438.0  more points at r =  3.57435701766
422.0  more points at r =  3.37730048004
406.0  more points at r =  3.1875547421
390.0  more points at r =  3.00498421767
375.0  more points at r =  2.82945327223
360.0  more points at r =  2.66082622092
345.0  more points at r =  2.49896732654
331.0  more points at r =  2.34374079733
316.0  more points at r =  2.19501078464
303.0  more points at r =  2.05264138052
289.0  more points at r =  1.91649661498
276.0  more points at r =  1.78644045325
263.0  more points at r =  1.66233679273
250.0  more points at r =  1.54404945973
238.0  more points at r =  1.43144220603
226.0  more points at r =  1.32437870508
214.0  more points at r =  1.22272254805
203.0  more points at r =  1.1263372394
192.0  more points at r =  1.03508619218
181.0  more points at r =  0.94883272297
170.0  more points at r =  0.867440046252
160.0  more points at r =  0.790771268402
150.0  more points at r =  0.718689381062
140.0  more points at r =  0.65105725389
131.0  more points at r =  0.587737626612
122.0  more points at r =  0.528593100237
113.0  more points at r =  0.473486127367
105.0  more points at r =  0.422279001431
97.0  more points at r =  0.374833844693
89.0  more points at r =  0.331012594847
82.0  more points at r =  0.290676989951
75.0  more points at r =  0.253688551418
68.0  more points at r =  0.219908564725
61.0  more points at r =  0.189198057381
55.0  more points at r =  0.161417773651
49.0  more points at r =  0.136428145311
44.0  more points at r =  0.114089257597
38.0  more points at r =  0.0942608092113
33.0  more points at r =  0.0768020649149
29.0  more points at r =  0.0615717987589
24.0  more points at r =  0.0484282253244
20.0  more points at r =  0.0372289153633
17.0  more points at r =  0.0278306908104
13.0  more points at r =  0.0200894920319
10.0  more points at r =  0.013860207063
8.0  more points at r =  0.00899644813842
5.0  more points at r =  0.00535025545232 

total number of points fitted in sphere:  55600.0

This seems to confirm that you really can't get to a million, no matter how you try...




回答2:


There are many things you could do to improve your program - both algorithm, and code.

On the code side, one of the things that is REALLY slowing you down is the fact that not only you use a for loop (which is slow), but in the line

P = [P;P1];

you append elements to an array. Every time that happens, Matlab needs to find a new place to put the data, copying all the points in the process. This quickly becomes very slow. Preallocating the array with

P = zeros(1000000, 3);

keeping track of the number N of points you have found so far, and changing your calculation of distance to

D = pdist2(P1, P(1:N, :), 'euclidean');

would at least address that...

The other issue is that you check new points against all previously found points - so when you have 100 points, you check about 100x100, for 1000 it is 1000x1000. You can see then that this algorithm is O(N^3) at least... not counting the fact that you will get more "misses" as the density goes up. A O(N^3) algorithm with N=10E6 takes at least 10E18 cycles; if you had a 4 GHz machine with one clock cycle per comparison, you would need 2.5E9 seconds = approximately 80 years. You can try parallel processing, but that's just brute force - who wants that?

I recommend that you think about breaking the problem into smaller pieces (quite literally): for example, if you divide your sphere into little boxes that are about the size of your maximum distance, and for each box you keep track of what points are in it, then you only need to check against points in "this" box and its immediate neighbors - 27 boxes in all. If your boxes are 2.5 mm across, you would have 100x100x100 = 1M boxes. That seems like a lot, but now your computation time will be reduced drastically, as you will have (by the end of the algorithm) only 1 point on average per box... Of course with the distance criterion you are using, you will have more points near the center, but that's a detail.

The data structure you would need would be a cell array of 100x100x100, and each cell contains the index of the good points found so far "in that cell". The problem with a cell array is that it doesn't lend itself to vectorization. If instead you have the memory, you could assign it as a 4D array of 10x100x100x100, assuming you will have no more than 10 points per cell (if you do, you will have to handle that separately; work with me here...). Use an index of -1 for points not yet found

Then your check would be something like this:

% initializing:
bigList = zeros(10,102,102,102)-1; % avoid hitting the edge...
NPlist = zeros(102, 102, 102); % track # valid points in each box
bottomcorner = [-25.5, -25.5, -25.5]; % boxes span from -25.5 to +25.5
cellSize = 0.5;

.

% in your loop:
P1= [x, y, z];
cellCoords = ceil(P1/cellSize);
goodFlag = true;
pointsSoFar = bigList(:, cellCoords(1)+(-1:1), cellCoords(2)+(-1:1), cellCoords(3)+(-1:1));
pointsToCheck = find(pointsSoFar>0); % this is where the big gains come...
r=sum(P1.^2);
D = pdist2(P1,P(pointsToCheck, :),'euclidean'); % euclidean distance 

if D>0.146*r^(2/3) 
    P(k,:) = P1;
    % check the syntax of this line...
    cci = ind2sub([102 102 102], cellCoords(1), cellCoords(2), cellCoords(3));
    NP(cci)=NP(cci)+1; % increasing number of points in this box
    % you want to handle the case where this > 10!!!
    bigList(NP(cci), cci) = k;
    k=k+1;
end
....

I don't know if you can take it from here; if you can't, say so in the notes and I may have some time this weekend to code this up in more detail. There are ways to speed it up more with some vectorization, but it quickly becomes hard to manage.

I think that putting a larger number of points randomly in space, and then using the above for a giant vectorized culling, may be the way to go. But I recommend to take little steps first... if you can get the above to work at all well, you can then optimize further (array size, etc).




回答3:


I found the reference - "Simulated Brain Tumor Growth Dynamics Using a Three-Dimensional Cellular Automaton", Ansal et al (2000).

I agree it is puzzling - until you realize one important thing. They are reporting their results in mm, but your code was written in cm. While that may seem insignificant, the formula for "critical radius", rc = 0.146r^(2/3) includes a constant, 0.146, that is dimensional - the dimensions are mm^(1/3), not cm^(1/3).

When I make that change in my python code to evaluate the number of possible lattice sites, it jumps by a factor 10. Now they claimed that they were using a "jamming limit" of 0.38 - the number where you really cannot find any more sites. If you include that limit, I predict no more than 200k points could be found - still short of their 1.5M, but not quite so crazy.

You might consider contacting the authors to discuss this with them? If you want to include me in the conversation, you can email me at: SO (just two letters) at my handle name dot united states. Same domain as where I posted links above...



来源:https://stackoverflow.com/questions/14780328/generate-3-d-random-points-with-minimum-distance-between-each-of-them

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!