Playing perfect Tetris: how to align and scale two curves using scaling and translation?

Given parameters of scaling on the y axis (s) and translation on the x axis (t), how to scale and align two curves that do not coincide when the purpose is to maximize curve superposition (not minimize distance)?

As indicated by @DWin, this could be rebranded "How to play Tetris perfectly with R", though it does have applications far beyond winning a Tetris game.

A variation of this question could involve any number of rigid body transformations (rotation, translation and scaling).

Given curve 1

curve1<-data.frame(x=c(1,1,2,2,3),
                   y=c(9,6,6,3,3))

with(curve1, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

and curve 2

curve2<-data.frame(x=c(4,5,5,6,6,7),
                   y=c(2,2,1,1,2,3))

with(curve2, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

I wish to find s and t that maximize superposition between the two curves.

Ideally the method would be in R using optim.

In this made up example, t=3 and s=1/3, so that

t=3
s=1/3

with(curve2, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

with(curve1, lines(x=x+t, y=y*s, col="red"))

Note that to obtain such a fitting, regions that can have a consensus must have a higher weight on parametrization than regions that can not be superimposed and that the larger the consensus region the higher the weight.

Trails I have been exploring:

minimize area between curves

algorithm for superposing two curves

shape analysis using the shapes package

Iterative closest point (ICP) if someone has implemented it in R or can port one of the C implementations

Bonus points for a method using maximum likelihood (assuming normal distribution of error).

42-

This will return the distances between the points when curve1 is scaled on the y-axis by a factor "tfac" and moved on the x-axis by an amount "s":

 as.matrix( dist( rbind(as.matrix(curve2), 
                 ( matrix(c(rep(s, 5), rep(1,5)), ncol=2) +   # x-shift matrix
                                       as.matrix(curve1) ) %*% 
                   matrix(c( 1, 0, 0, tfac),ncol=2) ) ) # the y-scaling matrix
                )[ 
                           # better not to use 't' as a variable name
      -(1:5), -(6:11)]  # easier to return the relevant distances when in matrix

It should be a simple matter to put that in a function to be minimized:

 dfunc <- function(C1, C2, s, tfac) { sum( .... ) }

I'm not absolutely sure this will return the result you are expecting since the objective function you are implying may not be the sum of distances. You may need to turn to integer programming methods. The Optimization CRAN Task View would be a good place to look for those methods in R. I suppose an alternative if this problem arises might be to round the "s" value and scale only to the nearest power of 2.

dfunc <- function(param, D1=curve1, D2=curve2) { 
           sum( as.matrix(dist( 
                   rbind(as.matrix(D2), 
                         ( matrix(c(rep(param[1], 5), rep(1,5)), ncol=2) + 
                           as.matrix(D1) ) %*% 
                   matrix(c(1,0,0,param[2]),ncol=2) ) ) )[-(1:5), -(6:11)])}
optim(c(1,1), dfunc)
#$par
$[1] 3.3733977 0.2243866
# trimmed output

Using these values one obtains the following superposition:

I might, therefore, try s=3, tfac=0.25. (I see looking back that I switched the roles of t and s from what you asked for. Sorry.)

Okay, here's an attempt at a solution.

The basic trick is this: we rasterise the two curves, and then we can do the curve comparison on a tile by tile basis. This seems like a fairly reasonable way of comparing the curve superposition, at least. To encourage the optimiser to approach the curve, we also introduce a loss that penalises curves too far away from the other.

No guarantees this works for more complicated curves and transformations, but it's an idea, at least.

 curve2<-data.frame(x=c(4,5,5,6,6,7),
                    y=c(2,2,1,1,2,3))
 fillin <- function(ax, ay, bx, by, scaling= 10, steps= 100) floor(cbind(seq(from = ax, to = bx, len = steps), seq(from = ay, to = by, len = steps)) * scaling)
 Bmat <- matrix(0, 100, 100)
 for (i in 2:nrow(curve2)){
 Bmat[fillin (curve2[i-1,1], curve2[i-1,2], curve2[i,1], curve2[i,2])] =1
 }
 Bmat.orig = Bmat

 Bmat = Bmat.orig
 #construct utility function based on 
 #manhattan distances to closest point?
 shift = function(mat, offset){
 mat0 = array(0, dim = dim(mat)+2)
 mat0[1:nrow(mat) +1+ offset[1] , 1:ncol(mat) + 1+offset[2]] = mat
 return(mat0[1:nrow(mat) + 1, 1:ncol(mat) + 1])
 }

 for (i in 1:100){
 Bm = (Bmat != 0)
 Btmp1 = shift(Bm, c(1,0))
 Btmp2 = shift(Bm, c(-1,0))
 Btmp3 = shift(Bm, c(0,1))
 Btmp4 = shift(Bm, c(0,-1))

 Bmat = Bmat + pmax(Bm ,Btmp1, Btmp2, Btmp3, Btmp4)/i
 }

 Bmat2 = replace(Bmat, Bmat == max(Bmat), max(Bmat) + 10)

 #construct and compare rasterised versions
 getcurve = function(trans = c(0,1),  curve=data.frame(x=c(1,1,2,2,3) ,
                    y=c(9,6,6,3,3) ), Bmat = Bmat2){
 Amat = array(0, dim = dim(Bmat))
 curve[,1] = curve[,1] + trans[1]
 curve[,2] = curve[,2] * trans[2]
 for (i in 2:nrow(curve)){
 fillin (curve[i-1,1], curve[i-1,2], curve[i,1], curve[i,2]) -> ind
 if (min(ind) < 1 || max(ind) > nrow(Bmat)) return( array(-1, dim= dim(Bmat)))
 Amat[ind] =1
 }
 Amat = (Amat - mean(Amat))/sd(as.vector(Amat))
 Amat
 }
 compcurve = function(trans = c(0,1), curve=data.frame(x=c(1,1,2,2,3) ,
                    y=c(9,6,6,3,3) ) , Bmat = Bmat2){
 Amat = getcurve(trans, curve, Bmat)
 -sum(Amat * Bmat)
 }
 #SANN seems to work for this, but is slow. Beware of finite differencing
 # - criterion is non-smooth! 
 optim(c(0,1), compcurve, method = "SANN", Bmat = Bmat2) -> output
 image(Bmat)
 contour(getcurve(output$par), add = T)

Result:

Not too shabby, maybe?

You will possibly have to fudge the specifics of the rasterisation to work for other problems. And you might wanna tune how the optim is done.

A 'smarter' alternative is to note that for an optimal solution, it's probably the case that at least one pair of vertices will conincide. This can give you a better search strategy. The advantage of the rasterisation scheme in comparison to area-between curves is that it's possibly more flexible wrt to different transformations and non-graphs (in particular, the vertical line in your first curve is a problem.) You can potentially avoid rasterisation by an appropriate computation, but it gives me a headache just to think about it.

Since we are maximising a criterion, then this is a maximum likelihood method. Curiously, I think it is actually impossible to phrase this question as a maximum likelihood problem using normal errors, because normal errors imply L2 based criterions, which will famously not give you exact superpositions.

来源：https://stackoverflow.com/questions/10826625/playing-perfect-tetris-how-to-align-and-scale-two-curves-using-scaling-and-tran

标签

geometry

curve-fitting

curve