Approximate matching of two lists of events (with duration)

Submitted by 可紊 on 2019-12-04 09:20:48

Here's a quadratic-time, O(mn), algorithm that gives a maximum likelihood estimate with respect to the following model. Let A1 < ... < Am be the true intervals and let B1 < ... < Bn be the reported intervals. The quantity sub(i, j) is the log-likelihood that Ai becomes Bj. The quantity del(i) is the log-likelihood that Ai is deleted. The quantity ins(j) is the log-likelihood that Bj is inserted. Make independence assumptions everywhere, so that the log-likelihood of a whole matching is the sum of the log-likelihoods of its operations. I'm going to choose sub, del, and ins so that, for every i < i' and every j < j', we have

sub(i, j') + sub(i', j) <= max {sub(i, j )       + sub(i', j')
                               ,del(i) + ins(j') + sub(i', j )
                               ,sub(i, j')       + del(i') + ins(j)
                               }.

This ensures that the optimal matching between intervals is noncrossing and thus that we can use the following Levenshtein-like dynamic program.
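For a concrete set of scores, the condition above can be checked by brute force before trusting the dynamic program. The following is only a sketch: the function name `noncrossing_violations` is made up for illustration, `dele` stands in for del (which is a reserved word in Python), and indices are 1-based to match the text.

```python
from itertools import combinations

def noncrossing_violations(sub, dele, ins, m, n):
    """Return every (i, i2, j, j2) with i < i2 and j < j2 for which the
    crossing pair of substitutions scores strictly higher than all three
    noncrossing alternatives listed in the condition."""
    bad = []
    for i, i2 in combinations(range(1, m + 1), 2):
        for j, j2 in combinations(range(1, n + 1), 2):
            crossing = sub(i, j2) + sub(i2, j)
            best_alternative = max(
                sub(i, j) + sub(i2, j2),
                dele(i) + ins(j2) + sub(i2, j),
                sub(i, j2) + dele(i2) + ins(j),
            )
            if crossing > best_alternative:
                bad.append((i, i2, j, j2))
    return bad
```

An empty result means the condition holds for that particular input. The check takes time proportional to m^2 * n^2, so it is meant for small test cases rather than production data.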

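The dynamic program is presented as a memoized recursive function, score(i, j), that computes the optimal score of matching A1, ..., Ai with B1, ..., Bj. The root of the call tree is score(m, n). With memoization there are only (m + 1)(n + 1) distinct calls, each doing a constant amount of work, which is where the quadratic running time comes from. It can be modified to return the sequence of sub(i, j) operations in the optimal solution.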

score(i, j) | i == 0 && j == 0 =      0
            | i >  0 && j == 0 =      del(i)    + score(i - 1, 0    )
            | i == 0 && j >  0 =      ins(j)    + score(0    , j - 1)
            | i >  0 && j >  0 = max {sub(i, j) + score(i - 1, j - 1)
                                     ,del(i)    + score(i - 1, j    )
                                     ,ins(j)    + score(i    , j - 1)
                                     }
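The recurrence above translates fairly directly into Python. This is a minimal sketch rather than a tuned implementation: the wrapper name `best_matching` and the traceback loop are my additions, `dele` again stands in for del, and the three score functions are passed in as callables taking 1-based indices.

```python
from functools import lru_cache

def best_matching(m, n, sub, dele, ins):
    """Return (optimal score, list of matched (i, j) pairs), 1-based."""

    @lru_cache(maxsize=None)
    def score(i, j):
        if i == 0 and j == 0:
            return 0
        if j == 0:
            return dele(i) + score(i - 1, 0)
        if i == 0:
            return ins(j) + score(0, j - 1)
        return max(
            sub(i, j) + score(i - 1, j - 1),
            dele(i) + score(i - 1, j),
            ins(j) + score(i, j - 1),
        )

    total = score(m, n)

    # Walk back from (m, n), re-deriving which branch achieved the maximum.
    # Once either index reaches 0, only deletions or insertions remain.
    pairs = []
    i, j = m, n
    while i > 0 and j > 0:
        here = score(i, j)
        if here == sub(i, j) + score(i - 1, j - 1):
            pairs.append((i, j))
            i, j = i - 1, j - 1
        elif here == dele(i) + score(i - 1, j):
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return total, pairs
```

The recursion can go about m + n levels deep, so for long lists you would either raise Python's recursion limit or fill the table iteratively, row by row.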

Here are some possible definitions for sub, del, and ins. I'm not sure if they will be any good; you may want to multiply their values by constants or use powers other than 2. If Ai = [s, t] and Bj = [u, v], then define

sub(i, j) = -(|u - s|^2 + |v - t|^2)
del(i) = -(t - s)^2
ins(j) = -(v - u)^2.
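As a sketch, those three definitions could be written in Python as closures over the two interval lists. The helper name `make_costs` and the representation of an interval as a (start, end) tuple are assumptions made for illustration. `dele` stands in for del, and indices are 1-based to match the text.

```python
def make_costs(A, B):
    """Build sub, dele, ins for true intervals A and reported intervals B,
    each given as a list of (start, end) tuples."""

    def sub(i, j):
        s, t = A[i - 1]
        u, v = B[j - 1]
        # Squared error of the start point plus squared error of the end point.
        return -((u - s) ** 2 + (v - t) ** 2)

    def dele(i):
        s, t = A[i - 1]
        # Dropping a long true interval is penalized more than a short one.
        return -((t - s) ** 2)

    def ins(j):
        u, v = B[j - 1]
        # Likewise for inventing a long reported interval.
        return -((v - u) ** 2)

    return sub, dele, ins


# Small example: two true intervals, one reported interval.
sub, dele, ins = make_costs([(0, 10), (20, 30)], [(1, 11)])
example = (sub(1, 1), sub(2, 1), dele(1), ins(1))  # (-2, -722, -100, -100)
```

With this particular sub, the first alternative in the noncrossing condition already suffices as long as both the start points and the end points are in increasing order: for s < s' and u < u', (u' - s)^2 + (u - s')^2 - (u - s)^2 - (u' - s')^2 = 2(u' - u)(s' - s) >= 0, and the same holds for the end points. If you rescale the terms or change the exponent, the condition should be rechecked.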

(Apologies to the undoubtedly extant academic who published something like this in the bioinformatics literature many decades ago.)
