Inverting a real-valued index grid

南笙 2020-12-24 15:37

OpenCV's remap() uses a real-valued index grid to sample a grid of values from an image using bilinear interpolation, and returns the grid of samples as a new image.

7 answers
  •  萌比男神i
    2020-12-24 15:55

    OP here. I think I've found an answer. I haven't implemented it yet, and if someone comes up with a less fiddly solution (or finds something wrong with this one), I'll choose their answer instead.

    Problem statement

    Let A be the source image, B be the destination image, and M be the map relating A's coords to B's coords: for each pixel [k, l] of B, M stores the (generally non-integer) A coordinate to sample, i.e.:

    B[k, l, :] == A(M[k, l, 0], M[k, l, 1], :) 
    for all k, l in B's coords.
    

    ...where square brackets indicate array lookup with integer indices, and parentheses indicate bilinear-interpolation lookup with floating-point indices. We restate the above using the more economical notation:

    B = A(M)
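
    To make the notation concrete, here is a minimal pure-Python sketch of the two kinds of lookup. The names bilinear and apply_map are made up for illustration, the image is assumed to be single-channel and at least 2x2, and out-of-range coordinates are clamped to the border (an assumption; the answer does not say how borders should be handled).

```python
def bilinear(A, i, j):
    """A(i, j): sample the 2-D grid A (a list of rows) at float coords (i, j)."""
    h, w = len(A), len(A[0])
    # Assumption: clamp to the border so the 2x2 neighbourhood always exists.
    i = min(max(i, 0.0), h - 1.0)
    j = min(max(j, 0.0), w - 1.0)
    i0 = min(int(i), h - 2)  # top row of the 2x2 neighbourhood
    j0 = min(int(j), w - 2)  # left column of the 2x2 neighbourhood
    a, b = i - i0, j - j0    # fractional offsets in [0, 1]
    return ((1 - a) * (1 - b) * A[i0][j0]
            + (1 - a) * b * A[i0][j0 + 1]
            + a * (1 - b) * A[i0 + 1][j0]
            + a * b * A[i0 + 1][j0 + 1])


def apply_map(A, M):
    """B = A(M): B[k][l] = A(M[k][l][0], M[k][l][1]) for every pixel of B."""
    return [[bilinear(A, i, j) for (i, j) in row] for row in M]


A = [[0, 1],
     [2, 3]]
centre = bilinear(A, 0.5, 0.5)  # 1.5, the average of the four pixels
```

    Note that this follows the answer's index order (first index, then second index), which is not necessarily the argument order that remap() itself expects.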
    

    We wish to find an inverse mapping N that maps B back to A as best as is possible:

    Find N s.t. A \approx B(N)
    

    The problem can be stated without reference to A or B:

    Find N = argmin_N || M(N) - I_n ||
    

    ...where ||*|| indicates the Frobenius norm, and I_n is the identity map with the same dimensions as N, i.e. a map where:

    I_n[i, j, :] == [i, j]
    for all i, j
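
    The objective can be evaluated directly for any candidate N, which is handy for checking an implementation. The following is only a sketch: sample, inverse_residual and identity are hypothetical helper names, maps are stored as nested lists of (row, column) tuples, and border coordinates are clamped.

```python
import math


def sample(M, i, j):
    """M(i, j): bilinearly interpolate the 2-channel map M at float coords."""
    h, w = len(M), len(M[0])
    i = min(max(i, 0.0), h - 1.0)  # assumption: clamp at the border
    j = min(max(j, 0.0), w - 1.0)
    i0, j0 = min(int(i), h - 2), min(int(j), w - 2)
    a, b = i - i0, j - j0
    corners = ((M[i0][j0], (1 - a) * (1 - b)),
               (M[i0][j0 + 1], (1 - a) * b),
               (M[i0 + 1][j0], a * (1 - b)),
               (M[i0 + 1][j0 + 1], a * b))
    return tuple(sum(wt * v[c] for v, wt in corners) for c in (0, 1))


def inverse_residual(M, N):
    """|| M(N) - I_n ||: square root of the sum of squared coordinate errors."""
    total = 0.0
    for i, row in enumerate(N):
        for j, (k, l) in enumerate(row):
            mi, mj = sample(M, k, l)  # where M sends the point N[i][j]
            total += (mi - i) ** 2 + (mj - j) ** 2
    return math.sqrt(total)


def identity(h, w):
    """Identity map: entry [i][j] holds (i, j)."""
    return [[(i, j) for j in range(w)] for i in range(h)]
```

    A residual of zero means N inverts M exactly at every pixel of A; for example, inverse_residual(identity(3, 3), identity(3, 3)) is 0.0.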
    

    Naive solution

    If M's values are all integers, and M is an isomorphism, then you can construct N directly as:

    N[M[k, l, 0], M[k, l, 1], :] = [k, l]
    for all k, l
    

    Or in our simplified notation:

    N[M] = I_m
    

    ...where I_m is the identity map with the same dimensions as M.
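
    As a sketch of the naive construction, assuming M holds integer (row, column) tuples and using the made-up name naive_inverse:

```python
def naive_inverse(M, shape_a):
    """Scatter N[M[k][l]] = (k, l). Entries of N that M never hits stay None."""
    h, w = shape_a
    N = [[None] * w for _ in range(h)]
    for k, row in enumerate(M):
        for l, (i, j) in enumerate(row):
            N[i][j] = (k, l)  # only valid because i and j are integers here
    return N


# Example: a horizontal flip of a 2x3 image is its own inverse.
flip = [[(k, 2 - l) for l in range(3)] for k in range(2)]
N = naive_inverse(flip, (2, 3))
```

    Any None left in the result marks an A pixel that no B pixel samples from.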

    There are two problems:

    1. M is not an isomorphism, so the above will leave "holes" in N at N[i, j, :] for any [i, j] not among the values in M.
    2. M's values are floating-point coordinates [i, j], not integer coordinates. We cannot simply assign a value to the bilinearly-interpolated quantity N(i, j, :), for float-valued i, j. To achieve the equivalent effect, we must instead set the values of [i, j]'s four surrounding corners N[floor(i), floor(j), :], N[floor(i), ceil(j), :], N[ceil(i), floor(j), :], N[ceil(i), ceil(j), :] such that the interpolated value N(i, j, :) equals the desired value [k, l], for all pixel mappings [i, j] --> [k, l] in M.
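
    Written out, the constraint in point 2 is the usual bilinear weighting of the four corners. With a and b denoting the fractional parts of i and j (symbols introduced here for illustration):

```latex
\begin{aligned}
a &= i - \lfloor i \rfloor, \qquad b = j - \lfloor j \rfloor, \\
N(i, j) &= (1-a)(1-b)\, N[\lfloor i \rfloor, \lfloor j \rfloor]
         + (1-a)\, b\, N[\lfloor i \rfloor, \lceil j \rceil] \\
        &\quad + a\,(1-b)\, N[\lceil i \rceil, \lfloor j \rfloor]
         + a\, b\, N[\lceil i \rceil, \lceil j \rceil]
         = [k, l].
\end{aligned}
```

    For example, [i, j] = [2.25, 5.5] gives a = 0.25 and b = 0.5, so the four corner weights are 0.375, 0.375, 0.125 and 0.125, which sum to 1. Each mapping in M therefore contributes one such linear constraint on four entries of N.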

    Solution

    Construct empty N as a 3D tensor of floats:

    N = zeros(size=(A.shape[0], A.shape[1], 2))
    

    For each coordinate [i, j] in A's coordinate space, do:

    1. Find the 2x2 grid of A-coordinates in M that [i, j] lies within. Compute the homography matrix H that maps those A-coordinates to their corresponding B-coordinates (given by the 2x2 grid's pixel indices).
    2. Apply H to [i, j] in homogeneous coordinates: compute [x, y, w] = matmul(H, [i, j, 1]), then set N[i, j, :] = [x / w, y / w].
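
    The homography part of these steps might look like the sketch below. It is not a tested implementation of the full algorithm: solve, homography and apply_h are hypothetical helpers, H is normalised so that its bottom-right entry is 1, and the four A-coordinates are assumed to be in general position (no three collinear). With OpenCV available, cv2.getPerspectiveTransform would probably be the more usual way to get H.

```python
def solve(rows, rhs):
    """Solve a square linear system by Gauss-Jordan with partial pivoting."""
    n = len(rhs)
    aug = [[float(x) for x in row] + [float(r)] for row, r in zip(rows, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] / aug[r][r] for r in range(n)]


def homography(src, dst):
    """3x3 H sending four src points (i, j) to four dst points (k, l)."""
    rows, rhs = [], []
    for (i, j), (k, l) in zip(src, dst):
        # From k = (h11*i + h12*j + h13) / (h31*i + h32*j + 1), same for l.
        rows.append([i, j, 1, 0, 0, 0, -k * i, -k * j])
        rhs.append(k)
        rows.append([0, 0, 0, i, j, 1, -l * i, -l * j])
        rhs.append(l)
    h = solve(rows, rhs) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_h(H, i, j):
    """Map (i, j) through H using homogeneous coordinates."""
    x, y, w = (H[r][0] * i + H[r][1] * j + H[r][2] for r in range(3))
    return x / w, y / w


# A-coordinates stored at one 2x2 cell of M, and that cell's B pixel indices.
src = [(0.2, 0.1), (0.0, 2.3), (1.9, 0.3), (2.5, 2.6)]
dst = [(0, 0), (0, 1), (1, 0), (1, 1)]
H = homography(src, dst)
```

    Here apply_h(H, i, j) would be the value written to N[i, j, :] for an integer [i, j] inside the quadrilateral.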

    The potentially expensive step here would be the search in step 1 for the 2x2 grid of A-coordinates in M that encircles [i, j]. A brute-force search would make this whole algorithm O(n*m) where n is the number of pixels in A, and m the number of pixels in B.

    To reduce this to O(n), one could instead run a scanline algorithm within each A-coordinate quadrilateral to identify all the integer-valued coordinates [i, j] it contains. This could be precomputed as a hashmap that maps integer-valued A coords [i, j] to the upper-left corner of its encircling quadrilateral's B coords [k, l].
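
    A rough sketch of that precomputation follows. For brevity it scans each quadrilateral's integer bounding box and keeps the points that pass a convex point-in-polygon test, rather than implementing a true scanline fill; it also assumes every quadrilateral is convex. The names inside_convex and quad_lookup are invented for this example.

```python
import math


def inside_convex(quad, p):
    """True if p lies inside or on the boundary of the convex polygon quad."""
    crosses = []
    for n in range(len(quad)):
        (ai, aj), (bi, bj) = quad[n], quad[(n + 1) % len(quad)]
        crosses.append((bi - ai) * (p[1] - aj) - (bj - aj) * (p[0] - ai))
    # Inside means p is on the same side of every edge.
    return all(c >= 0 for c in crosses) or all(c <= 0 for c in crosses)


def quad_lookup(M):
    """Map integer A coords (i, j) to the upper-left B index (k, l) of a
    quadrilateral of M that contains them."""
    lookup = {}
    for k in range(len(M) - 1):
        for l in range(len(M[0]) - 1):
            # Corners listed in order around the quadrilateral.
            quad = [M[k][l], M[k][l + 1], M[k + 1][l + 1], M[k + 1][l]]
            i_lo = math.ceil(min(c[0] for c in quad))
            i_hi = math.floor(max(c[0] for c in quad))
            j_lo = math.ceil(min(c[1] for c in quad))
            j_hi = math.floor(max(c[1] for c in quad))
            for i in range(i_lo, i_hi + 1):
                for j in range(j_lo, j_hi + 1):
                    if inside_convex(quad, (i, j)):
                        # Points on a shared edge keep the first quad found.
                        lookup.setdefault((i, j), (k, l))
    return lookup


# M upsamples by 2: B pixel (k, l) samples A at (2k, 2l).
M = [[(2.0 * k, 2.0 * l) for l in range(3)] for k in range(2)]
table = quad_lookup(M)
```

    The work is proportional to the total bounding-box area, so it stays close to O(n) only when the quadrilaterals are not badly skewed. Integer coordinates missing from the table are the holes that no quadrilateral covers.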
