In numpy, how to create an array of the indices of the elements in a source array as they are found in a destination array?

问题

I'm sure this question has been answered somewhere, but I just can't find the words to look for it.

I have these two arrays:

import numpy as np

src = np.array([[8, 1],
                [2, 4]]) 

dst = np.array([[1, 4],
                [8, 2]])

I would like to get this array:

indices = (np.array([[1, 0],
                     [1, 0]]),
           np.array([[0, 0],
                     [1, 1]]))

Such that dst[indices] gets me src.

Any ideas? Moreover, what is the kind of operation that I'm looking for called? So that I can search more about it by myself in the future.

回答1:

Here is what I believe is the "direct" way:

# find order of src and dst
so = src.ravel().argsort()
do = dst.ravel().argsort()
# allocate combined map
tot = np.empty_like(src)

# next line is all you need to remember
tot.ravel()[so] = do

# go back to 2D indexing
indices = np.unravel_index(tot,dst.shape)

# check
dst[indices]
# array([[8, 1],
#        [2, 4]])

indices
# (array([[1, 0],
#         [1, 0]]), array([[0, 0],
#         [1, 1]]))

回答2:

Tricky.

Unfortunately, numpy doesn't offer a way of anwering the question "Which indices should I use to transform one array into the other?"

It does, however, offer us a way to answer the question "Which indices should I use to transform an array into its sorted version?" in the form of argsort(), and we can exploit that. (inspired by @Mad_Physicist comment)

TL;DR

flat_row_indices, flat_column_indices = np.unravel_index(np.ravel(dst).argsort()[np.ravel(src).argsort().argsort()], src.shape)
indices = (flat_row_indices.reshape(src.shape), flat_column_indices.reshape(src.shape))

Explanation

The idea is to find the indices that transform dst to its sorted version (say s) and find the indices that transform s into src, and then compose those indices.

First, note that the shape of src and dst don't really matter, so I will first flatten both arrays with np.ravel().

The first one is simple:

dst_to_s = np.ravel(dst).argsort()  # Now `np.ravel(dst)[dst_to_s]` equals `s`

For the second, we can use a little trick:

s_to_src = np.ravel(src).argsort().argsort()  # Now `s[s_to_src]` equals `np.ravel(src)`

Combining them gives us a flattened version of what you're looking for:

dst_to_src = dst_to_s[s_to_src]  # Now `np.ravel(dst)[dst_to_src]` equals `np.ravel(src)`

Now it's only a matter of reshaping this answer:

flat_row_indices, flat_column_indices = np.unravel_index(dst_to_src, src.shape)
indices = (flat_row_indices.reshape(src.shape), flat_column_indices.reshape(src.shape))

Try it out:

np.all(dst[indices] == src)  # Returns `True`

回答3:

To get the position of every element in dst in the src indices space you can do the following:

import numpy as np

src = np.array([[8, 1],
                [2, 4]])

dst = np.array([[1, 4],
                [8, 2]])
in = np.array([np.where(x==src)[:] for x in np.nditer(dst)]).squeeze()

and in is:

[[0 1]
 [1 1]
 [0 0]
 [1 0]]

Where the first row means that 1 should be in row number 0 and column number 1 etc.

回答4:

My mind was getting completely overflowed while trying these argsorts and their inverses but then I realized that introducing some notations of permutations might help here.

The underlying part of this problem would be this sub-problem:

Let p be a permutation array such that x[p] = y. Let q be a permutation array such that y[q] = z. What is a permutation r such that x[r] = z?

And the answer is np.arange(len(x))[p][q]. Now apply this conclusion for case x=src, y=sorted(x), z=dst:

def find_permutation(src, dst):
    p = np.argsort(src)
    _, q = np.unique(dst, return_inverse=True)
    return np.arange(len(src))[p][q]

So it gives, for instance:

>>> find_permutation([1, 4, 8, 2], [8, 1, 2, 4])
[2 0 3 1]

A solution would be to find a backward permutation of flattened arrays and then reshape them back into initial structure:

src = np.array([[8, 1], [2, 4]])
dst = np.array([[1, 4], [8, 2]])
p = find_permutation(dst.flatten(), src.flatten())
x,y = src.shape
idx1, idx2 = np.divmod(p, y)
output = idx1.reshape(x,y), idx2.reshape(x,y)

Output:

(array([[1, 0], [1, 0]], dtype=int32), 
array([[0, 0], [1, 1]], dtype=int32))

来源：https://stackoverflow.com/questions/64279578/in-numpy-how-to-create-an-array-of-the-indices-of-the-elements-in-a-source-arra

标签

python

numpy