Computing Euclidean distance with multiple lists in Python

Question

I'm writing a simple program to compute the Euclidean distances between multiple lists using Python. This is the code I have so far:

import math
euclidean = 0
euclidean_list = []
euclidean_list_complete = []

test1 = [[0.0, 0.0, 0.0, 152.0, 12.29], [0.0, 0.0, 0.357, 245.0, 10.4], [0.0, 0.0, 0.10, 200.0, 11.0]]

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

for i in range(len(test2)):
    for j in range(len(test1)):
        for k in range(len(test1[0])):
            euclidean += pow((test2[i][k]-test1[j][k]),2)

        euclidean_list.append(math.sqrt(euclidean))
        euclidean = 0

    euclidean_list_complete.append(euclidean_list)


print euclidean_list_complete

My problem with this code is that it doesn't print the output I want. The output should be [[80.0023, 173.018, 128.014], [72.006, 165.002, 120.000]]

but instead, it prints

[[80.00232559119766, 173.01843095173416, 128.01413984400315, 72.00680592832875, 165.0028407300917, 120.00041666594329], [80.00232559119766, 173.01843095173416, 128.01413984400315, 72.00680592832875, 165.0028407300917, 120.00041666594329]]

I'm guessing it has something to do with the loop. What should I do to fix it? By the way, this is for learning purposes, so I don't want to use NumPy or SciPy.

In case it's unclear: I want to calculate the distance from each list in test2 to each list in test1.

Answer 1:

The question has partly been answered by @Evgeny. The answer the OP posted to their own question is an example of how not to write Python code. Here is a shorter, faster and more readable solution, assuming test1 and test2 are lists like the ones in the question:

def euclidean(v1, v2):
    return sum((p-q)**2 for p, q in zip(v1, v2)) ** .5

d2 = []
for i in test2:
    foo = [euclidean(i, j) for j in test1]
    d2.append(foo)


print(d2)
#[[80.00232559119766, 173.01843095173416, 128.01413984400315],
# [72.00680592832875, 165.0028407300917, 120.00041666594329]]
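If Python 3.8 or newer is available, the standard library's math.dist can replace the hand-written helper, which still avoids NumPy and SciPy. The following is only a sketch under that version assumption; the variable name distances and the rounding at print time are illustrative choices, not part of the answer above:

```python
import math  # math.dist requires Python 3.8 or newer

test1 = [[0.0, 0.0, 0.0, 152.0, 12.29],
         [0.0, 0.0, 0.357, 245.0, 10.4],
         [0.0, 0.0, 0.10, 200.0, 11.0]]
test2 = [[0.0, 0.0, 0.0, 72.0, 12.9],
         [0.0, 0.0, 0.0, 80.0, 11.3]]

# One row per vector in test2, one column per vector in test1.
distances = [[math.dist(a, b) for b in test1] for a in test2]

# Round only for display, so the stored values keep full precision.
for row in distances:
    print([round(d, 4) for d in row])
# [80.0023, 173.0184, 128.0141]
# [72.0068, 165.0028, 120.0004]
```

The nested comprehension builds a new inner list for every vector in test2, so the rows cannot end up sharing one list object.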

Answer 2:

I'm not sure what you are trying to achieve with three vectors, but for two the code can be much, much simpler:

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

def distance(list1, list2):
    """Distance between two vectors."""
    squares = [(p-q) ** 2 for p, q in zip(list1, list2)]
    return sum(squares) ** .5

d2 = distance(test2[0], test2[1])

With NumPy it is an even shorter statement.

PS. Python 3 is recommended.

Answer 3:

test1 = [[0.0, 0.0, 0.0, 152.0, 12.29], [0.0, 0.0, 0.357, 245.0, 10.4], [0.0, 0.0, 0.10, 200.0, 11.0]]

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

final_list = []

for a in test2:
    temp = [] #temporary list
    for b in test1:
        dis = sum([pow(a[i] - b[i], 2) for i in range(len(a))])
        temp.append(round(pow(dis, 0.5),4))

    final_list.append(temp)
print(final_list)

Answer 4:

I got it. The trick is to create the euclidean list inside the first for loop, and then delete it after appending it to the complete euclidean list:

import math
euclidean = 0

euclidean_list_complete = []

test1 = [[0.0, 0.0, 0.0, 152.0, 12.29], [0.0, 0.0, 0.357, 245.0, 10.4], [0.0, 0.0, 0.10, 200.0, 11.0]]

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

for i in range(len(test2)):
    euclidean_list = []
    for j in range(len(test1)):
        for k in range(len(test1[0])):
            euclidean += pow((test2[i][k]-test1[j][k]),2)      
        euclidean_list.append(math.sqrt(euclidean))
        euclidean = 0
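        # Note: this sort reorders each row from largest to smallest, so the
        # distances no longer follow the order of test1; drop it to keep that order.
        euclidean_list.sort(reverse=True)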
    euclidean_list_complete.append(euclidean_list)
    del euclidean_list

print(euclidean_list_complete)
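The reason creating the list inside the outer loop helps is that the question's code appends the same list object twice, so both rows show all six distances. The sketch below reproduces that effect with small integers instead of distances; the names shared, rows_shared, rows_fresh and row are made up for illustration:

```python
# One list object, created once and appended twice: both rows are the same object.
shared = []
rows_shared = []
for start in (0, 10):
    for offset in range(3):
        shared.append(start + offset)
    rows_shared.append(shared)

print(rows_shared)
# [[0, 1, 2, 10, 11, 12], [0, 1, 2, 10, 11, 12]]
print(rows_shared[0] is rows_shared[1])
# True

# A fresh list per outer iteration keeps the rows separate.
rows_fresh = []
for start in (0, 10):
    row = []
    for offset in range(3):
        row.append(start + offset)
    rows_fresh.append(row)

print(rows_fresh)
# [[0, 1, 2], [10, 11, 12]]
```

Rebinding the name to a new list at the top of each outer iteration is what separates the rows; the del statement afterwards only removes the name and does not affect the result.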

Source: https://stackoverflow.com/questions/50637446/computing-euclidean-distance-with-multiple-list-in-python

Tags

python

list

euclidean-distance