How to get a distance matrix using dynamic time warping?

Submitted by 隐身守侯 on 2020-06-17 09:32:08

Question


I have 6 time series, as follows.

import numpy as np
series = np.array([
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1],
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1]])

Suppose I want to get the dynamic time warping distance matrix in order to perform clustering. I used the dtaidistance library for that, as follows.

from dtaidistance import dtw
ds = dtw.distance_matrix_fast(series)

The output I got was as follows.

array([[       inf, 1.41421356, 2.23606798, 0.        , 1.41421356, 2.23606798],
       [       inf,        inf, 1.73205081, 1.41421356, 0.        , 1.73205081],
       [       inf,        inf,        inf, 2.23606798, 1.73205081, 0.        ],
       [       inf,        inf,        inf,        inf, 1.41421356, 2.23606798],
       [       inf,        inf,        inf,        inf,        inf, 1.73205081],
       [       inf,        inf,        inf,        inf,        inf,        inf]])

It seems to me that the output I get is wrong. For instance, as I understand it, the diagonal values of the output should be 0 (since each series matches itself perfectly).

I want to know what I am doing wrong and how to fix it. I am also happy to get answers that use other Python libraries.

I am happy to provide more details if needed.


Answer 1:


Everything is correct. As per the docs:

The result is stored in a matrix representation. Since only the upper triangular matrix is required this representation uses more memory then necessary.

All diagonal elements are 0, and the lower triangular matrix is the same as the upper triangular matrix mirrored at the diagonal. Since all of these values can be deduced from the upper triangular matrix, they aren't shown in the output.
You can even use the compact=True argument to get only the values from the upper triangular matrix, concatenated into a 1D array.
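To see why the diagonal is 0 and the matrix is symmetric, here is a minimal sketch of a textbook DTW distance in plain NumPy. It is not the dtaidistance implementation; the function name dtw_distance and the choice of squared differences with a final square root are assumptions made for illustration.

```python
import numpy as np

def dtw_distance(a, b):
    """Textbook DTW: squared pointwise cost, square root of the total at the end."""
    n, m = len(a), len(b)
    # cost[i, j] = cheapest warping cost aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of the three allowed predecessor alignments
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))

# The first two series from the question
s0 = [0., 0, 1, 2, 1, 0, 1, 0, 0]
s1 = [0., 1, 2, 0, 0, 0, 0, 0, 0]

print(dtw_distance(s0, s0))                          # a series compared with itself
print(dtw_distance(s0, s1), dtw_distance(s1, s0))    # both argument orders
```

A series compared with itself costs nothing, and swapping the arguments does not change the result, which is why only the upper triangle needs to be computed.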

You can convert the result to a full matrix like this:

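ds[ds == np.inf] = 0  # replace the inf placeholders (diagonal and lower triangle) with 0
ds += ds.T            # mirror the upper triangle into the lower one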
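If you would rather keep the original result untouched, a non-mutating variant could look like the sketch below. The matrix is typed in from the output shown in the question rather than computed, and the helper name to_full_matrix is hypothetical.

```python
import numpy as np

inf = np.inf
# Upper-triangular result copied from the question (inf marks cells that were not filled in)
ds = np.array([
    [inf, 1.41421356, 2.23606798, 0.0, 1.41421356, 2.23606798],
    [inf, inf, 1.73205081, 1.41421356, 0.0, 1.73205081],
    [inf, inf, inf, 2.23606798, 1.73205081, 0.0],
    [inf, inf, inf, inf, 1.41421356, 2.23606798],
    [inf, inf, inf, inf, inf, 1.73205081],
    [inf, inf, inf, inf, inf, inf],
])

def to_full_matrix(upper):
    """Return a symmetric copy with a zero diagonal; `upper` itself is not modified."""
    filled = np.where(np.isinf(upper), 0.0, upper)  # inf placeholders become 0
    return filled + filled.T                        # mirror the upper triangle

full = to_full_matrix(ds)
print(full)
```

The result is symmetric with zeros on the diagonal, which is the square form that clustering routines usually expect.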



Answer 2:


In dtw.py, the default value for elements of the distance matrix is np.inf. Since the function only computes the pairwise distance between different sequences, and each pair only once, the diagonal and the lower triangle are never filled in and keep their np.inf values.

Try running with dtw.distance_matrix_fast(series, compact=True) to prevent seeing this filler information.
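If you later need a square matrix again, the compact array can be expanded with plain NumPy. This is a sketch under one assumption: that the compact array lists the upper triangle row by row (pairs (0,1), (0,2), ..., (4,5)). The values below are typed in from the question's output rather than taken from an actual compact=True call, and compact_to_full is a hypothetical helper name.

```python
import numpy as np

def compact_to_full(compact, n):
    """Expand a row-by-row upper-triangle vector into a symmetric n x n matrix."""
    full = np.zeros((n, n))                 # diagonal stays 0
    rows, cols = np.triu_indices(n, k=1)    # index pairs above the diagonal, row-major
    full[rows, cols] = compact
    full[cols, rows] = compact              # mirror into the lower triangle
    return full

# Assumed compact layout for the 6 series in the question: 6 * 5 / 2 = 15 values
compact = np.array([
    1.41421356, 2.23606798, 0.0, 1.41421356, 2.23606798,
    1.73205081, 1.41421356, 0.0, 1.73205081,
    2.23606798, 1.73205081, 0.0,
    1.41421356, 2.23606798,
    1.73205081,
])

full = compact_to_full(compact, 6)
print(full)
```

It is worth checking the ordering against the dtaidistance documentation for your installed version before relying on this.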



Source: https://stackoverflow.com/questions/62211066/how-to-get-distance-matrix-using-dynamic-time-wraping
