Clustering multivariate time series - question regarding distance matrix

Posted by 放肆的年华 on 2021-02-11 13:36:49

Question


I am trying to cluster meteorological stations using R. The stations provide hourly data such as temperature, wind speed, humidity and a few other variables. I can easily cluster univariate time series using the tsclust function (from the dtwclust package), but when I cluster multivariate series I get errors.

My data is a list in which each element is a matrix holding the time series of one station (columns are variables and rows are timestamps).

If I run:

tsclust(data, k = 2,
        distance = 'Euclidean', seed = 3247, trace = TRUE)

I get the error: Error in do.call(.External, c(list(CFUN, x, y, pairwise, if (!is.function(method)) get(method) else method), : not a scalar return value

I get the same error if I try to calculate only the distance matrix using

dist(data, method="euclidean")

Maybe Euclidean distance cannot be calculated for such data? If so, which distances can be calculated?


Answer 1:


You can probably still use Euclidean distance.

You just have to implement it yourself, because the standard method only works for vectors, not for matrices. That should be trivial to do.
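A minimal sketch of that idea in base R, assuming every station matrix has the same dimensions. The function names mat_euclid and pairwise_dist and the toy data are made up for illustration, and hierarchical clustering is used here only as one way to consume the resulting distance matrix:

```r
# Euclidean (Frobenius) distance between two equally sized matrices.
mat_euclid <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)))
  sqrt(sum((a - b)^2))
}

# Pairwise distances over a list of matrices, returned as a "dist" object.
pairwise_dist <- function(series) {
  n <- length(series)
  d <- matrix(0, n, n, dimnames = list(names(series), names(series)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- mat_euclid(series[[i]], series[[j]])
    }
  }
  as.dist(d)
}

# Toy data (hypothetical): 4 stations, 24 hourly rows, 3 variables each.
set.seed(1)
stations <- lapply(1:4, function(i) matrix(rnorm(24 * 3), ncol = 3))
names(stations) <- paste0("station", 1:4)

d <- pairwise_dist(stations)
hc <- hclust(d, method = "average")
cutree(hc, k = 2)
```

Any clustering routine that accepts a precomputed distance matrix could be used in place of hclust.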

You'll likely run into scaling problems, though, if your variables have different units and magnitudes.
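One possible way to handle this, sketched in base R under the assumption that all stations share the same columns: standardize each variable with the mean and standard deviation pooled over all stations, so that no variable dominates the distance just because of its units. The toy data and variable names are hypothetical, and whether pooled or per-station scaling is appropriate depends on what you want the clusters to reflect.

```r
# Toy data (hypothetical): three variables on very different scales.
set.seed(2)
stations <- lapply(1:3, function(i) {
  cbind(temp = rnorm(24, 15, 5),
        wind = rnorm(24, 4, 2),
        hum  = rnorm(24, 70, 10))
})

# Per-variable mean and standard deviation pooled over all stations.
pooled <- do.call(rbind, stations)
mu     <- colMeans(pooled)
sigma  <- apply(pooled, 2, sd)

# Standardize every station with the same pooled statistics.
scaled <- lapply(stations, function(m) scale(m, center = mu, scale = sigma))
```

Because the same center and scale are applied to every station, level differences between stations are kept, while the variables themselves become comparable.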




Answer 2:


If your series have the same length, you could just flatten each matrix into a vector, cluster the vectors, and then restore the matrix dimensions afterwards. However, as Anony-Mousse mentioned, using Euclidean distance with variables that have different scales could be problematic, so consider normalizing with zscore:

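library(dtwclust)

# z-normalize the series
series <- zscore(data)
# flatten each matrix into a vector so that Euclidean distance can be computed
pc <- tsclust(lapply(series, as.vector), distance="Euclidean", seed=3247L, trace=TRUE)

# put the normalized matrix series back into the result
pc@datalist <- series
# replace ncol with the actual number of columns from your data
pc@centroids <- lapply(pc@centroids, matrix, ncol=3L)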
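As a quick sanity check of why flattening works, here is a small base R sketch with made-up data: the Euclidean distance between two flattened matrices equals the square root of the summed squared element-wise differences of the matrices, and reshaping with the right ncol restores the original matrix because R stores matrices in column-major order.

```r
# Toy data (hypothetical): two series with 24 rows and 3 variables.
set.seed(3)
a <- matrix(rnorm(24 * 3), ncol = 3)
b <- matrix(rnorm(24 * 3), ncol = 3)

# Distance computed directly on the matrices.
d_matrix <- sqrt(sum((a - b)^2))

# Distance computed on the flattened vectors.
d_vector <- as.numeric(stats::dist(rbind(as.vector(a), as.vector(b))))

same_distance <- isTRUE(all.equal(d_matrix, d_vector))

# Flattening and reshaping gives back the original matrix.
round_trip <- identical(matrix(as.vector(a), ncol = 3), a)
```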


Source: https://stackoverflow.com/questions/55841627/clustering-multivariate-time-series-question-regarding-distance-matrix
