Recommended anomaly detection technique for simple, one-dimensional scenario?

无人共我 2020-12-07 14:45

I have several thousand instances of data, each represented as a single integer value. I want to be able to detect when an instance is

3 Answers
  •  没有蜡笔的小新
    2020-12-07 15:34

    There are a variety of clustering techniques you could use to identify central tendencies within your data. One algorithm we used heavily in my pattern recognition course was K-Means. It would let you identify whether there is more than one related set of data, as in a bimodal distribution. It does require some knowledge of how many clusters to expect, but it is fairly efficient and easy to implement.

    After you have the means you could then try to find out if any point is far from any of the means. You can define 'far' however you want but I would recommend the suggestions by @Amro as a good starting point.

    For a more in-depth discussion of clustering algorithms, refer to the Wikipedia entry on clustering.
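
A minimal sketch of the two steps described in this answer, in case it helps: run K-Means on the integers, then flag any point that lies far from its nearest mean. Everything named here is an assumption for illustration: the helper names `kmeans_1d` and `find_outliers`, the evenly spread starting centroids, and in particular the definition of 'far' as more than `n_sigmas` standard deviations of the point's own cluster. The answer deliberately leaves that definition open, so swap in whatever rule suits your data.

```python
from statistics import mean, pstdev


def nearest_index(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on scalars; returns k centroids in ascending order."""
    ordered = sorted(values)
    # Assumed initialisation: spread the starting centroids evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest_index(v, centroids)].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments have stopped changing
        centroids = updated
    return sorted(centroids)


def find_outliers(values, k=2, n_sigmas=3.0):
    """Values lying more than n_sigmas cluster standard deviations from their nearest mean."""
    centroids = kmeans_1d(values, k)
    clusters = [[] for _ in range(k)]
    for v in values:
        clusters[nearest_index(v, centroids)].append(v)
    outliers = []
    for centre, members in zip(centroids, clusters):
        spread = pstdev(members) if len(members) > 1 else 0.0
        for v in members:
            # Assumed definition of 'far'; a zero spread means the cluster has no variation.
            if spread > 0 and abs(v - centre) > n_sigmas * spread:
                outliers.append(v)
    return sorted(outliers)


if __name__ == "__main__":
    # Made-up bimodal sample (modes near 10 and 100) with one stray value.
    sample = [8, 9, 10, 11, 12] * 4 + [98, 99, 100, 101, 102] * 4 + [40]
    print(find_outliers(sample, k=2))  # prints [40]
```

Note that a stray value is still assigned to a cluster and so inflates that cluster's standard deviation, which can hide milder anomalies. A more robust spread measure, or a fixed distance threshold, may work better depending on the data.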
