AWS CloudWatch Heartbeat Alarm

Submitted by 醉酒当歌 on 2019-12-22 02:07:26

Question


I have an app that puts a custom CloudWatch metric to AWS every minute. This is supposed to act as a heartbeat so I know the app is alive.

Now I want to put an alarm on this metric to notify me if the heartbeat stops. I have tried to accomplish this using different CloudWatch alarm statistics, including "average" and "data samples", and setting an alarm threshold of less than 1 over a given period. However, in all cases, if my app dies and stops reporting the heartbeat, the alarm only goes into the "Insufficient Data" state and never into the "Alarm" state.

I understand I can put a notification on the "Insufficient Data" state, but I want this to show up as an alarm. Is this possible in CloudWatch?

Thanks,

Matt


Answer 1:


Instead of pushing a custom metric to CloudWatch, consider this:

Push a message onto an SNS topic on the same periodic basis as before, and set up a CloudWatch alarm on the SNS topic's NumberOfMessagesPublished metric. If the number of heartbeats falls below the expected value for the time period you specify, whether it's because the app crashed or the server crashed, the alarm will go into the Alarm state.
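The publishing side of this suggestion could look roughly like the sketch below. It assumes boto3 is installed and credentials are configured. The topic ARN, the message layout, and the function names are hypothetical and chosen only for illustration.

```python
import json
import time

# Hypothetical values for illustration; substitute your own.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:app-heartbeat"
INTERVAL_SECONDS = 60


def build_publish_kwargs(topic_arn, app_name, timestamp):
    """Build the keyword arguments for one heartbeat publish call."""
    return {
        "TopicArn": topic_arn,
        "Subject": "heartbeat",
        "Message": json.dumps({"app": app_name, "ts": int(timestamp)}),
    }


def run_heartbeat(app_name, beats=None):
    """Publish one heartbeat every INTERVAL_SECONDS; beats=None loops forever."""
    import boto3  # imported lazily so the helper above works without boto3

    sns = boto3.client("sns")
    sent = 0
    while beats is None or sent < beats:
        sns.publish(**build_publish_kwargs(TOPIC_ARN, app_name, time.time()))
        sent += 1
        time.sleep(INTERVAL_SECONDS)
```

Calling `run_heartbeat("my-app")` from a background thread or a separate process keeps the heartbeat independent of the app's main work loop.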




Answer 2:


I think the alarm goes into the "Insufficient Data" state because of how missing data is handled. As the docs state:

Similar to how each alarm is always in one of three states, each specific data point reported to CloudWatch falls under one of three categories:

  • Not breaching (within the threshold)
  • Breaching (violating the threshold)
  • Missing

You can specify how alarms handle missing data points. Choose whether to treat missing data points as:

  • missing (The alarm looks back farther in time to find additional data points)
  • notBreaching (Treated as a data point that is within the threshold)
  • breaching (Treated as a data point that is breaching the threshold)
  • ignore (The current alarm state is maintained)

The default behavior is missing.

So I guess that treating missing data points as breaching would do the trick :)
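As a minimal sketch of that setting, the alarm could be created with boto3's `put_metric_alarm`, passing `TreatMissingData="breaching"`. The namespace, metric name, alarm name, and notification topic ARN used here are hypothetical, and the period and evaluation count are assumptions to tune for your heartbeat interval.

```python
def heartbeat_alarm_kwargs(namespace, metric_name, notify_topic_arn,
                           period_seconds=60, evaluation_periods=3):
    """Build put_metric_alarm arguments for a heartbeat (dead man's switch) alarm."""
    return {
        "AlarmName": f"{metric_name}-missing",  # hypothetical naming scheme
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "SampleCount",
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        # Periods with no heartbeat count as breaching, so the alarm goes
        # to ALARM instead of staying in INSUFFICIENT_DATA.
        "TreatMissingData": "breaching",
        "AlarmActions": [notify_topic_arn],
    }


def create_heartbeat_alarm(namespace, metric_name, notify_topic_arn):
    """Create or update the alarm (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(
        **heartbeat_alarm_kwargs(namespace, metric_name, notify_topic_arn)
    )
```

With these assumed defaults, three consecutive one-minute periods without a data point would trip the alarm; a smaller `evaluation_periods` reacts faster but is more sensitive to a single delayed heartbeat.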




Answer 3:


Configure the alarm to treat missing data as breaching the threshold (step 4).

See: https://cloudonaut.io/dead-mans-switch-with-cloudwatch/



Source: https://stackoverflow.com/questions/31573565/aws-cloudwatch-heartbeat-alarm
