Introduction
Scalyr Alerts have a number of helpful features that can be used to test and diagnose missing alert notifications. This article covers some of the best practices we use.
Testing
Instead of waiting for your alert to trigger, you can test its notifications immediately by setting the trigger to the boolean value true.
After 1-2 minutes, you should begin to see notification activity.
Note: If you are using the #lastLogLines# token, no log lines will be displayed with a trigger of true, since no logs were actually processed.
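As a minimal sketch of this test, assume an alert can be represented as a dictionary with trigger and description fields. The field names, the sample trigger expression, and the use of the string "true" rather than a literal boolean are all illustrative assumptions, so check them against your own alerts configuration. The hypothetical helper below builds a temporary copy whose trigger always fires and leaves the original definition untouched:

```python
import copy


def make_test_alert(alert):
    """Return a copy of `alert` whose trigger always evaluates to true."""
    test_alert = copy.deepcopy(alert)
    # Assumption: the trigger is stored as text, so "true" is written as a string.
    test_alert["trigger"] = "true"
    # Mark the copy so test notifications are easy to tell apart from real ones.
    test_alert["description"] = "[TEST] " + alert.get("description", "")
    return test_alert


# Hypothetical alert definition, used only for illustration.
original = {
    "trigger": "count:5m(error) > 10",
    "description": "Too many errors",
}

test_alert = make_test_alert(original)
```

Keeping the original definition unchanged makes it easy to restore the real trigger once the notification path has been confirmed.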
Confirm your alert's status
Check for webhook errors
In our experience, configuring webhooks with JSON payloads can require more initial effort because of their formatting requirements. One way to quickly spot a problem is to search for $tag='webhookError'.
If a faulty payload is causing your notifications to be rejected, you'll see log events similar to:
In the above example, I realized that I had inadvertently double-escaped (\\\") my JSON payload in the alerts configuration, which caused the payload error and meant that no notifications were sent to Slack.
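One way to catch this kind of escaping mistake before it reaches the alerts configuration is to check that the payload parses as JSON. The sketch below uses Python's standard json module. The check_payload helper and both sample payloads are hypothetical and are not taken from an actual alert:

```python
import json


def check_payload(raw):
    """Return (ok, message) describing whether `raw` parses as JSON."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as err:
        return False, f"invalid JSON: {err.msg} at position {err.pos}"
    return True, "valid JSON"


# A correctly formatted payload.
good = '{"text": "Alert triggered"}'

# A double-escaped payload: the quotes carry literal backslashes,
# so the text is no longer a valid JSON object.
bad = r'{\"text\": \"Alert triggered\"}'

good_result = check_payload(good)
bad_result = check_payload(bad)
```

Here good_result reports success, while bad_result reports a parse failure at the first stray backslash. This only confirms that the payload is well-formed JSON. It does not confirm that the receiving service accepts its fields.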
Review the alert's current status
If you're testing an alert, you can review its status by searching for $tag='alertState', and see when it was triggered by searching for $tag='alertStateChange'.
In rare instances, a new alert may not trigger within the usual 2-3 minutes because its histograms are still being created. Once these histograms have been generated, the alert will function as expected.
Searching for $tag='alertState' is a good way to investigate a late or missing notification. If no matching log lines are returned, there's a good chance that histogram generation is the underlying cause.
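The checks described in this article can be combined into a simple triage order. The sketch below is only an illustration of that reasoning: the tag_filter and diagnose helpers are hypothetical, and the event counts are assumed to come from running each search yourself over the period in question.

```python
def tag_filter(tag):
    """Build the search expression for a given tag, e.g. $tag='alertState'."""
    return f"$tag='{tag}'"


def diagnose(alert_state_count, state_change_count, webhook_error_count):
    """Suggest a likely cause from the number of events each search returned."""
    if webhook_error_count > 0:
        return "Webhook errors were logged: review the payload formatting."
    if alert_state_count == 0:
        return "No alertState events: histograms may still be generating."
    if state_change_count == 0:
        return "The alert is being evaluated but has not changed state."
    return "The alert changed state: check the notification destination."


searches = [tag_filter(t) for t in ("webhookError", "alertState", "alertStateChange")]
suggestion = diagnose(alert_state_count=0, state_change_count=0, webhook_error_count=0)
```

Webhook errors are checked first because they point to a concrete fix, whereas a missing alertState event usually just means waiting for histogram generation to finish.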