Using Event Correlation to Automatically Raise Downed Server Alert Severity
Downed Web Server, but Operators Do Not Notice without Event Correlation
Your network operations center is swamped. Your operators are so busy deploying hundreds of virtual systems for a new site that they have not noticed a downed web server alert that now has more than 200 associated events. That server has been down for quite a while, and it would be nice to get it back up and running before your web-space clients start flooding your Zendesk with batches of associated tickets! Wouldn’t it be nice if you could write a rule that would automatically change the severity of that downed server alert to CRITICAL so your operators would take notice? You can with RightITnow ECM correlation rules!
Define the Downed Web Server Correlation Rule Conditions
Using the RightITnow ECM Correlation Rule Builder, you can instruct the RightITnow ECM correlation engine to match all events with a count greater than 10 whose description contains the words “Web Server Down.” Once you have collected these events, you can act on the resulting alert.
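To make the condition concrete, here is a minimal Python sketch of the same logic. It is not the RightITnow ECM API or rule syntax; the `Alert` class, its fields, and the `matches_downed_web_server` function are hypothetical names invented for illustration, and the plain case-sensitive substring match is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert record; not an actual RightITnow ECM type."""
    description: str
    event_count: int


def matches_downed_web_server(alert, threshold=10, phrase="Web Server Down"):
    """Return True when the count exceeds the threshold and the phrase appears.

    Assumption: a plain, case-sensitive substring match on the description.
    """
    return alert.event_count > threshold and phrase in alert.description


# Example: an alert that has collected 200 events qualifies.
busy_alert = Alert(description="Web Server Down on host web-01", event_count=200)
print(matches_downed_web_server(busy_alert))
```

Note that the comparison is strictly "greater than 10," so an alert with exactly 10 events does not qualify.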
Set the Downed Web Server Alert Correlation Action
So, what do you want to do with all of these downed web server events? You want to make sure that the resulting alert is assigned a very high severity so your operators will jump in fast to resolve the issue. You can use the Correlation Rule Builder to trigger the Set to Critical action, which automatically sets the severity of any qualifying alert to critical. Now that alert lights up in bold red on your alert console, crying out for some operator attention and intervention.
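As a rough illustration of pairing a condition with an action, the sketch below applies a "set to critical" step to any qualifying alert. Everything here is an assumption made for the example: the `Alert` class, the `set_to_critical` and `apply_rule` functions, and the default severity value are hypothetical and do not come from RightITnow ECM.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert record; not an actual RightITnow ECM type."""
    description: str
    event_count: int
    severity: str = "UNSET"  # assumed placeholder for "no severity assigned yet"


def set_to_critical(alert):
    """Stand-in for the Set to Critical action described above."""
    alert.severity = "CRITICAL"


def apply_rule(alert, threshold=10, phrase="Web Server Down"):
    """Run the action only when the rule condition holds."""
    if alert.event_count > threshold and phrase in alert.description:
        set_to_critical(alert)
    return alert


# Example: 200 downed web server events push the alert to CRITICAL.
alert = apply_rule(Alert("Web Server Down on host web-01", 200))
print(alert.severity)
```

Alerts that do not meet the condition are returned unchanged, so the action never touches unrelated alerts.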
As Long as We Have Focus on this Event, Let’s Do More!
You have built a correlation rule that automatically and reliably collects downed server events into an associated downed server alert and sets the severity to CRITICAL. As long as you have focus on this alert, you can do more with the correlation rule than just raise the severity. You can use the RightITnow ECM Correlation Rule Builder to check whether related alerts exist. For example, is my mail server also down? You can also instruct the rule to perform an action if the condition is NOT true. For example, if you are getting downed server events, but not more than 10, you can trigger the Set to Medium Priority action to set the alert severity to medium instead of CRITICAL.
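The two extensions described above, an action for the NOT-true case and a check for related alerts, might look like this in a short Python sketch. It is only an illustration under stated assumptions: the `Alert` class, the `correlate` function, and the “Mail Server Down” phrase are hypothetical and are not taken from RightITnow ECM.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert record; not an actual RightITnow ECM type."""
    description: str
    event_count: int
    severity: str = "UNSET"  # assumed placeholder for "no severity assigned yet"


def correlate(alert, open_alerts, threshold=10):
    """Set severity for a downed web server alert and look for a related alert.

    Returns the alert and a flag saying whether a mail server alert is also open.
    """
    if "Web Server Down" not in alert.description:
        return alert, False  # not a downed web server alert; leave it alone

    # More than `threshold` events: CRITICAL. Otherwise fall back to MEDIUM.
    alert.severity = "CRITICAL" if alert.event_count > threshold else "MEDIUM"

    # Related-alert check; the phrase below is an assumed example.
    mail_also_down = any("Mail Server Down" in a.description for a in open_alerts)
    return alert, mail_also_down


# Example: only 4 events so far, and a mail server alert is also open.
others = [Alert("Mail Server Down on host mail-01", 3)]
alert, mail_also_down = correlate(Alert("Web Server Down on host web-01", 4), others)
print(alert.severity, mail_also_down)
```

Keeping the fallback in the same rule means a downed web server alert always receives some severity, rather than waiting unclassified until the count passes the threshold.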