High Volume Event Processing

Setting Up Your Environment to Process High Event Volumes

First things first! A carefully planned IT Ops environment is the foundation for handling high event volumes. RightITnow ECM makes it easy to configure display settings, connectors, and users. ECM also lets you organize entities into entity groups for more efficient maintenance, monitoring, and flexible reporting. Finally, create actions that ECM triggers automatically when it encounters conditions you define. This is a way to resolve issues before they reach the help desk. You can also add actions to the Alert Context menu, which is available when you right-click an alert on the Alerts tab.

Deduplicate Events to Reduce Alerts You Need to Address

You can use ECM to create categorization rules that determine whether an incoming event should be deduplicated into an existing alert or become a new alert. This reduces a high event volume to something much more manageable. Instead of several hundred alerts all indicating that you have a full disk on your mail server, you need just one, with all the others deduplicated into it.
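The deduplication idea can be sketched in a few lines of Python. This is an illustration only, not ECM's implementation: the Alert class, the dedup_key function, and the choice of entity plus message as the matching key are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    entity: str
    message: str
    event_count: int = 1


def dedup_key(event: dict) -> tuple:
    # Hypothetical categorization rule: events with the same entity and
    # the same message belong to the same alert.
    return (event["entity"], event["message"])


def ingest(events: list) -> list:
    """Fold a stream of events into alerts, counting duplicates."""
    alerts = {}
    for event in events:
        key = dedup_key(event)
        if key in alerts:
            alerts[key].event_count += 1  # deduplicate into the existing alert
        else:
            alerts[key] = Alert(event["entity"], event["message"])
    return list(alerts.values())


events = [{"entity": "mail01", "message": "disk full"}] * 300
events.append({"entity": "web01", "message": "high load"})
alerts = ingest(events)
```

Here 301 incoming events collapse into two alerts, and the mail server alert carries the count of events folded into it.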

Take Automatic, Resolving Action with Correlation Rules

ECM’s Correlations tab helps you create rules that trigger actions when RightITnow ECM encounters conditions you define. For example, RightITnow ECM may request a new polling event via a linked monitoring system to check the status of a device that has been inactive for some period of time. Based on the results, it can raise or lower the priority of the alert, email a supervisor, or escalate the alert to the Service Desk as an incident for deeper troubleshooting. Removing open alerts when a problem-solution pair is encountered is another example of IT operations process automation.

Use the Alerts Console and Dashboards to Understand Your High Volume Alert Data

ECM offers the Alerts Console as a single pane of glass for all the alerts pouring in from internal and external systems. This gives you a clear look at what you're facing and also enables you to take action on these alerts. Additionally, ECM offers many powerful, customizable dashboards that slice, dice and analyze your high-volume alert data in the way that is most efficient for you. See http://www.rightitnow.com/operations-management/it-operations-management-dashboards/ for more on our dashboards.

SLA Processing

Automated SLA Processing Begins with Detailed Information

RightITnow ECM now stores detailed user information, including each user's work hours and time zone. User or user group work hours are accessible while defining conditions on both Correlation and SLA rules. All alert timestamps are now stored in the local time zone of the server on which ECM is deployed and are converted to the user's own time zone for display in the UI, because users are often spread across the globe and work in different time zones. Lastly, an availability indicator appears next to available assignees when you assign alerts via the Alerts console.
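A minimal sketch of that time handling, using only the Python standard library, might look like the following. The server offset, the 9-to-5 work hours, and the function names are assumptions for illustration. Fixed UTC offsets are used to keep the example self-contained, where a real deployment would more likely use named time zones.

```python
from datetime import datetime, time, timedelta, timezone

# Assumed server offset (UTC-5); a real system would read this from config.
SERVER_TZ = timezone(timedelta(hours=-5))


def to_user_time(stored: datetime, user_tz: timezone) -> datetime:
    """Interpret a naive stored timestamp as server time, then convert it."""
    return stored.replace(tzinfo=SERVER_TZ).astimezone(user_tz)


def within_work_hours(local: datetime, start=time(9, 0), end=time(17, 0)) -> bool:
    """Hypothetical availability check against a user's work hours."""
    return start <= local.time() < end


paris = timezone(timedelta(hours=1))  # assumed user offset (UTC+1)
shown = to_user_time(datetime(2024, 1, 15, 9, 30), paris)
available = within_work_hours(shown)
```

An alert stored at 09:30 server time is displayed as 15:30 for the user at UTC+1, which still falls inside the assumed work hours.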

Customizable SLA Rules and Scheduling to Match Customer Needs

ECM provides a robust SLA rule engine that allows for highly customizable SLA rules and corresponding escalation steps. Critical alerts coming off your Payroll server can be processed immediately and assigned to your Application team, while the key constituents are informed via SMS or email. If the situation were to stagnate, you can automatically upgrade the priority and reassign these alerts to a SWAT IT team while informing a wider set of managers.
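The Payroll example can be pictured as an ordered list of escalation steps. The sketch below is hypothetical: the 60-minute threshold, the field names, and the convention that priority 1 is the highest are assumptions, and notifications are returned as a list rather than actually sent.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    source: str
    severity: str
    age_minutes: int = 0
    priority: int = 3
    assigned_group: str = ""


# (minutes the alert has been open, new priority, group, who to notify)
ESCALATION_STEPS = [
    (0, 2, "Application team", ["key constituents"]),
    (60, 1, "SWAT IT team", ["key constituents", "managers"]),
]


def apply_sla(alert: Alert) -> list:
    """Apply every escalation step the alert has reached; return who to notify."""
    notify = []
    if alert.source != "payroll" or alert.severity != "critical":
        return notify  # the rule only covers critical Payroll alerts
    for after, priority, group, recipients in ESCALATION_STEPS:
        if alert.age_minutes >= after:
            alert.priority = priority
            alert.assigned_group = group
            notify = recipients
    return notify


fresh = Alert("payroll", "critical", age_minutes=5)
stale = Alert("payroll", "critical", age_minutes=90)
fresh_notify = apply_sla(fresh)
stale_notify = apply_sla(stale)
```

The fresh alert goes to the Application team, while the one that has stagnated for 90 minutes is upgraded and reassigned with a wider notification list.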

SLA Rules that Take Decisive Action

You can create SLA rules that take resolving actions on offending alerts. For example, you can now use the Merge Alerts action as part of SLA rules to merge alerts into logical units you can address all at once. You can also create SLA rules that assign alerts to user groups to get many minds addressing the SLA breach at once. Once ECM makes the assignment, the Assigned Group field persists across the Actions, Correlations, Alert Console, Export/Reports, and SLA modules, allowing you to apply the concept of user groups to actions, correlations, alert manipulation, reports and SLA rules.

Knowledge and Maintenance of Your SLA Data is Key

You can use the Reports utility to generate scheduled reports on SLA breached alerts, greatly expanding on the information available in the SLA Breach Log. And to manage legacy SLA data, the Purge Utility can purge event records for alerts and SLA breaches in addition to closed alerts and audit records. Additionally, you can configure a stale alert warning for when SLA rules perform actions on alerts that have changed before the user refreshes the alerts console.

IT Operations Management Dashboards

Multiple IT Operations Management Dashboards on a Single Pane of Glass

ECM’s Dashboard tab provides you with a visual depiction of your IT Operations Management world. At a single glance, you can get a visual representation of the most recent alerts, historical trends, maintenance windows, system health, alert distribution, event processing distribution, event processing historical trends, correlations historical trends, and alert priority. You can even add a window that displays a useful website on the dashboard.

Add, Subtract and Rearrange Management Dashboards to Your Needs

Your environment is unique and you can tailor a suite of displets on the dashboard that make sense for you. You are not locked in to displaying all the dashboard displets. You pick and choose what you want. You may not need a web page displet, so, just delete it. You may have some new admins on board, so, display the Getting Started displet to shepherd them on their way to effective ECM usage.

Configure Management Dashboards to Your Needs

Each management dashboard displet offers an Edit feature that you use to tailor it to your specific needs. For example, you can configure the System Health displet to display alerts against entity groups, services, or users. You can also specify which alert states to display, how to sort the data, and the refresh rate. Display several views of your data center side by side by selecting user-defined filters.

Create Multiple Management Dashboards and Link Them

You can create multiple dashboards and keep them to yourself, or share them with specific groups. You can save a dashboard configuration and link it to the current dashboard. The saved configuration reflects the current settings and lets you change the visibility, order and size of grid columns on the Alerts displet, as well as the sorting, freezing and grouping of its grid. You can also set a Default Dashboard setting to create a default dashboard for all users, so that everyone is on the same single pane of glass when observing your IT Operations Management world.

Automatically Managing Alerts and Processing Events

Multiple Correlation Rule Types to Manage Alerts and Process Events

You can take actions on alerts automatically using ECM’s powerful correlation rules feature. You create the correlation rules needed by your enterprise and ECM automatically executes an action on an alert after evaluating a correlation rule as true against that alert.

First Step in Managing Alerts and Processing Events with Correlation Rules

After ECM creates an alert, it enriches the alert with the following types of correlation rules:
Maintenance rule – Sets the maintenance field to TRUE if the alert’s entity is in maintenance. You can add an additional action or action group to this default behavior, for example, sending an email informing the supervisor of the maintenance period. You may also deduplicate against alerts that are not in maintenance.

Close Maintenance rule – Dynamically closes a maintenance window based on a condition from an incoming alert.
Tag rules – Sets the Tags column in the Alerts table to the value specified in the rule. For example, if the message contains the word “postfix” or “sendmail,” then set the Tags column value to “email.”
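
The tag rule example above can be sketched as a keyword match. This is not ECM's rule syntax: the TAG_RULES structure and the apply_tag_rules function are hypothetical, and case-insensitive matching is an assumption.

```python
# Each rule pairs a set of keywords with the tag to apply (hypothetical format).
TAG_RULES = [
    (("postfix", "sendmail"), "email"),
]


def apply_tag_rules(message: str, rules=TAG_RULES) -> list:
    """Return the tags of every rule whose keywords appear in the message."""
    lowered = message.lower()
    tags = []
    for keywords, tag in rules:
        if any(keyword in lowered for keyword in keywords):
            tags.append(tag)
    return tags


mail_tags = apply_tag_rules("postfix/smtpd: connection timed out")
other_tags = apply_tag_rules("nginx: upstream timed out")
```

The first message is tagged "email" and the second gets no tags.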

Second Step in Managing Alerts and Processing Events

After applying the enrichment rules, ECM executes the next wave of correlation rules against the alert:
Upon Event Arrival – Rules that trigger actions based upon the alert attributes, for example, if the entity has an owner, set the owner of the alert to be the same as the entity. These rules are evaluated upon arrival of the alert or modification of the alert’s event count.

Periodic Rules – Rules that execute actions on a periodic basis based upon the alert attributes, for example, a rule that sets the alert severity to Clear when the severity is Info.

Problem Resolution Rules – Rules created in the Alerts table by identifying one alert as a problem and another alert as the solution to that problem.

Timed Conditions (X in Y) – Rules that trigger actions if an alert occurs X number of times over a Y period. If you only want to act if an alert has occurred X times without restrictions over the period, you can use a correlation rule (upon creation) and use the eventcount field in the conditions section.
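
One common way to express an "X in Y" condition is a sliding time window. The sketch below illustrates that general technique, not ECM's internals. The XInYRule class is hypothetical and timestamps are plain seconds.

```python
from collections import deque


class XInYRule:
    """Fires when at least x occurrences fall within the last y seconds."""

    def __init__(self, x: int, y_seconds: float):
        self.x = x
        self.y_seconds = y_seconds
        self.times = deque()

    def observe(self, timestamp: float) -> bool:
        self.times.append(timestamp)
        # Drop occurrences that have slid out of the window.
        while timestamp - self.times[0] > self.y_seconds:
            self.times.popleft()
        return len(self.times) >= self.x


rule = XInYRule(x=3, y_seconds=60)
results = [rule.observe(t) for t in (0, 10, 100, 110, 120)]
```

Only the last occurrence triggers the rule, because it is the first point at which three occurrences (at 100, 110 and 120 seconds) fall inside one 60-second window.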

Managing Alerts and Processing Events: You Make the Rules!

Use RightITnow ECM to take complete control of and exploit all of that information flooding in from your entire universe of entities. ECM uses maintenance, close maintenance, tag, upon event arrival, periodic, problem resolution and timed conditions correlation rules that you create and customize to manage alerts and process events with as little intervention from you as possible, freeing you from tactical fire fighting to think strategically about the direction and evolution of your IT Ops.

Configurable IT Operations Alert Menu

Handling Alerts with the Configurable Alert Context Menu

Thanks to ECM’s categorization and correlation rules, many IT Operations events are resolved before they have a chance to become an alert on ECM’s single pane of glass Alert Console. Once an alert makes it to the Alert Console, you can configure and use the Alert Context menu to act on alerts in a multitude of ways.

Out of the Box Alert Context Menu Functionality

By default, the Alert Context menu allows you to right-click an alert and change severity, change priority, assign it, close it, unacknowledge it, and more, all without leaving ECM’s single pane of glass Alert Console.

Configuring the Alert Context Menu for Third-Party Integrations

In addition to the default Alert Context menu functionality, you can configure the Alert Context menu to interact with third-party systems such as Nagios, Zenoss, and SolarWinds. For example, you could configure the Alert Context menu to offer such items as “Open in Nagios,” “Annotate Zenoss Event,” and “Update Incident ID.” This way, you can achieve bi-directional integration with your third-party applications.

Configuring the Alert Context Menu for Grouping and Tagging Alerts, and More

Aside from the default and third-party Alert Context menu functionality, you can configure the Alert Context menu to help you organize alerts into groups, or roll up lots of alerts into a single alert. You can also add and remove tags to and from alerts and even reevaluate tag rules after performing other actions on an alert. If you would like to research an alert, you can add the Google Search command so that you can perform a Google search on any alert in the Alerts Console. Another powerful feature is that you can create a maintenance window based on an alert, directly from the alert in the single pane of glass ECM Alerts Console.

Filtering High Volume ITOps Alerts Streams

Filtering to Stem the Tide of High Volume ITOps Alerts Streams

It does not take long for your ITOps environment to amass enough entities to generate a huge high-volume alerts stream, especially if you are using our built-in connectors to such products as Zenoss, Nagios and SolarWinds. ECM not only helps you manage this stream by collecting all the alerts on a single pane of glass, but it also offers filtering features that make that single pane of glass more manageable and understandable.

Simple Grid-Based ITOps Alerts Filtering and Sorting

From ECM’s single pane of glass Alerts Console, you can reduce the number of alerts displayed by entering a value in any of the grid’s columns to filter on that value. For example, you could enter “syslog” in the Connector column to display only those alerts associated with syslog. If you are happy with the result, you can save this configuration as a repeatable, sharable filter. Additionally, you can click any column to sort the Alerts grid on that column.

Advanced Filtering of the ITOps Alerts Stream

When simple, grid-based filtering is not enough, you can use ECM’s Advanced Filtering pane features to quickly build complex queries intuitively, and then save them as named filters that you can load later and share with other users and user groups. For example, you can use the advanced filtering condition builder to search for all SolarWinds alerts that are of high priority and unassigned.
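
Conceptually, a saved filter is a named set of conditions that every displayed alert must satisfy. The sketch below is a rough illustration: the field names (connector, priority, owner) and the SAVED_FILTERS structure are assumptions, not ECM's data model.

```python
# A named filter maps field names to the values they must equal (hypothetical).
SAVED_FILTERS = {
    "Unassigned high-priority SolarWinds": {
        "connector": "SolarWinds",
        "priority": "high",
        "owner": None,  # None stands for "unassigned" in this sketch
    },
}


def matches(alert: dict, conditions: dict) -> bool:
    """True when the alert satisfies every condition (logical AND)."""
    return all(alert.get(name) == value for name, value in conditions.items())


def apply_filter(alerts: list, filter_name: str) -> list:
    conditions = SAVED_FILTERS[filter_name]
    return [alert for alert in alerts if matches(alert, conditions)]


alerts = [
    {"id": 1, "connector": "SolarWinds", "priority": "high", "owner": None},
    {"id": 2, "connector": "SolarWinds", "priority": "high", "owner": "jsmith"},
    {"id": 3, "connector": "Nagios", "priority": "high", "owner": None},
    {"id": 4, "connector": "SolarWinds", "priority": "low", "owner": None},
]
visible = apply_filter(alerts, "Unassigned high-priority SolarWinds")
```

Only alert 1 satisfies all three conditions, so it is the only one left in view.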

Saving and Sharing Powerful Filters for High-Volume ITOps Alerts Streams

Once you have created a really useful filter, you can name it and save it for reuse, and also share it across users and user groups. For example, if you have a user group called “Nagios Administrators,” you could create a filter that displays high-priority Nagios alerts and share it with that user group, so its members have it at the ready whenever they log in.

Configuring the Alerts Menu

The success of your IT operations depends on how quickly you can act on incoming events from your network infrastructure. For this it is imperative that you are able to configure your system to give you quick access to your most frequently used operations.

Configuring for immediate action

The RightITnow ECM enables you to create various custom actions to apply to incoming alerts. The alerts console menu can be customized to give you fast and immediate access to any of these actions.

Configuring for action efficiency

Creating a group of actions could enable you to set the severity of an alert, assign an owner, and email that user about the new alert with just two clicks. Any preferred group of actions is at your fingertips via the alerts console menu.
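
An action group can be thought of as an ordered list of small actions run against the same alert. The sketch below is hypothetical: the action names are invented for the example, and the email step only records a message instead of sending one.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    message: str
    severity: str = "info"
    owner: str = ""
    log: list = field(default_factory=list)


def set_severity(level: str):
    def action(alert: Alert) -> None:
        alert.severity = level
    return action


def assign_owner(user: str):
    def action(alert: Alert) -> None:
        alert.owner = user
    return action


def email_owner(alert: Alert) -> None:
    # Stand-in for a real mail action: record what would be sent.
    alert.log.append(f"email to {alert.owner}: {alert.message}")


def run_group(alert: Alert, actions: list) -> None:
    """Run each action in order, so later actions see earlier changes."""
    for action in actions:
        action(alert)


triage = [set_severity("major"), assign_owner("jsmith"), email_owner]
alert = Alert("disk full on mail01")
run_group(alert, triage)
```

Order matters here: the owner is assigned before the email action runs, so the notification goes to the new owner.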

Configuring for user efficiency

Administrators’ views need not be populated with actions used exclusively by the Operator staff. Focus on improving the efficiency of your individual users by creating context menus specific to the user roles or user groups in your system.

Managing Entities in ECM

Getting Your IT Ops Entities on a Single Pane of Glass

Attempting to deal with events from a large network infrastructure can be overwhelming, especially when the visualisation of your infrastructure is fragmented or incomplete. Efficient management of IT Operations requires you to focus your attention on the most important incoming events at any given time. For this, it is important to have a clear picture of the sources of these events.

Part of the role of RightITnow as a single pane of glass is to enable you to create a coherent and organised view of the entities you are dealing with, be they physical or virtual machines, applications or components.

Grouping and Mapping IT Ops Entities on a Single Pane of Glass

You can choose to group your entities geographically or by any logical grouping that suits your requirements. Among other advantages, this will enable you to focus exclusively on the events or alerts received from entities belonging under a particular group. You can also map your entity hierarchy relationships, for example, a physical server might host several virtual machines which are also event-generating entities in their own right, and this relationship appears in tree view within ECM.

Single Pane Enhancement of IT Operations Management of Entities via Entity Merging and Aliases

Events from the same entity may not necessarily be generated in a consistent manner and may appear to be coming from different sources. RightITnow enables you to correct such fragmentation by merging several apparent entities into a single actual entity with several aliases.
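
Merging can be pictured as an alias table that maps every name an entity is known by to one canonical name. The host names and the ALIASES structure below are made up for illustration and do not reflect how ECM stores aliases.

```python
from collections import Counter

# Hypothetical alias table: alias -> canonical entity name.
ALIASES = {
    "mail01": "mail01.example.com",
    "10.0.0.12": "mail01.example.com",
}


def canonical_entity(name: str) -> str:
    """Resolve an event source to its canonical entity name."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


event_sources = ["MAIL01", "10.0.0.12", "mail01.example.com", "web01"]
counts = Counter(canonical_entity(source) for source in event_sources)
```

Three apparently different sources resolve to the same mail server, so their events are counted against a single entity.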

Single Pane Graphical Visualization of Your IT Ops Entities

RightITnow is continually adding features that give you a transparent view of your infrastructure. The more recent entity graph feature provides a graphical visualization of your infrastructure, making it easier to locate problem entities that require attention.

Mobile IT Ops

Mobile IT Networking Tools from RightITnow

Mobile IT Ops – Secure, Comprehensive, Integrated

With users increasingly relying on their mobile phones for everything from news to work mail, it becomes imperative for IT Operations to offer a mobile solution so that IT Ops professionals can check status and address issues on the go. It is great to have a mobile IT Ops app, but it needs to perform all the required tasks and functions to justify the space it takes on your smartphone. The hallmarks of a good mobile solution are:

Mobile IT Ops – Secure Authentication & Authorization

The operations staff should be able to perform their designated tasks while staying within bounds of their permissible views and operations.

Mobile IT Ops – Comprehensive view of the system

The app should provide the user with a comprehensive view of the system with sufficient information to act upon the information received.

Mobile IT Ops – Integration with ServiceNow®, JIRA, Zenoss, Nagios, and More

The solution should be well integrated with existing systems so that the operator can at once act accordingly – be it raising a ticket in ServiceNow® or executing a script remotely.

Integration with ServiceNow®

Bridging the Gap Between IT Operations and ServiceNow® Helpdesk Support

Disconnected IT Ops and ServiceNow® Silos and the Frustrated End-User

Even though IT operations and Helpdesk support tend to report to the CIO and work in the same organization, it is fairly common for these two groups to work as independent silos, in total isolation and with totally different sets of tools and objectives. This greatly impacts the company’s ability to address actual customer issues. How many times have we heard of a service disruption directly from an irate end-user, despite the underlying causes having already been reported in IT Ops?

Reactive Mode ServiceNow® and IT Ops Issue Management

In recent years, the ITSM market, led by companies like ServiceNow®, has been deploying a new generation of helpdesk solutions in the Cloud. However, these solutions are still mainly focused on helping support personnel deal with customer issues in a reactive, conversational mode, rather than in an active, coordinated and planned mode. Moreover, IT operations remains in reactive mode, overwhelmed by various standalone monitoring systems, processing millions of application-related events and IT infrastructure alarms, with no ability to categorize, correlate, and automatically act on these events.

2-Way Active Mode ServiceNow® Helpdesk and IT Ops Issue Management

RightITnow ECM collects all the IT alerts into a single pane of glass and uses its categorization, correlation and integration features to provide the business context needed to resolve the issues, often automatically. This empowers your IT Operations and Service Desk staff to solve problems more quickly and efficiently. ECM’s native 2-way integration with ServiceNow® automatically creates tickets in ServiceNow® and populates these tickets with all the contextual information the support personnel would need from the application alarms. In return, IT operations personnel are kept up to date within ECM with all the related updates created in ServiceNow®.

Collaborative IT Ops and ServiceNow® Integration and the Happy End-User

RightITnow ECM’s contextual knowledge of underlying foundational events, correlation results and links back to performance data (say, from an original monitoring source) where applicable, all contribute to faster problem resolution. With RightITnow ECM and its native integration with ServiceNow®, your IT Operations and Service Desk staff reduce the time spent reactively fire fighting and instead focus on acting on a manageable number of service impacting alerts. As a result, happy customers and end users gain from increased availability and uptime, and assured performance against their service level agreements.
