The average cost of unplanned downtime is $5,600 per minute, or more than $300,000 per hour, according to a survey by Gartner. Those numbers should alarm every business, especially retailers that rely on their systems to complete hundreds or thousands of credit card transactions every day.
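
As a quick sanity check on those figures, the per-minute number can be scaled to any outage length. The sketch below is illustrative only: the function name is hypothetical, and it assumes cost scales linearly with duration, which a real outage may not do.

```python
COST_PER_MINUTE_USD = 5_600  # per-minute figure quoted above


def downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Estimate the cost of an outage, assuming cost grows linearly with time."""
    return minutes * cost_per_minute


print(downtime_cost(60))  # one full hour
print(downtime_cost(30))  # a 30-minute outage
```

At this rate a full hour comes to $336,000, the same order as the hourly figure quoted above.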

Even after the network is restored, an unplanned outage can have a lasting impact if frustrated customers decide to buy from a competitor whose credit card devices were working properly.

Network anomaly detection is one approach to incident management that can help your business minimize costly downtime and retain customer loyalty.

How to Choose the Right Monitoring Tool for Your Organization

Network anomaly detection is certainly not a new concept in IT incident management. Many solutions are available that offer businesses a tailored approach to catching an actionable event when it happens.

But how do you know which one is right for your business?

As you research different monitoring tools for incident management, you’ll want to know which pieces of your infrastructure provide the most overall value. Ask yourself the following questions:

  • What area of your business infrastructure, or technology components, provides the most value to your organization?
  • Would your business survive if customers could not use the proprietary interface you built to interact with them and complete a sale?
  • How long could you afford the downtime?
  • How can your business prevent your most valuable service from going dark?

If you could measure each specific action your customers take in receiving your product or service, you would have tunable metrics for the areas of your delivery that may need improvement.

By capturing your customers’ actions (system events), you can theoretically piece together each customer’s unique journey from start to finish: whether they made a purchase, where they stopped along the way, and why.
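
To make the idea concrete, here is a minimal sketch of rebuilding journeys from raw events. Everything in it is an assumption for illustration: the event names, the session identifiers, and the tuple layout are hypothetical and do not reflect any particular product's data model. It shows where each journey stopped; the "why" still needs an analyst.

```python
from collections import defaultdict

# Hypothetical event records: (session_id, timestamp_seconds, event_name).
# They arrive out of order on purpose to show the sorting step.
events = [
    ("s1", 40, "purchase"),
    ("s2", 12, "view_product"),
    ("s1", 10, "view_product"),
    ("s2", 30, "add_to_cart"),
    ("s1", 25, "add_to_cart"),
]


def build_journeys(events):
    """Group events by session and order each session's events by time."""
    by_session = defaultdict(list)
    for session_id, ts, name in events:
        by_session[session_id].append((ts, name))
    return {sid: [name for _, name in sorted(steps)]
            for sid, steps in by_session.items()}


journeys = build_journeys(events)
for sid, journey in sorted(journeys.items()):
    if "purchase" in journey:
        outcome = "purchased"
    else:
        outcome = "stopped at " + journey[-1]
    print(sid, journey, outcome)
```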

Why is this so important?

Understanding human behavior, and how it translates to system behavior, plays a big role in choosing an acceptable and efficient monitoring tool.

Odyssey Information Services’ EMS Viewer is an event management system monitoring tool developed to depict every aspect of the HPE NonStop Tandem computer system in granular detail. It gives businesses the flexibility to customize an analyst’s monitoring environment to their needs and values.

Understanding User Behaviors Can Lead to Quicker Detections in Real Time

EMS Viewer can not only detect real-time events and trigger on specific hardware and software thresholds and system-generated events; it also lets you customize each individual event further.

The image below shows a customized event graph that was set up in EMS Viewer to monitor a client’s fuel pump transactions. The event recorded the customer purchasing journey over a 30-minute period and aggregated three different events from more than 12,000 point-of-sale devices that are geographically separated.

Figure: Incident management graph showing a network anomaly

The blue line signifies a connection reset, the green line is a connection stopped, and the red line is a connection started.
If the network is functioning properly, then every transaction at the pump will record the following behavioral pattern:

  1. Red Line — The customer initiates a pre-authorization sale to start the connection by swiping a card or paying cash inside and then proceeds to pour gasoline into a tank.
  2. Green Line — When the customer is done pumping gas, he or she stops the connection by placing the pump back on the rack and grabbing the receipt.
  3. Blue Line — After the receipt is given to the customer, the connection is reset and the point-of-sale application remains idle until the next customer arrives.
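
The three-step pattern above can be expressed as a simple ordering check. This is a minimal sketch under stated assumptions: the event labels "started", "stopped", and "reset" are hypothetical stand-ins for the red, green, and blue lines, and the function is not part of EMS Viewer.

```python
# Hypothetical labels for the red, green, and blue lines, in expected order.
EXPECTED_CYCLE = ("started", "stopped", "reset")


def follows_normal_cycle(device_events):
    """Return True if a device's events repeat started -> stopped -> reset.

    The sequence must begin with "started". A transaction still in progress
    (a partial final cycle) counts as normal.
    """
    return all(
        event == EXPECTED_CYCLE[i % len(EXPECTED_CYCLE)]
        for i, event in enumerate(device_events)
    )


print(follows_normal_cycle(["started", "stopped", "reset", "started"]))
print(follows_normal_cycle(["stopped", "reset", "stopped", "reset"]))
```

The first device is in the middle of a second transaction and passes; the second never records a start, which is the kind of break in the pattern worth investigating.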

EMS Viewer provides a clear depiction of the normal flow of traffic compared to an outage event.
In this example, the POS devices report thousands of station resets and stops, but almost no station starts, all within the same couple of minutes.
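
A rule of this shape can be sketched in a few lines: bucket events by minute, then flag any minute with plenty of traffic but almost no starts. This is an illustration, not a description of how EMS Viewer works. The event labels, the function names, and both thresholds (`min_events`, `max_start_ratio`) are assumptions chosen for the example and would need tuning against real traffic.

```python
from collections import Counter, defaultdict


def counts_per_minute(events):
    """events: iterable of (timestamp_seconds, event_type) -> {minute: Counter}."""
    buckets = defaultdict(Counter)
    for ts, event_type in events:
        buckets[int(ts // 60)][event_type] += 1
    return dict(buckets)


def anomalous_minutes(buckets, min_events=100, max_start_ratio=0.05):
    """Flag minutes with many stops/resets but almost no starts.

    Both thresholds are arbitrary illustrative values.
    """
    flagged = []
    for minute in sorted(buckets):
        counts = buckets[minute]
        total = counts["started"] + counts["stopped"] + counts["reset"]
        if total >= min_events and counts["started"] / total <= max_start_ratio:
            flagged.append(minute)
    return flagged


# Synthetic data: minute 0 looks healthy, minute 1 looks like the outage.
normal = [(5, "started")] * 50 + [(20, "stopped")] * 50 + [(40, "reset")] * 50
outage = [(65, "started")] * 2 + [(70, "stopped")] * 150 + [(90, "reset")] * 150

buckets = counts_per_minute(normal + outage)
print(anomalous_minutes(buckets))  # prints [1]
```

Using a ratio rather than a raw count keeps the rule from firing during quiet hours, when few starts are expected anyway.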

And because the POS devices are geographically separated and there were no recent application changes, an analyst can confidently determine that an application failure is highly unlikely.

What would happen to your business if it suddenly lost thousands of transactions over an extended period of time?

EMS Viewer lets you create custom series of real-time data, configured around your critical system events, so you can visualize trends as they happen. This can be invaluable for understanding the overall behavior and delivery of the service to your customers and for determining whether the surrounding technologies are functioning correctly.

By following proper escalation protocols, your team can quickly discover the cause of the disconnect between your POS application and your customers’ bank accounts. In this scenario, the disconnect was caused by a cut fiber at a third-party upstream network delivery service whose ISP failover did not work.

Observe, React, Resolve

Suddenly losing thousands of POS systems for an extended period could be devastating to your business’s bottom line.

Real-time metric reporting in event monitoring tools such as EMS Viewer helps keep your analysts on the ball, escalating and remediating an incident as fast as possible to minimize the revenue lost while customers are unable to complete a transaction.

Your analysts will be able to observe, react, and even resolve the incident before customers have time to dial into the support center to report an outage.

About the Author

Steve Donaldson
VP - Professional Services