One of the biggest challenges for organizations is ensuring business keeps operating, even during adverse times. Disasters can strike at any moment, and service outages can cost your organization millions of dollars in revenue.
In today’s hyper-competitive market, it won’t take long for customers to switch to your competition if you are unable to deliver uninterrupted service.
Downtime is the killer of all things. In fact, 95 percent of companies that experience data loss or downtime for 10 days or more will file for bankruptcy within 12 months and 43 percent of businesses without a disaster recovery plan will go out of business in the wake of a major data loss.
This can usually only be countered with committed resources in the form of personnel, IT, and/or alternate facilities.
Barriers are usually measured in cost. The business must calculate the cost of putting mitigating factors in place to reduce the risk of downtime to a safe, comfortable level. In some cases, organizations will need to accept the risk due to budgetary reasons.
If you thought developing a business continuity plan (BCP) was hard, try explaining to your stakeholders why you didn’t.
Business continuity plans are necessary for several purposes:
- Determining the state of readiness of the organization to respond to and recover from a disruption to business, operations, and systems
- Determining whether the required resources are available at each recovery location
- Managing the expectations of the organization as to what it can expect in the event of an actual incident
- Instilling a sense of calm and confidence within the organization by showing there is a demonstrable state of readiness for a potential disruption of services
- Demonstrating compliance with applicable regulatory requirements and best practices
- Identifying, documenting and making improvements in such areas as education, training, or plan content
- Determining whether the BCP has been properly maintained to reflect changes in the business
Your business can create an effective business continuity plan by following five steps: Identify, Analyze, Design, Execute, and Measure.
Step 1: Identifying Assets
To begin, let’s assemble a BCP management team. Congratulations—you’re it! Welcome to, as of now, the one-man relay team.
While you don’t have to go at it alone, for now, you are the head boss in charge. Your first job as your company’s BCP coordinator is to convince your stakeholders of the value of having a plan in place to protect the company’s assets. This can be one of the more challenging steps to take, but it is the most important. Otherwise, your efforts and your plan will lose traction and die a miserable short-lived existence.
What are your company’s assets?
Here is a short list of what some executive-level managers may classify as their assets:
- Brand name
- Revenue stream
- Employees and their safety
- And the list goes on…
Every business is different so you want to begin to identify assets by having conversations with various executive-level members to find out what is important to them. When you give them what they need, you will find they are much less resistant to pledging resources to safeguard the things they hold dear.
Think of it as a sales process
First, they need to know what they have. Next, paint a picture of how they enjoy the assets and the value each delivers to the organization. Finally, ask them how they would feel if any of the assets were taken away and to what length would they be willing to go to safeguard them. Have them sell themselves to you.
Once you have determined what is important to your organization, it is time to get executive-level buy-in and support to make it stick. There is nothing worse than going through all the work necessary to complete this marathon only to have everyone feel as though they do not need to comply with the new processes.
Step 2: Assessing Risks and Impacts
There are two main parts in the analysis phase: Business Impact Analysis and Risk Assessment.
While it does not matter which process you conduct first, it’s very important to complete both because they depend on one another.
Let’s start with a couple definitions.
Business Impact Analysis (BIA)
Business impact analysis is the process of determining the criticality of business activities and associated resource requirements to ensure operational resilience and continuity of operations during and after a business disruption. The BIA quantifies the impacts of disruptions on service delivery, risks to service delivery, and recovery time objectives (RTOs) and recovery point objectives (RPOs). These recovery requirements are then used to develop strategies, solutions and plans.
Risk Assessment (RA)
Risk assessment is the identification of hazards that could negatively impact an organization’s ability to conduct business. It helps identify inherent business risks and provides measures, processes, and controls to reduce their impact on business operations.
In my experience, it works best to conduct the risk assessment (RA) first, then the BIA. I have seen pages of multi-level matrices listing a hundred-plus events that could impact a business-critical function.
Are you kidding me? Who has the time to read through all that garbage?
Here is a quick list of such events (in no particular order):
- Sudden death
- Severe weather events
- Terrorist threats
- Bomb threats
- Loss of power
- Gas leak
- Physical security breaches
- Cyber security breaches
- Server crashes
- Network outages
- Hostage situation
This is a “short” list, believe me. You are probably wishing that there was an easier way to cover almost every possible eventuality. There is. What you will notice about the above list is that every item on it is a cause. What we’re really interested in is the effect.
To simplify this process, I have developed a very short list of effects from the listed causes.
It looks something like this:
- Loss of Facilities
- Loss of IT
- Loss of Personnel
You may be asking yourself, “Where’s the rest?” That’s it! Simple, isn’t it? Now you can plan recovery techniques for any one or combination thereof.
Well, actually, it’s not quite that easy. Now you need to go to each department (this is where you recruit help) and have the business unit’s point of contact answer the question: “How would the above items affect your department?”
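The cause-to-effect reduction described above can be sketched as a small lookup table. This is only an illustration: the three effects come from the list above, but which cause maps to which effect is my own assumption, and the names `Effect`, `CAUSE_TO_EFFECTS`, `effects_to_plan_for`, and `causes_behind` are hypothetical. Your own risk assessment should decide the real assignments.

```python
from enum import Enum


class Effect(Enum):
    FACILITIES = "Loss of Facilities"
    IT = "Loss of IT"
    PERSONNEL = "Loss of Personnel"


# Hypothetical assignments for illustration only; adjust to your organization.
CAUSE_TO_EFFECTS = {
    "Severe weather events": {Effect.FACILITIES, Effect.PERSONNEL},
    "Loss of power": {Effect.FACILITIES, Effect.IT},
    "Gas leak": {Effect.FACILITIES},
    "Server crashes": {Effect.IT},
    "Network outages": {Effect.IT},
    "Sudden death": {Effect.PERSONNEL},
}


def effects_to_plan_for(causes):
    """Collapse a list of causes into the set of effects they produce."""
    effects = set()
    for cause in causes:
        # Unknown causes contribute nothing until someone classifies them.
        effects |= CAUSE_TO_EFFECTS.get(cause, set())
    return effects


def causes_behind(effect):
    """List the known causes that can produce a given effect, sorted by name."""
    return sorted(c for c, effs in CAUSE_TO_EFFECTS.items() if effect in effs)
```

However long the list of causes grows, the recovery planning still only has to cover three effects and their combinations. That is the question you are putting to each department.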
Armed with this information, you can move on to the next step: designing your BCP.
Step 3: Creating Actionable Items and Strategies
This is the step in the BCP process where the rubber meets the road by translating all that raw data into actionable items and developing recovery strategies.
Communication is key during and after a disaster. In an effective BCP, people come first, so it’s essential to have a functional means of communication with employees, vendors, and customers. Template this communication for the three scenarios identified in Step 2 (Loss of Facilities, IT, or Personnel), and leave room for the specific details of the incident that caused the activation of your organization’s response plan.
Establish a delegation of authority for each department and for the executive branch of your organization. In addition to mapping out vendor escalation paths, identify alternative vendors. This is especially important should your vendor be the reason for the incident.
Determine the business impact of a function or process first, and then develop recovery capability for it. Your objective in this phase is to identify the people, facilities, and assets that are required to achieve the four “R’s” which are: Response, Resumption, Recovery, and Restoration.
Next, let’s consider the following chart:
The timing may vary by department and may have dependencies on yet other departments. For example, while the payroll department may only have one person on staff and even though that person may be ready, remotely, to carry out their duties, the IT department may not have the payroll server back up and running or enough VPN licenses to satisfy the requirement for employees to work remotely.
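To make the payroll example concrete, here is a minimal sketch of how a dependency can stretch a department’s recovery time. The hour figures, the unit names, and the function `effective_recovery_hours` are all hypothetical, and the sketch assumes that dependencies recover in parallel and that a unit cannot resume until everything it relies on is back.

```python
# Hypothetical figures: hours until each unit could resume on its own.
OWN_RECOVERY_HOURS = {"payroll": 1, "payroll_server": 8, "vpn": 4}

# Hypothetical dependencies: what each unit needs before it can resume.
DEPENDS_ON = {
    "payroll": ["payroll_server", "vpn"],
    "payroll_server": [],
    "vpn": [],
}


def effective_recovery_hours(unit, seen=frozenset()):
    """Hours until `unit` can actually resume, including its dependencies."""
    if unit in seen:
        raise ValueError(f"circular dependency involving {unit!r}")
    upstream = [
        effective_recovery_hours(dep, seen | {unit})
        for dep in DEPENDS_ON.get(unit, [])
    ]
    # The slowest link decides: the unit's own readiness or its worst dependency.
    return max([OWN_RECOVERY_HOURS[unit], *upstream])
```

With these made-up numbers, the payroll person is ready in one hour but cannot work for eight, because the payroll server is the slowest dependency.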
What does recovery look like? How will it be implemented? Can it be implemented without further impacting the RTO or the RPO?
Please do not expect to get all these kinks out on your first iteration. Be prepared to be blindsided by unknowns; they are always there waiting to ruin your perfectly laid plans.
Step 4: Test, Test, Test!
This is the step that I like to refer to as the testing phase because this is where all your hard work is put to the “test.”
Risk is not static. Personnel, potential threats, and critical business functions will change over time. A BCP must be validated through testing or practical application and must be kept up to date.
All those long hours collecting data, developing matrices, and processing documentation are about to pay off through validation. There are several methods at your disposal. My three favorite methods are tabletop discussion, departmental testing, and full-blown BCP activation.
Word of Advice
The purpose of every method is to identify gaps in your plan. If you didn’t find anything wrong, you are probably not measuring the right things.
No test should be used as a finger-pointing session. Treat each one as a chance to learn how and why the process went wrong, then remediate to eliminate the risk of a negative outcome.
TOP 3 TESTING METHODS
Method #1: Tabletop Discussion
This method is just like it sounds. Department heads and managers gather around a table, or in online meetings, and go through your BCP line by line. If this is the first time, it will help to identify inter-departmental dependencies that had not been initially identified and adjustments to staffing requirements that will be necessary to satisfy the minimum business requirements while still delivering on any agreed upon SLAs.
Make sure you take really detailed notes about anyone’s concern(s). You will now proceed to the measure phase where you measure the efficacy of everyone’s hard work. Once you have weeded out most, if not all, of the bugs then you may proceed to a departmental BCP test.
Method #2: Departmental Testing
No more discussing the idea of running a scenario against the department. It should be communicated to the department that a BCP test will be taking place, but do not discuss what the scenario will be about. Instead, have them review the BCP document in advance so the staff members involved will be familiar with what may be asked of them. This test is your staff’s chance to mess things up! Why wouldn’t they perform as expected, you may ask? I once read in a medical report, or heard in a webinar I attended (I can’t remember which), that a stressful situation can reduce a person’s cognitive function by two to five grade levels. That is definitely something to consider.
One of the main purposes of these exercises is for the staff to become comfortable with “BCP’ing.” Should your company’s BCP be activated, it should almost feel like their daily routine. Repeated testing is the only way to combat the negative effects of stress. Again, record detailed notes of process gaps and staffing concerns. Continue to the final measuring phase.
Method #3: Full-Blown BCP Activation
Remember the three business interruptions (facility, IT or personnel)? Now you must come up with a scenario that affects the business in one or more of those categories (please don’t blow up the office and kill all your staff on the first try).
Send out a communication to your organization stating that a BCP test will be conducted. You may also include suppliers. Do not let them know what will be tested. Rather, request that they review their BCP (now a handbook) to refresh their memory of what their duties may entail. You can anticipate that you and your staff may feel anxious on the day of the test. Stick to the plan, see what breaks, and document gaps and staffing concerns. Then advance to the measure phase.
How often should you test?
I advise that you conduct each test type until the staff seems comfortable before progressing to the next. Remember that after each test type, you must measure your efforts and make revisions. Full-blown BCP testing should be conducted at least once a year, but I always recommend testing as often as the business allows.
Step 5: Continuous Improvement
This final phase of your business continuity plan will focus on continuous improvement. Think about it as a never-ending phase that continuously answers the following question: How does your BCP mature with your organization?
A BCP is a living document. You can’t just “set it and forget it.” In other words, it changes or morphs alongside the organization as it matures. There are likely elements in a BCP that look good on paper, but they’re difficult to implement in the real world.
This is where measuring and testing come into play.
A unique, robust set of metrics must be developed for each organization to measure benchmarks that matter. If it is easy to measure, you are usually measuring the wrong things.
Word of Advice
Your dashboards should not be used to identify faults in how the individual variable performed, but rather how or why the solution was ineffective.
Ask yourself the right questions
Here is a list of questions that I often ask when I’m helping clients measure the effectiveness of their business continuity plans:
- Were the instructions too complicated to complete the task?
- Would additional resources reduce the risk to an acceptable level or will it be impossible due to budgetary constraints?
- Is there a single point of failure?
- Would additional practice better prepare staff?
- Were the mitigating factors effective in reducing the risk?
Of course, you want all checkmarks in front of the easily measured items. These are the items CxOs enjoy seeing in a report. Pie and bar charts are so pretty!
- Does the measured item satisfy RTO? (Measures downtime)
- Does the measured item satisfy RPO? (Measures loss of data)
- Did the transition from DR back to production cause any additional downtime? If so, how much?
- Is it acceptable?
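The RTO and RPO questions above boil down to simple date arithmetic. The sketch below is one way to record the answers after a test, assuming you log three timestamps per exercise. The function `check_objectives`, its parameter names, and the sample times are hypothetical, and it treats data loss as the gap between the last good backup and the start of the outage.

```python
from datetime import datetime, timedelta


def check_objectives(outage_start, service_restored, last_good_backup, rto, rpo):
    """Compare measured downtime and data loss against the agreed RTO and RPO."""
    downtime = service_restored - outage_start          # what RTO limits
    data_loss_window = outage_start - last_good_backup  # what RPO limits
    return {
        "downtime": downtime,
        "rto_met": downtime <= rto,
        "data_loss_window": data_loss_window,
        "rpo_met": data_loss_window <= rpo,
    }


# Made-up timestamps from an imaginary test run.
result = check_objectives(
    outage_start=datetime(2024, 3, 1, 9, 0),
    service_restored=datetime(2024, 3, 1, 12, 30),
    last_good_backup=datetime(2024, 3, 1, 8, 0),
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=30),
)
```

In this imaginary run the service came back within the four-hour RTO, but an hour of data was lost against a thirty-minute RPO, so the backup schedule is the gap to look at. Any extra downtime from the move back to production can be measured the same way with a second pair of timestamps.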
Rinse and Repeat
How often should you repeat this process? If this is your first attempt, then I would suggest scheduling an after-hours tabletop discussion where each of the affected business units talks its way through the process. Try to use this session to identify deficiencies or gaps in the process and the minimum staffing requirements needed to satisfy your business goals.
The next step would be to put the rubber to the road and individually test each business unit with a unique scenario.
Don’t be afraid to fail
Failing now is a good thing! We learn from our mistakes. For this reason, frequent practice in the beginning stages of testing your plan is a must.
If teams are frequently satisfying business requirements by year one, then testing once a quarter might be suggested. After the second year, as long as there are no significant changes to the business or technology involved, twice a year would be acceptable.
Even with a mature plan in place, testing less often than twice a year could have negative effects, as individuals will start feeling higher stress levels due to unfamiliarity.
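The cadence guidance above can be written down as a simple rule of thumb. This is a sketch of my reading of it, not a formula: the function name `suggested_cadence` and its parameters are hypothetical, and "by year one" and "after the second year" are interpreted here as one and two completed years of testing.

```python
def suggested_cadence(years_completed, meeting_requirements, significant_changes):
    """Return a suggested testing cadence as a plain-language label."""
    if years_completed >= 2 and meeting_requirements and not significant_changes:
        # Mature plan, stable business and technology: the floor is twice a year.
        return "twice a year"
    if years_completed >= 1 and meeting_requirements:
        return "once a quarter"
    # Early days, or teams still missing requirements: keep practicing often.
    return "frequent practice"
```

Note that a significant change to the business or technology drops a mature plan back to quarterly testing, and a team that stops meeting requirements goes back to frequent practice.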