Being from Florida, we see the threat of hurricanes prompt an annual disaster recovery drive: businesses envision all sorts of doomsday scenarios and start telling our partners that they can never, ever go down. These companies usually estimate the cost of downtime at some insane dollar-per-hour value. The problem is that when you ask exactly which risks they need to mitigate, few can answer. Nevertheless, they go on about their need for redundant circuits and hosting facilities in the middle of nowhere.
Decision makers all too often think of DR as being just redundancy, but we encourage our partners' clients to go through a more involved process of identifying their risks before they decide how or even whether to mitigate them.
While this is by no means a comprehensive list (nor is it in any particular order), it's a starting point to get you and your clients thinking about what they should be scared of and just how far they should go (i.e., how much they should spend) to avoid it.
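One rough way to frame the "how much should they spend" question is to compare the expected yearly loss from a risk against the yearly cost of mitigating it. The sketch below is only an illustration: the `Risk` class, the `worth_mitigating` helper, and every number in it are assumptions made up for the example, not figures from any real client.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    events_per_year: float  # assumed frequency of the event
    outage_hours: float     # assumed downtime per event
    cost_per_hour: float    # client's own estimate of downtime cost, in dollars

    def expected_annual_loss(self) -> float:
        """Frequency x duration x hourly cost."""
        return self.events_per_year * self.outage_hours * self.cost_per_hour


def worth_mitigating(risk: Risk, annual_mitigation_cost: float,
                     risk_reduction: float) -> bool:
    """True if the loss avoided per year exceeds the yearly mitigation cost.

    risk_reduction is the fraction of the expected loss the mitigation
    removes, from 0.0 (none) to 1.0 (all of it).
    """
    avoided = risk.expected_annual_loss() * risk_reduction
    return avoided > annual_mitigation_cost


# Hypothetical example: two 4-hour power outages a year at $5,000 per hour.
power = Risk("lightning-related power outage",
             events_per_year=2, outage_hours=4, cost_per_hour=5000)
print(power.name, power.expected_annual_loss())
print(worth_mitigating(power, annual_mitigation_cost=30000, risk_reduction=0.9))
```

The point of the exercise is less the exact number than forcing the client to name the event, its likely frequency, and its likely duration before talking about solutions.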
- Power availability
In any lightning-friendly area like Florida, power availability is one of the biggest issues facing businesses. Even in new buildings in downtown areas, it's not uncommon for lightning to affect power.
Risks: Lightning-related power issues can bring equipment down or, in extreme events, cause permanent damage.
Mitigation: While it's possible to build your own backup power infrastructure, in most cases this just isn't economically feasible. A hosting facility with suitable UPS, backup generators, and fuel supply is usually a better option.
- Network connectivity
Physical connectivity to the network is another issue. This bogeyman is insidious. Backhoes, edge router failures, and ill-conceived grooming operations all happen with regularity.
Risks: Connectivity issues can slow network traffic to a crawl or interrupt service completely.
Mitigation: There are so many points of failure for a network circuit that it takes careful thought to completely mitigate the associated risks. Secondary circuits only really offer protection if they are 100% diverse: different conduit into the building, different last-mile route, different CO, etc. Other alternatives include different transport media altogether (cable, fixed wireless, or 3G/4G). Placing enterprise-accessible applications and data in a well-connected hosting facility can further mitigate these risks by making secondary connections easily accessible.
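The diversity test described above can be sketched as a simple comparison of two circuit records. Everything here is hypothetical: the `Circuit` fields, the `shared_failure_points` helper, and the carrier and route names are invented for illustration, and a real audit would rely on route documentation from the carriers.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Circuit:
    carrier: str
    building_entrance: str  # conduit or entry point into the building
    last_mile_route: str
    central_office: str


def shared_failure_points(a: Circuit, b: Circuit) -> List[str]:
    """Return the physical elements two circuits have in common.

    An empty list means the pair is diverse on every element checked here.
    Different carriers alone prove nothing, so carrier is not compared.
    """
    shared = []
    for element in ("building_entrance", "last_mile_route", "central_office"):
        if getattr(a, element) == getattr(b, element):
            shared.append(element)
    return shared


# Hypothetical circuits: two carriers that still share one conduit.
primary = Circuit("Carrier A", "north conduit", "Main St aerial", "CO-1")
secondary = Circuit("Carrier B", "north conduit", "Oak Ave buried", "CO-2")
wireless = Circuit("Carrier C", "rooftop antenna", "fixed wireless", "tower-7")

print(shared_failure_points(primary, secondary))
print(shared_failure_points(primary, wireless))
```

In this made-up case the second wireline circuit still enters through the same conduit, so one backhoe could take out both, while the fixed wireless link shares none of the checked elements.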
- Forces of nature
In any area with adverse weather, the possibility of fire, flood, and wind should be considered. Fire in particular is often not given enough thought.
Risks: Environmental events can destroy buildings and equipment in a way that makes recovery a long and drawn-out endeavor. Because these events are so destructive, they also frequently cause widespread data loss.
Mitigation: The best way to mitigate environmental risks is by placing mission-critical equipment and applications in a hardened hosting facility. Data loss risk can be mitigated with a disciplined off-site backup program that is regularly tested. Moving applications and data to cloud infrastructure can also reduce exposure to local environmental events.
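"Regularly tested" means actually restoring data and confirming it matches the source. A minimal sketch of one such check, comparing SHA-256 checksums of an original file and its restored copy, is shown here. The helper names and the throwaway demo files are assumptions for illustration; a real program would also test restore times and application-level integrity.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backups need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)


# Demo with temporary files standing in for a source file and two restores.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "ledger.db"
    good_restore = Path(tmp) / "ledger.restored.db"
    bad_restore = Path(tmp) / "ledger.truncated.db"
    original.write_bytes(b"client records")
    good_restore.write_bytes(b"client records")
    bad_restore.write_bytes(b"client recor")  # simulates a truncated restore
    good_ok = restore_matches(original, good_restore)
    bad_ok = restore_matches(original, bad_restore)

print(good_ok, bad_ok)
```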
- Equipment failures
Technology fails. Period. Hard drives in particular have a tendency to fail at the least opportune time. These issues are surprisingly under-emphasized when DR is discussed.
Risks: Equipment failures can slow access to applications and data, cause extended outages, or destroy data.
Mitigation: Implementing hardware redundancy is fairly easy, and most IT managers make sure their servers all have RAID arrays. Application redundancy is a little harder and requires planning for data replication or clustering. Leveraging cloud infrastructure or virtual machine technology can mitigate the risk of equipment failure by decoupling applications and equipment, but it comes with its own set of risks (see below).
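To put a rough number on what redundancy buys, the standard parallel-availability formula can help: if one component is available a fraction `a` of the time, `n` independent copies are all down only `(1 - a)^n` of the time. The sketch below assumes failures are independent, which is optimistic when the copies share power, a building, or a network path, and the 99% figure is purely illustrative.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1 - (1 - component_availability) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


# Illustrative only: a single server at 99% versus a redundant pair.
single = parallel_availability(0.99, 1)
pair = parallel_availability(0.99, 2)
print(round(expected_downtime_hours(single), 2))
print(round(expected_downtime_hours(pair), 2))
```

Under these assumptions the second copy cuts expected downtime by roughly a factor of one hundred, which is why shared failure points matter so much: they are exactly what breaks the independence assumption.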
- Security
Security touches everything in IT, from data protection to application availability. What's worse, security can often have implications beyond day-to-day functioning. If your lax security exposes your clients' personal information, you could have immense liability. Bear in mind that security is as much infrastructure as process.
Risks: DDoS attacks can slow or cut off access to applications and data, and breaches can publicly expose proprietary data, including both company intellectual property and client information.
Mitigation: There are a number of ways to address security, but almost none are a set-and-forget solution. There are managed security services, outsourced security specialists, and cloud-based network security products, but all require regular monitoring and testing. Moving applications to the cloud also merits a security discussion.
While not all IT risks are in our domain as communications infrastructure consultants, it's important to be able to talk holistically with your clients about risk. You may find that there are dangers your client wasn't aware of, or fears you weren't aware of. In any case, if you have a complete understanding of their IT objectives and fears, you will be in a much better position to offer a relevant solution.