Investing upfront in the mitigation of potential disasters will save your company and network in the long run. In the world of reliable hosting, for example, each infrastructure deployment includes all kinds of high availability (HA) and disaster recovery (DR) solutions. Investing in HA and DR solutions upfront will enable business continuity, avoid a lot of stress, and save you from the potentially devastating recovery costs.
What is disaster recovery?
According to TechTarget, “disaster recovery is an area of security planning that aims to protect an organization from the effects of significant negative events. DR allows an organization to maintain or quickly resume mission-critical functions following a disaster.”
This means that implementing DR requires a different approach for every organization, as each organization has its own mission-critical functions. Typically, some mission-critical functions run on or rely on IT infrastructure. Therefore, it is good to look at DR within the context of this (hosted) infrastructure; however, it should be part of business continuity planning as a whole.
Important questions to ask when you plan and design your mission-critical hosting infrastructure include:
- How much time am I prepared to have my mission-critical functions unavailable (RTO)?
- How much data am I prepared to lose, i.e. what is the longest period of data that I may not be able to recover (RPO)? For example, if you back up your data once a day, you can lose up to one day of data when a disaster happens.
- How much money will it cost the organization (per hour) when the mission-critical services are not available?

DR measures include prevention, detection and correction.
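The planning questions above can be turned into a rough calculation. The following is a minimal sketch, not a prescribed method: the `RecoveryTargets` class, the function names, and all figures are hypothetical and only illustrate how RTO, RPO and hourly downtime cost relate to a backup schedule.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTargets:
    rto_hours: float               # longest tolerated downtime (RTO)
    rpo_hours: float               # longest tolerated window of lost data (RPO)
    downtime_cost_per_hour: float  # estimated cost while services are unavailable


def worst_case_data_loss_hours(backup_interval_hours: float) -> float:
    """With periodic backups, everything since the last backup can be lost."""
    return backup_interval_hours


def meets_rpo(targets: RecoveryTargets, backup_interval_hours: float) -> bool:
    """Check whether a backup interval keeps worst-case data loss within the RPO."""
    return worst_case_data_loss_hours(backup_interval_hours) <= targets.rpo_hours


def outage_cost(targets: RecoveryTargets, outage_hours: float) -> float:
    """Estimated cost of an outage of the given length."""
    return outage_hours * targets.downtime_cost_per_hour


# Hypothetical figures, for illustration only.
targets = RecoveryTargets(rto_hours=4, rpo_hours=1, downtime_cost_per_hour=5000)
print(meets_rpo(targets, backup_interval_hours=24))  # False: daily backups exceed a 1-hour RPO
print(outage_cost(targets, outage_hours=4))          # 20000
```

Comparing the outage cost at the RTO with the price of a DR measure gives a first indication of whether that measure is worth the investment.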
Disaster recovery for common failures
Most hosting services include disaster recovery for the most common failures, such as failure of a physical disk, server, network switch, network uplink connection, or power feed. This is referred to as High Availability (HA).
In a redundant setup, when one element fails another piece of infrastructure takes over. Redundant networking devices and cabling, multiple power feeds, seamless failover to battery power, and separate power generators that can keep running for as long as they are refueled all play an important role in keeping IT infrastructure, and thus your software services, up and running. Even in case of a fire in a data center, the fire is typically detected early and extinguished with gas (which reduces the oxygen level), often without affecting most of the other equipment in the same data center hall. This means that most ‘disasters’ are recovered from without impacting the availability of the infrastructure services.
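The effect of redundancy can be illustrated with a simple availability estimate. This sketch assumes that redundant components fail independently of each other, which is a simplification (a shared power feed or a fire can take out several components at once); the function name and the 99% figure are hypothetical.

```python
def combined_availability(component_availability: float, redundant_copies: int) -> float:
    """Availability of a group of redundant components.

    Assumes independent failures: the group is unavailable only when
    every copy is down at the same time.
    """
    if redundant_copies < 1:
        raise ValueError("at least one component is required")
    return 1 - (1 - component_availability) ** redundant_copies


# Hypothetical component that is available 99% of the time.
single = combined_availability(0.99, 1)
pair = combined_availability(0.99, 2)
print(f"{single:.4f}")  # 0.9900
print(f"{pair:.4f}")    # 0.9999
```

Under this assumption, adding a second copy of a 99% available component reduces expected unavailability from 1% to 0.01%.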
One of the most commonly used tools in DR is making frequent backups of your data. If a disaster occurs, you can then restore your backup and relaunch your mission-critical functions and other services.
For a faster relaunch of your services after a disaster, replication of your application servers and data can come in handy: replicas are readily available to relaunch, whereas backups first need to be restored, which takes more time.
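The difference between restoring a backup and relaunching from a replica can be estimated roughly. The sketch below assumes that restore time is dominated by data volume divided by restore throughput; the function names and all numbers are hypothetical, and real restores also involve steps such as verification and reconfiguration.

```python
def recovery_hours_from_backup(data_gb: float,
                               restore_throughput_gb_per_hour: float,
                               relaunch_hours: float) -> float:
    """Restore the data first, then relaunch the services."""
    return data_gb / restore_throughput_gb_per_hour + relaunch_hours


def recovery_hours_from_replica(relaunch_hours: float) -> float:
    """The data is already in place; only the relaunch takes time."""
    return relaunch_hours


# Hypothetical figures: 2000 GB of data, 500 GB/hour restore speed, 30 minutes to relaunch.
print(recovery_hours_from_backup(2000, 500, 0.5))  # 4.5
print(recovery_hours_from_replica(0.5))            # 0.5
```

Comparing both estimates with your RTO shows whether backups alone are sufficient or whether replication is needed.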
Preparing for critical disasters
To mitigate the risk of larger disasters, which are much less likely to happen, an alternative IT infrastructure environment that can run your mission-critical functions helps to ensure business continuity.
Some choose to back up critical data to another location. Others replicate application servers and data to another location, with available hosting infrastructure, to be able to relaunch application services quickly or to have a seamless failover without service interruption.
If you need to mitigate the risk of the entire environment failing, the common solution is to include a failover data center site in your IT infrastructure setup. Disaster recovery by means of adding an alternative data center (also called a Twin DC setup) also requires a tailored approach to identify the right setup for your applications and mission-critical functions.
Another important facet is to implement applications that can deal with infrastructure failures. Where in the past it was more common to rely on the underlying infrastructure for high availability, it has become more popular to implement applications in such a way that the underlying (cheaper) infrastructure may (and will) fail without impacting the availability of the mission-critical functions.
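One way an application can tolerate failing infrastructure is by trying an alternative replica when a call fails. The following is a minimal sketch of that idea, not a complete resilience strategy: `call_with_failover` and the two example replicas are hypothetical, and a production setup would typically add timeouts, health checks and backoff.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each replica in order and return the first successful result."""
    last_error: Optional[Exception] = None
    for replica in replicas:
        try:
            return replica()
        except ConnectionError as error:
            # This replica is unreachable; remember why and try the next one.
            last_error = error
    raise RuntimeError("all replicas failed") from last_error


def primary() -> str:
    # Simulates a failure of the primary infrastructure.
    raise ConnectionError("primary site unreachable")


def secondary() -> str:
    return "served from secondary"


print(call_with_failover([primary, secondary]))  # served from secondary
```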
This means finding a balance between investing in more reliable hosting infrastructure, applications that deal with failures in the underlying infrastructure, and planning and preparing failover to an alternative infrastructure environment.
Making optimal use of DR investments
To make optimal use of DR investments, you can choose to use the extra resources in a second data center even when there is no failover due to a large disaster in the primary data center location. You can spread workloads between both data centers, for example with half of the workloads running in each data center (active-active). During a disaster, non-mission-critical services can be stopped to make space for mission-critical services to fail over.
Another example is when all applications run in the primary data center, and only those applications and data related to the mission-critical functions are replicated and fail over to a second data center in case of disaster (active-passive).
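For the first arrangement, where workloads are spread over both data centers, it helps to check in advance whether the surviving site can absorb the mission-critical workloads of the failed site. The sketch below is a simplified illustration that uses CPU cores as the only capacity dimension; the `Workload` class, the `plan_failover` function and the example workloads are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Workload:
    name: str
    cpu_cores: int
    mission_critical: bool


def plan_failover(surviving_site: List[Workload],
                  failed_site: List[Workload],
                  capacity_cores: int) -> Tuple[List[str], bool]:
    """Return the workloads to stop on the surviving site and whether
    the mission-critical workloads of the failed site then fit."""
    needed = sum(w.cpu_cores for w in failed_site if w.mission_critical)
    used = sum(w.cpu_cores for w in surviving_site)
    to_stop: List[str] = []
    for workload in surviving_site:
        if used + needed <= capacity_cores:
            break  # enough room already; stop nothing further
        if not workload.mission_critical:
            to_stop.append(workload.name)
            used -= workload.cpu_cores
    return to_stop, used + needed <= capacity_cores


# Hypothetical workloads: site B survives and has 16 cores in total.
site_a = [Workload("orders", 6, True), Workload("batch", 4, False)]
site_b = [Workload("web", 8, True), Workload("reporting", 6, False)]
print(plan_failover(site_b, site_a, capacity_cores=16))  # (['reporting'], True)
```

If the second value is `False`, the surviving site cannot host all mission-critical workloads even after stopping everything else, which indicates that more capacity is needed.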
The main takeaways
As every business is different, every organization should have its own approach to disaster recovery when carrying out business continuity planning. The challenge for these organizations is going to be balancing the tools and methods available. The goal, however, should be clear for everyone: invest upfront to prevent higher recovery costs in case of a disaster.