Mastering Disaster Recovery - Part 3 : Business Continuity and Disaster Recovery Planning

In my quest to master disaster recovery, I discovered that backups are a crucial part of any disaster recovery plan. But what exactly is disaster recovery planning?

In this blog, let's explore what a disaster recovery plan is, why we need it, and how it contributes to an organisation's broader business continuity plan.

The objective is business continuity

Business continuity is an organisation's capability to keep delivering its products and services after a disruptive event. Such events include natural disasters like earthquakes, floods and solar storms, and man-made incidents like terror attacks, arson, security breaches and data breaches. Disruptive events can also be socio-economic, such as a global recession, trade deficits, or loss of customers.

Organisations plan and create systems to prevent, prepare for, and recover from any of the threats above. The plan enumerates a range of disaster scenarios and lists the roles, responsibilities and steps needed to resume normal operations. The key focus is on preparedness, protection, response and recovery strategies.

While business continuity focuses on the overall resilience of an organisation, disaster recovery planning is a crucial subset that specifically addresses the restoration of IT systems and technology operations.

Disaster recovery planning is a subset of Business Continuity

Business continuity (BC) and disaster recovery (DR), often collectively referred to as BCDR, are two related but distinct approaches to ensuring an organisation's resilience. BC is a proactive strategy that focuses on minimising risks and ensuring the organisation can continue to operate and deliver products and services during a disaster event by defining ways for employees to continue their work. DR, on the other hand, is a reactive subset of BC that concentrates on the specific steps needed to resume IT systems and technology operations, and it is put into action only when a disaster actually strikes.

To develop an effective disaster recovery plan, it is essential to first understand what the business needs to protect. This involves creating an inventory of critical assets, identifying stakeholders, and cataloging procedure and business process documents.

Understand what the business needs to protect

Create an Inventory

What do we need to protect?

  • IT Equipment: Catalog all hardware, software, and network infrastructure, including servers, workstations, routers, and storage devices.

Who are we dependent on?

  • Contractors: List all external contractors and their roles in maintaining or supporting the organisation's IT systems.

  • 3rd Party Vendors: Identify all vendors providing essential services, products, or support to the organisation's IT operations.

Where do we keep data and where do we backup to?

  • Primary Sites: Document the main locations where business operations and IT systems are housed.

  • Recovery Sites: Identify alternative locations that can be used to resume operations in the event of a disaster.

Identify Stakeholders and key responsibilities

Who are the decision makers?

  • Executive Management: Support, approve, and communicate the disaster recovery plan; make critical decisions during disasters.

  • Business Unit Leaders: Collaborate with IT to identify critical processes; provide input for BIA; develop business continuity plans; ensure team awareness; participate in testing and training; coordinate during disasters.

Who will implement and maintain it?

  • IT Department: Develop, implement, and maintain technical aspects of the plan; prioritise IT system recovery; ensure data integrity and availability; conduct testing and updates.

Catalog Procedure Documents and Business Process Documents:

  • Procedure Documents: Create and maintain a comprehensive set of documents detailing the step-by-step procedures for critical IT operations, such as system backups, data restoration, and emergency response protocols.

  • Business Process Documents: Document all essential business processes, including their dependencies on IT systems, to ensure a clear understanding of how technology supports the organisation's operations.

Once the critical assets and stakeholders have been identified, it is crucial to collaborate with the business to determine the potential impact of a disaster. This involves conducting a Business Impact Analysis (BIA), Threat and Risk Analysis (RA), and Impact Analysis.

Collaborate with the business to determine the impact of a disaster

Business Impact Analysis (BIA):

  • Identify and prioritise critical business functions and processes.

  • Determine the potential impact of disruptions on each function or process, including financial losses, reputation damage, and regulatory consequences.

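  • Establish recovery time objectives (RTOs), the maximum acceptable time to restore a function, and recovery point objectives (RPOs), the maximum acceptable window of data loss measured in time, for each critical function or process.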
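To make the RTO and RPO targets concrete, here is a minimal Python sketch that checks a backup schedule and a measured restore time against them. The `RecoveryObjective` class, the `billing` function and all the durations are hypothetical values chosen for illustration, not figures from any real plan.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjective:
    """Targets agreed with the business for one critical function (hypothetical model)."""
    function: str
    rto: timedelta  # maximum tolerable time to restore the function
    rpo: timedelta  # maximum tolerable window of lost data


def meets_rpo(objective: RecoveryObjective, backup_interval: timedelta) -> bool:
    # Worst case, a failure happens just before the next backup runs,
    # so up to one full backup interval of data is lost.
    return backup_interval <= objective.rpo


def meets_rto(objective: RecoveryObjective, measured_restore_time: timedelta) -> bool:
    # The restore time should come from an actual drill, not an estimate.
    return measured_restore_time <= objective.rto


# Hypothetical example: billing tolerates 4 hours of downtime and 1 hour of data loss.
billing = RecoveryObjective("billing", rto=timedelta(hours=4), rpo=timedelta(hours=1))
print(meets_rpo(billing, backup_interval=timedelta(hours=24)))      # False
print(meets_rto(billing, measured_restore_time=timedelta(hours=3)))  # True
```

In this example a nightly backup cannot satisfy a one-hour RPO, so the function would need more frequent backups or replication, while a three-hour restore fits within the four-hour RTO.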

Threat and Risk Analysis (RA):

  • Identify potential threats to the organisation's IT systems and operations, such as natural disasters, cyber-attacks, and equipment failures.

  • Assess the likelihood and potential impact of each threat.

  • Develop strategies to mitigate or eliminate identified risks.
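One common way to combine likelihood and impact is a simple qualitative score, likelihood multiplied by impact. The sketch below assumes a 1 to 5 scale for both; the scale, the `Threat` class and the sample threats are illustrative assumptions, and an organisation may well prefer a different scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain); the scale is an assumption
    impact: int      # 1 (negligible) to 5 (severe); the scale is an assumption

    @property
    def score(self) -> int:
        # Simple qualitative risk score: likelihood multiplied by impact.
        return self.likelihood * self.impact


def rank_threats(threats: list[Threat]) -> list[Threat]:
    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(threats, key=lambda t: t.score, reverse=True)


# Hypothetical ratings, for illustration only.
threats = [
    Threat("earthquake", likelihood=1, impact=5),
    Threat("ransomware", likelihood=3, impact=5),
    Threat("disk failure", likelihood=4, impact=2),
]
for t in rank_threats(threats):
    print(f"{t.name}: {t.score}")
```

With these sample ratings, ransomware (15) ranks above disk failure (8) and earthquake (5), which suggests where mitigation effort might go first.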

Impact Analysis:

  • Evaluate the consequences of disruptions on the organisation's overall operations, including the impact on employees, customers, and stakeholders.

  • Identify the interdependencies between various business functions and IT systems.

  • Determine the resources required to maintain critical operations during a disaster event and to recover from disruptions.
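The interdependencies between systems also suggest a recovery order: a system can only be brought back once everything it depends on is running. A rough sketch of this idea uses a topological sort from Python's standard `graphlib` module (Python 3.9+). The system names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter

# Each key depends on the systems in its set (hypothetical names).
depends_on = {
    "web-frontend": {"order-api"},
    "order-api": {"database", "auth-service"},
    "auth-service": {"database"},
    "database": {"storage"},
    "storage": set(),
}

# static_order() yields every system after all of its dependencies.
recovery_order = list(TopologicalSorter(depends_on).static_order())
print(recovery_order)
# ['storage', 'database', 'auth-service', 'order-api', 'web-frontend']
```

If the map contains a circular dependency, `static_order()` raises `graphlib.CycleError`, which is itself a useful finding for the impact analysis.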

With a clear understanding of the potential impact of a disaster and the critical assets that need protection, the organisation can finally build a comprehensive Disaster Recovery Plan (DRP). The DRP should outline the scope, objectives, and general contents necessary for effective disaster recovery.

Finally build the Disaster Recovery Plan

A disaster recovery plan is a document that describes how to resume operations after a disruption. It mainly covers an organisation’s IT infrastructure. The goal of the plan is to minimise data loss and recovery time, and to return system integrity and availability to an acceptable level.

With the BIA, RA and impact analysis reports, we can identify the impacts of disruptive events, which sets the context for the RPO and RTO objectives.

Scope of a DRP

The scope of a DRP encompasses all critical IT systems and infrastructure that support the organisation's core business processes.

Objectives of the DRP

  1. Minimise downtime and data loss during a disaster event

  2. Ensure the timely restoration of critical systems and applications

  3. Maintain the integrity and availability of data and systems

  4. Provide clear guidance and direction to staff involved in the recovery process

  5. Comply with regulatory requirements and industry best practices

General contents of a disaster recovery plan

  1. Introduction

    • Purpose of the DRP

    • Scope of the plan

    • Objectives of the plan

  2. Roles and Responsibilities

    • DRP team members and their contact information

    • Roles and responsibilities of each team member

  3. Incident Response

    • Incident detection and reporting procedures

    • Incident classification and prioritisation

    • Communication plan (internal and external)

  4. Inventory

    • Inventory of critical IT systems and infrastructure

    • Identify backup tools and mechanisms for different workloads and infrastructure

    • Primary Site information

    • Secondary Site information

    • Off-site Backup Location

  5. Business Impact Analysis (BIA)

    • Identification of critical business processes

    • Recovery Time Objectives (RTOs)

    • Recovery Point Objectives (RPOs)

    • Prioritisation of recovery efforts

  6. IT Systems Recovery Procedures

    • Backup and data replication procedures

    • Step-by-step recovery procedures for each critical system

  7. Vendor and Third-Party Coordination

    • Contact information for key vendors and third-party service providers

    • Procedures for coordinating with vendors during a disaster

  8. Testing and Maintenance

    • Schedule for regular DRP drills, testing and exercises

    • Procedures for updating and maintaining the DRP

    • Post-incident review and lessons learned

  9. Appendices

    • Contact lists (employees, vendors, stakeholders)

    • System and network diagrams

    • Copies of critical documents and agreements
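The outline above can also serve as a checklist when reviewing a draft plan. The following sketch, which assumes the draft's top-level headings are available as a list of strings, reports which of the nine sections are missing. The `missing_sections` helper and the sample draft are hypothetical.

```python
# Top-level sections taken from the outline above.
REQUIRED_SECTIONS = [
    "Introduction",
    "Roles and Responsibilities",
    "Incident Response",
    "Inventory",
    "Business Impact Analysis (BIA)",
    "IT Systems Recovery Procedures",
    "Vendor and Third-Party Coordination",
    "Testing and Maintenance",
    "Appendices",
]


def missing_sections(plan_headings: list[str]) -> list[str]:
    # Compare case-insensitively and ignore surrounding whitespace.
    present = {heading.strip().lower() for heading in plan_headings}
    return [s for s in REQUIRED_SECTIONS if s.lower() not in present]


# Hypothetical draft that only has four of the nine sections so far.
draft_headings = ["Introduction", "Inventory", "incident response ", "Appendices"]
for section in missing_sections(draft_headings):
    print(f"Missing: {section}")
```

A check like this only confirms that a section exists, not that its content is adequate, so it complements rather than replaces a proper review.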

By developing a well-structured Disaster Recovery Plan that encompasses all the essential elements discussed in this blog post, organisations can significantly enhance their ability to recover from disruptive events and ensure business continuity.

Conclusion

Disaster Recovery Planning is crucial to an organisation's business continuity plan. Understanding the business's needs, risks, and potential impacts is key before creating such a plan. The overall cost and complexity of the disaster recovery plan may vary depending on these needs.

The first step is to document the disaster recovery plan. Regular testing and DR failover exercises during maintenance windows then help the IT team identify gaps in the plan and suggest improvements.

When major infrastructure is changed or added, the DR plan must be updated with the latest backup procedures and recovery mechanisms. A regular test plan should also be implemented.

Businesses should regularly audit their Disaster Recovery plans and DR drill documentation.
