Mastering Disaster Recovery - Part 3: Business Continuity and Disaster Recovery Planning
In my quest to master disaster recovery, I discovered that backups are a crucial part of any disaster recovery plan. But what exactly is disaster recovery planning?
In this post, let's explore what a disaster recovery plan is, why we need it, and how it contributes to an organisation's broader business continuity plan.
The objective is business continuity
Business continuity is an organisation's capability to keep delivering its products and services after a disruptive event. Disruptive events include natural disasters such as earthquakes, floods and solar storms, and man-made events such as terror attacks, arson, security breaches and data breaches. They can also be socio-economic, such as a global recession, trade deficits, or the loss of customers.
Organisations plan and build systems to prevent, and prepare to recover from, any of the threats above. The business continuity plan enumerates a range of disaster scenarios and lists the roles, responsibilities and steps needed to resume normal business. The key focus is on preparedness, protection, response and recovery strategies.
While business continuity focuses on the overall resilience of an organisation, disaster recovery planning is a crucial subset that specifically addresses the restoration of IT systems and technology operations.
Disaster recovery planning is a subset of Business Continuity
Business continuity (BC) and disaster recovery (DR), often collectively referred to as BCDR, are two related but distinct approaches to ensuring an organisation's resilience. BC is a proactive strategy that focuses on minimising risks and ensuring the organisation can continue to operate and deliver products and services during a disaster by defining ways for employees to continue their work. DR, on the other hand, is a reactive subset of BC that concentrates on the specific steps needed to resume IT systems and technology operations, and it is put into action only when a disaster actually strikes.
To develop an effective disaster recovery plan, it is essential to first understand what the business needs to protect. This involves creating an inventory of critical assets, identifying stakeholders, and cataloging procedure and business process documents.
Understand what the business needs to protect
Create an Inventory
What do we need to protect?
IT Equipment: Catalog all hardware, software, and network infrastructure, including servers, workstations, routers, and storage devices.
Who are we dependent on?
Contractors: List all external contractors and their roles in maintaining or supporting the organisation's IT systems.
3rd Party Vendors: Identify all vendors providing essential services, products, or support to the organisation's IT operations.
Where do we keep data and where do we back up to?
Primary Sites: Document the main locations where business operations and IT systems are housed.
Recovery Sites: Identify alternative locations that can be used to resume operations in the event of a disaster.
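The inventory questions above map naturally onto a small data structure. Here is a minimal sketch, assuming a Python dataclass per asset. The field names, the `unprotected` helper and the sample entries are all hypothetical and only illustrate the idea of recording, for each asset, where it runs, where it can be recovered, where it is backed up, and who it depends on.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    """One inventory entry (field names are illustrative assumptions)."""
    name: str
    kind: str                              # e.g. "server", "router", "storage"
    primary_site: str                      # where the asset normally runs
    recovery_site: Optional[str] = None    # where it can be resumed, if anywhere
    backup_location: Optional[str] = None  # where its data is backed up to
    vendors: List[str] = field(default_factory=list)  # external dependencies


def unprotected(assets):
    """Return names of assets that lack a recovery site or a backup location."""
    return [a.name for a in assets
            if a.recovery_site is None or a.backup_location is None]


# Hypothetical sample data, for illustration only.
inventory = [
    Asset("orders-db", "server", "dc-east", "dc-west", "offsite-vault", ["ExampleHosting"]),
    Asset("file-share", "storage", "dc-east", None, "offsite-vault"),
    Asset("edge-router", "router", "dc-east", "dc-west", None, ["ExampleNetVendor"]),
]

print(unprotected(inventory))  # ['file-share', 'edge-router']
```

Even a simple check like this makes gaps visible early: any asset without a recovery site or backup location is a candidate for discussion with the business before the plan is written.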
Identify Stakeholders and key responsibilities
Who are the decision makers?
Executive Management: Support, approve, and communicate the disaster recovery plan; make critical decisions during disasters.
Business Unit Leaders: Collaborate with IT to identify critical processes; provide input for the Business Impact Analysis (BIA); develop business continuity plans; ensure team awareness; participate in testing and training; coordinate during disasters.
Who will implement and maintain it?
IT Department: Develop, implement, and maintain technical aspects of the plan; prioritise IT system recovery; ensure data integrity and availability; conduct testing and updates.
Catalog Procedure Documents and Business Process Documents:
Procedure Documents: Create and maintain a comprehensive set of documents detailing the step-by-step procedures for critical IT operations, such as system backups, data restoration, and emergency response protocols.
Business Process Documents: Document all essential business processes, including their dependencies on IT systems, to ensure a clear understanding of how technology supports the organisation's operations.
Once the critical assets and stakeholders have been identified, it is crucial to collaborate with the business to determine the potential impact of a disaster. This involves conducting a Business Impact Analysis (BIA), Threat and Risk Analysis (RA), and Impact Analysis.
Collaborate with the business to determine the impact of a disaster
Business Impact Analysis (BIA):
Identify and prioritise critical business functions and processes.
Determine the potential impact of disruptions on each function or process, including financial losses, reputation damage, and regulatory consequences.
Establish recovery time objectives (RTOs) and recovery point objectives (RPOs) for each critical function or process.
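To make RTO and RPO concrete, here is a minimal sketch that compares them against a backup schedule and a restore estimate. It assumes a simplified model in which the worst-case data loss equals the interval between backups. The function names and the figures are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Simplified model: worst-case data loss is the time since the last
    backup, so the backup interval must not exceed the RPO."""
    return backup_interval <= rpo


def meets_rto(estimated_restore: timedelta, rto: timedelta) -> bool:
    """The estimated (or drill-measured) restore time must fit within the RTO."""
    return estimated_restore <= rto


# Hypothetical targets for one critical business function.
rpo = timedelta(hours=1)
rto = timedelta(hours=4)

# Nightly backups cannot satisfy a 1-hour RPO.
print(meets_rpo(timedelta(hours=24), rpo))  # False
# A 3-hour restore fits within a 4-hour RTO.
print(meets_rto(timedelta(hours=3), rto))   # True
```

In this example the restore time is acceptable, but the backup frequency is not, which would point towards more frequent backups or replication for that function.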
Threat and Risk Analysis (RA):
Identify potential threats to the organisation's IT systems and operations, such as natural disasters, cyber-attacks, and equipment failures.
Assess the likelihood and potential impact of each threat.
Develop strategies to mitigate or eliminate identified risks.
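One common way to combine likelihood and impact is a simple risk matrix. The sketch below assumes a 1 to 5 scale for both and multiplies them to get a score. That scale, the `risk_score` function and the ratings are illustrative assumptions, not a prescribed method.

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Score a threat as likelihood x impact, both on an assumed 1-5 scale."""
    for value in (likelihood, impact):
        if not 1 <= value <= 5:
            raise ValueError("likelihood and impact must be between 1 and 5")
    return likelihood * impact


# Hypothetical ratings: threat -> (likelihood, impact).
threats = {
    "natural disaster": (2, 5),
    "cyber-attack": (4, 5),
    "equipment failure": (4, 2),
}

# Rank threats from highest to lowest score to decide where to mitigate first.
ranked = sorted(threats, key=lambda t: risk_score(*threats[t]), reverse=True)
print(ranked)  # ['cyber-attack', 'natural disaster', 'equipment failure']
```

The absolute numbers matter less than the ordering: the ranking gives the business and IT a shared starting point for deciding which risks to mitigate first.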
Impact Analysis:
Evaluate the consequences of disruptions on the organisation's overall operations, including the impact on employees, customers, and stakeholders.
Identify the interdependencies between various business functions and IT systems.
Determine the resources required to maintain critical operations during a disaster event and to recover from disruptions.
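Interdependencies also determine the order in which systems can be brought back. As a rough sketch, a dependency map can be turned into a recovery order with a topological sort, here using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+). The system names and their dependencies are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical map: each system -> the systems it depends on.
depends_on = {
    "order processing": {"web app"},
    "web app": {"database", "network"},
    "database": {"storage", "network"},
    "storage": set(),
    "network": set(),
}

# static_order() yields every system after all of its dependencies,
# which is the order in which they can be restored.
recovery_order = list(TopologicalSorter(depends_on).static_order())
print(recovery_order)
```

Systems with no dependencies, such as storage and network here, come first, and the business function that relies on everything else comes last. If the map contains a circular dependency, `static_order()` raises `graphlib.CycleError`, which is itself a useful finding for the impact analysis.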
With a clear understanding of the potential impact of a disaster and the critical assets that need protection, the organisation can finally build a comprehensive Disaster Recovery Plan (DRP). The DRP should outline the scope, objectives, and general contents necessary for effective disaster recovery.
Finally build the Disaster Recovery Plan
A disaster recovery plan is a document that describes how to resume operations after a disruption. It mainly covers an organisation’s IT infrastructure. Its goal is to minimise data loss and recovery time, and to ensure that system integrity and availability are returned to an acceptable level.
With the BIA, RA and impact analysis reports, we can identify the impacts of disruptive events and set the context for the RPO and RTO targets.
Scope of a DRP
The scope of a DRP encompasses all critical IT systems and infrastructure that support the organisation's core business processes.
Objectives of the DRP
Minimise downtime and data loss during a disaster event
Ensure the timely restoration of critical systems and applications
Maintain the integrity and availability of data and systems
Provide clear guidance and direction to staff involved in the recovery process
Comply with regulatory requirements and industry best practices
General contents of a disaster recovery plan
Introduction
Purpose of the DRP
Scope of the plan
Objectives of the plan
Roles and Responsibilities
DRP team members and their contact information
Roles and responsibilities of each team member
Incident Response
Incident detection and reporting procedures
Incident classification and prioritisation
Communication plan (internal and external)
Inventory
Inventory of critical IT systems and infrastructure
Backup tools and mechanisms for different workloads and infrastructure
Primary Site information
Secondary Site information
Off-site Backup Location
Business Impact Analysis (BIA)
Identification of critical business processes
Recovery Time Objectives (RTOs)
Recovery Point Objectives (RPOs)
Prioritisation of recovery efforts
IT Systems Recovery Procedures
Backup and data replication procedures
Step-by-step recovery procedures for each critical system
Vendor and Third-Party Coordination
Contact information for key vendors and third-party service providers
Procedures for coordinating with vendors during a disaster
Testing and Maintenance
Schedule for regular DRP drills, testing and exercises
Procedures for updating and maintaining the DRP
Post-incident review and lessons learned
Appendices
Contact lists (employees, vendors, stakeholders)
System and network diagrams
Copies of critical documents and agreements
By developing a well-structured Disaster Recovery Plan that encompasses all the essential elements discussed in this blog post, organisations can significantly enhance their ability to recover from disruptive events and ensure business continuity.
Conclusion
Disaster Recovery Planning is crucial to an organisation's business continuity plan. Understanding the business's needs, risks, and potential impacts is key before creating such a plan. The overall cost and complexity of the disaster recovery plan may vary depending on these needs.
The first step is to document the disaster recovery plan. Regular testing and DR failover during maintenance windows can help the IT team identify gaps in the plan and suggest improvements.
When major infrastructure is changed or added, the DR plan must be updated with the latest backup procedures and recovery mechanisms. A regular test plan should also be implemented.
Businesses should regularly audit their Disaster Recovery plans and DR drill documentation.