Scripts to Automate Disaster Recovery on Google Cloud

If there’s one thing that’s certain, it’s uncertainty.

Source — https://www.information-age.com/disaster-recovery-as-a-service-123481320/

Disaster Recovery (DR) ensures that the state of the primary system is preserved in at least one other region. If an unexpected outage or other disaster hits the primary system, active traffic and requests can be shifted to the secondary system through a predetermined setup. Establishing that setup is itself part of forming a DR strategy.

A good disaster recovery strategy begins with a business impact analysis that defines two key metrics:

Recovery Time Objective (RTO): The maximum length of time you can afford to have your business offline.

Recovery Point Objective (RPO): The maximum amount of data loss, usually expressed as a window of time, that you can sustain before you run into compliance issues.

DR patterns are considered to be cold, warm, or hot. These patterns indicate how readily the system can recover when something goes wrong: a cold pattern takes the longest to recover and a hot pattern the shortest.

In this post, I provide failover scripts that can be used during a DR failover for each of the DR patterns mentioned above.

Customer scenario: The customer runs a web application on Google Compute Engine (GCE) with its database on Cloud SQL, and keeps a Cloud SQL read replica in a secondary Google Cloud region.

Reference Architecture:

Two-tier web application reference architecture

In a cold pattern, you have minimal resources in the DR Google Cloud project — just enough to enable a recovery scenario. When there’s a problem that prevents the production environment from running production workloads, the failover strategy requires a mirror of the production environment to be started in Google Cloud. Clients then start using the services from the DR environment.

Prerequisites:

  • Access to GCP Project
  • Permission to execute script
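
Both prerequisites can be checked before a failover is attempted. The following is a minimal sketch, assuming the gcloud CLI is the tool used to run the failover. The function name `preflight` is hypothetical, and the check only confirms that a default project and an active account are configured; it does not confirm that the account holds the IAM roles the later steps need.

```shell
#!/bin/sh
# Pre-flight check sketch: confirms the gcloud CLI is installed and that
# a default project and an active account are configured.
# It does NOT verify IAM roles; that is left to the operator.

preflight() {
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud CLI not found on PATH" >&2
    return 1
  fi

  # Prints nothing on stdout when no default project is configured.
  project=$(gcloud config get-value project 2>/dev/null) || project=""
  if [ -z "$project" ]; then
    echo "no default project set (gcloud config set project PROJECT_ID)" >&2
    return 1
  fi

  # Lists only the currently active credentialed account.
  account=$(gcloud auth list --filter=status:ACTIVE \
    --format='value(account)' 2>/dev/null) || account=""
  if [ -z "$account" ]; then
    echo "no active account (gcloud auth login)" >&2
    return 1
  fi

  echo "project=$project account=$account"
}

preflight || echo "pre-flight failed; fix the item above before failing over" >&2
```

The same check applies unchanged to the warm and hot patterns, since their prerequisites are identical.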

Pseudo code:

  1. Promote read replica to standalone
  2. Add database connection string information to project-wide metadata
  3. Create a new disk from snapshot
  4. Create a new GCP instance
  5. Switch DNS
  6. Start Application

Cold DR Failover Script:

Important note: Test the script below in a non-production environment before using it in production.
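
What follows is not the original script but a minimal sketch of the six cold-pattern steps, assuming the gcloud CLI and a Linux image that runs the application as a systemd service. Every resource name (`app-db-replica`, `app-boot-snapshot`, `app-dr-disk`, `app-dr-vm`, `app-zone`, `app.example.com.`, `myapp`), the `db-host` metadata key, and the `DRY_RUN` switch are hypothetical. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Cold DR failover sketch. Every resource name below is a hypothetical
# placeholder; override them through environment variables.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them
DR_ZONE="${DR_ZONE:-us-east1-b}"
REPLICA="${REPLICA:-app-db-replica}"
SNAPSHOT="${SNAPSHOT:-app-boot-snapshot}"
DISK="${DISK:-app-dr-disk}"
INSTANCE="${INSTANCE:-app-dr-vm}"
DNS_ZONE="${DNS_ZONE:-app-zone}"
DNS_NAME="${DNS_NAME:-app.example.com.}"
APP_SERVICE="${APP_SERVICE:-myapp}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then printf '%s\n' "+ $*"; else "$@"; fi
}

# Return a placeholder in dry-run mode, otherwise the command's output.
lookup() {
  placeholder="$1"; shift
  if [ "$DRY_RUN" = "1" ]; then printf '%s\n' "$placeholder"; else "$@"; fi
}

cold_failover() {
  # 1. Promote the read replica to a standalone instance (cannot be undone).
  run gcloud sql instances promote-replica "$REPLICA" --quiet

  # 2. Publish the database address in project-wide metadata.
  db_ip=$(lookup "DB_IP" gcloud sql instances describe "$REPLICA" \
    --format='value(ipAddresses[0].ipAddress)')
  run gcloud compute project-info add-metadata --metadata="db-host=$db_ip"

  # 3. Create a new disk from the snapshot in the DR zone.
  run gcloud compute disks create "$DISK" \
    --source-snapshot="$SNAPSHOT" --zone="$DR_ZONE"

  # 4. Create the instance, booting from that disk.
  run gcloud compute instances create "$INSTANCE" \
    --zone="$DR_ZONE" --disk="name=$DISK,boot=yes"

  # 5. Point the DNS A record at the new instance's external IP.
  vm_ip=$(lookup "VM_IP" gcloud compute instances describe "$INSTANCE" \
    --zone="$DR_ZONE" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)')
  run gcloud dns record-sets update "$DNS_NAME" \
    --zone="$DNS_ZONE" --type=A --ttl=60 --rrdatas="$vm_ip"

  # 6. Start the application on the new instance.
  run gcloud compute ssh "$INSTANCE" --zone="$DR_ZONE" \
    --command="sudo systemctl start $APP_SERVICE"
}

cold_failover
```

Setting `DRY_RUN=0` executes the commands for real. In that mode step 6 may need a short wait or a retry, because SSH is usually not reachable the moment the instance is created.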

A warm pattern is typically implemented to keep RTO and RPO values as small as possible without the effort and expense of a fully HA configuration. The smaller the RTO and RPO value, the higher the costs as you approach having a fully redundant environment that can serve traffic from two environments. Therefore, implementing a warm pattern for your DR scenario is a good trade-off between budget and availability.

Prerequisites:

  • Access to GCP Project
  • Permission to execute script

Pseudo code:

  1. Promote read replica to standalone
  2. Add database connection string information to project-wide metadata
  3. Create a new disk, detach the existing disk, attach the new disk, and boot the instance
  4. Switch DNS
  5. Start Application

Warm DR Failover Script:

Important note: Test the script below in a non-production environment before using it in production.
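
What follows is not the original script but a minimal sketch of the warm-pattern steps, assuming the gcloud CLI, a DR instance that already exists in a stopped state, and a new boot disk created from a snapshot (the pseudo code does not say where the new disk comes from, so the snapshot is an assumption). All resource names, the `db-host` metadata key, and the `DRY_RUN` switch are hypothetical. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Warm DR failover sketch. Assumes the DR instance exists and is STOPPED.
# Every resource name below is a hypothetical placeholder.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them
DR_ZONE="${DR_ZONE:-us-east1-b}"
REPLICA="${REPLICA:-app-db-replica}"
SNAPSHOT="${SNAPSHOT:-app-boot-snapshot}"
OLD_DISK="${OLD_DISK:-app-dr-disk-old}"
NEW_DISK="${NEW_DISK:-app-dr-disk-new}"
INSTANCE="${INSTANCE:-app-dr-vm}"
DNS_ZONE="${DNS_ZONE:-app-zone}"
DNS_NAME="${DNS_NAME:-app.example.com.}"
APP_SERVICE="${APP_SERVICE:-myapp}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then printf '%s\n' "+ $*"; else "$@"; fi
}

# Return a placeholder in dry-run mode, otherwise the command's output.
lookup() {
  placeholder="$1"; shift
  if [ "$DRY_RUN" = "1" ]; then printf '%s\n' "$placeholder"; else "$@"; fi
}

warm_failover() {
  # 1. Promote the read replica to a standalone instance (cannot be undone).
  run gcloud sql instances promote-replica "$REPLICA" --quiet

  # 2. Publish the database address in project-wide metadata.
  db_ip=$(lookup "DB_IP" gcloud sql instances describe "$REPLICA" \
    --format='value(ipAddresses[0].ipAddress)')
  run gcloud compute project-info add-metadata --metadata="db-host=$db_ip"

  # 3. Swap the boot disk: create the new one, detach the old one,
  #    attach the new one as the boot disk, then boot the instance.
  run gcloud compute disks create "$NEW_DISK" \
    --source-snapshot="$SNAPSHOT" --zone="$DR_ZONE"
  run gcloud compute instances detach-disk "$INSTANCE" \
    --disk="$OLD_DISK" --zone="$DR_ZONE"
  run gcloud compute instances attach-disk "$INSTANCE" \
    --disk="$NEW_DISK" --boot --zone="$DR_ZONE"
  run gcloud compute instances start "$INSTANCE" --zone="$DR_ZONE"

  # 4. Point the DNS A record at the instance's external IP.
  vm_ip=$(lookup "VM_IP" gcloud compute instances describe "$INSTANCE" \
    --zone="$DR_ZONE" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)')
  run gcloud dns record-sets update "$DNS_NAME" \
    --zone="$DNS_ZONE" --type=A --ttl=60 --rrdatas="$vm_ip"

  # 5. Start the application.
  run gcloud compute ssh "$INSTANCE" --zone="$DR_ZONE" \
    --command="sudo systemctl start $APP_SERVICE"
}

warm_failover
```

Compared with the cold sketch, the instance is not created during the failover; only its boot disk is replaced, which is where the warm pattern saves recovery time.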

A hot pattern is typically implemented to keep Recovery Time Objective (RTO) and Recovery Point Objective (RPO) values near zero. Hot DR is best suited to mission-critical applications and is vital for core financial and banking applications.

Prerequisites:

  • Access to GCP Project
  • Permission to execute script

Pseudo code:

  1. Promote read replica to standalone
  2. Add database connection string information to project-wide metadata
  3. Create and force attach disk
  4. Switch DNS
  5. Start Application

Hot DR Failover Script:

Important note: Test the script below in a non-production environment before using it in production.
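
What follows is not the original script but a minimal sketch of the hot-pattern steps, assuming the gcloud CLI and a standby instance that is already running in the DR region. It reads "create and force attach disk" as creating a regional persistent disk from a snapshot and attaching it with `--force-attach`; that reading is an assumption, and the flag's availability should be checked against the installed gcloud version. All resource names, the `db-host` metadata key, and the `DRY_RUN` switch are hypothetical. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Hot DR failover sketch. Assumes a standby instance is already RUNNING.
# Every resource name below is a hypothetical placeholder.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them
DR_REGION="${DR_REGION:-us-east1}"
DR_ZONE="${DR_ZONE:-us-east1-b}"
REPLICA_ZONES="${REPLICA_ZONES:-us-east1-b,us-east1-c}"
REPLICA="${REPLICA:-app-db-replica}"
SNAPSHOT="${SNAPSHOT:-app-data-snapshot}"
DISK="${DISK:-app-dr-data-disk}"
INSTANCE="${INSTANCE:-app-dr-vm}"
DNS_ZONE="${DNS_ZONE:-app-zone}"
DNS_NAME="${DNS_NAME:-app.example.com.}"
APP_SERVICE="${APP_SERVICE:-myapp}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then printf '%s\n' "+ $*"; else "$@"; fi
}

# Return a placeholder in dry-run mode, otherwise the command's output.
lookup() {
  placeholder="$1"; shift
  if [ "$DRY_RUN" = "1" ]; then printf '%s\n' "$placeholder"; else "$@"; fi
}

hot_failover() {
  # 1. Promote the read replica to a standalone instance (cannot be undone).
  run gcloud sql instances promote-replica "$REPLICA" --quiet

  # 2. Publish the database address in project-wide metadata.
  db_ip=$(lookup "DB_IP" gcloud sql instances describe "$REPLICA" \
    --format='value(ipAddresses[0].ipAddress)')
  run gcloud compute project-info add-metadata --metadata="db-host=$db_ip"

  # 3. Create a regional disk and force attach it to the standby instance.
  run gcloud compute disks create "$DISK" --region="$DR_REGION" \
    --replica-zones="$REPLICA_ZONES" --source-snapshot="$SNAPSHOT"
  run gcloud compute instances attach-disk "$INSTANCE" --disk="$DISK" \
    --disk-scope=regional --force-attach --zone="$DR_ZONE"

  # 4. Point the DNS A record at the standby instance's external IP.
  vm_ip=$(lookup "VM_IP" gcloud compute instances describe "$INSTANCE" \
    --zone="$DR_ZONE" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)')
  run gcloud dns record-sets update "$DNS_NAME" \
    --zone="$DNS_ZONE" --type=A --ttl=60 --rrdatas="$vm_ip"

  # 5. Start the application.
  run gcloud compute ssh "$INSTANCE" --zone="$DR_ZONE" \
    --command="sudo systemctl start $APP_SERVICE"
}

hot_failover
```

Because the standby instance is already running, no instance is created or booted during the failover, which is what keeps recovery time low in this pattern.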

Conclusion

If your application is deployed on Google Cloud and you have a specific budget within which to meet your RTO and RPO values, then use these DR patterns!

Risk always exists, whether you plan for it or not. If you don’t, then you are accepting that risk, whether you like it or not.

References

Disaster Recovery Planning Guide https://cloud.google.com/architecture/dr-scenarios-planning-guide
Backup and Disaster Recovery https://cloud.google.com/solutions/backup-dr

Customer Engineer, Network Specialist @ Google Cloud. I help customers transform their business using Google’s global network and software infrastructure.