
On December 13th, the team working on a large-scale mobile application project gathered in our Montreal office to tackle a terrible catastrophe (fortunately, a fictitious one). The scenario was as follows:
In 5 hours, our client is set to launch its biggest campaign of the decade, no less! But disaster strikes: AWS services are down across Canada. After conducting an urgent privacy impact assessment, or PIA (don’t forget Law 25), the decision is made to migrate the entire infrastructure to the United States.
Although this scenario is far-fetched, the exercise itself is incredibly valuable. In fact, such simulations are increasingly common in the industry. In the rest of this article, we’ll discuss the various benefits and share some tips to help you get the most out of these exercises.
Choosing the Right Disaster
Every system is unique, and the scenario you choose should be tailored to its specific characteristics. A scenario that is too challenging can be demoralizing, while one that is too simple might add little value. Beyond the technological maturity of the project, a critical factor in selecting a disaster scenario is the team composition. If the original creators of the system are no longer part of the team, it’s often necessary to lower the difficulty of the scenario.
It’s crucial to remember that the primary goal is to uncover the blind spots in the project.
Taking Notes
Many organizations have disaster recovery policies that have never been tested. The “game day” presents an ideal opportunity to put them to the test. Throughout the exercise, it’s essential to document pain points and necessary fixes. The aim of the exercise isn’t to solve every problem on the spot but to identify and document them so they can be addressed afterward.
Finally, taking notes only adds value if it leads to concrete actions. It’s vital to schedule a post-mortem session in the days following the exercise to calmly analyze the results of the simulation. Ideally, tangible actions related to the project and its policies should be incorporated into the next sprint.
Balancing Speed and Learning
Even if the scenario includes a time constraint, it’s important to make time for learning. For many team members, such an exercise is a unique opportunity to understand how certain system components work or to familiarize themselves with concepts that rarely come up in a developer’s daily routine. For instance, during our last exercise, a team member had the chance to grasp the full process of configuring DNS records.
Some moderation may be needed if someone becomes overly competitive and prioritizes speed alone. Conversely, adding some pressure can be beneficial if the team isn’t taking the exercise seriously enough. One of the main objectives remains testing the team’s resilience in difficult situations.
Conclusion
For a relatively low cost, disaster recovery exercises are an extremely valuable tool for any organization that values operational excellence. They help test the team’s resilience under stress, uncover and address blind spots in the project and internal policies, and preserve institutional knowledge that might otherwise fade over time. For these reasons, we’ve decided to include these exercises as an option in the maintenance plans we offer to our clients.
For those curious, the team managed to recreate the environment in 3 hours and 41 minutes, which (barring another catastrophe this year) keeps us compliant with our 99.95% SLO. This success was made possible through the use of modern technologies such as ECS/Fargate and CloudFormation, along with a robust database backup strategy.
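To see why a recovery of 3 hours and 41 minutes still fits within a 99.95% SLO, the arithmetic can be sketched as follows. This is only an illustration: it assumes the SLO is measured over a 365-day year and that the entire recovery time counts as downtime, neither of which is spelled out here, and the function and variable names are made up for the example.

```python
from datetime import timedelta

# Assumption: the SLO window is one 365-day year.
SLO_WINDOW = timedelta(days=365)
SLO_TARGET = 0.9995  # 99.95% availability


def downtime_budget(window: timedelta, target: float) -> timedelta:
    """Return the maximum downtime allowed in `window` for an availability `target`."""
    return window * (1 - target)


budget = downtime_budget(SLO_WINDOW, SLO_TARGET)
# Assumption: the full recovery time from the exercise counts as downtime.
recovery = timedelta(hours=3, minutes=41)
remaining = budget - recovery

print(f"Annual budget:    {budget.total_seconds() / 60:.1f} min")
print(f"Used by recovery: {recovery.total_seconds() / 60:.0f} min")
print(f"Remaining:        {remaining.total_seconds() / 60:.1f} min")
```

Under these assumptions the yearly budget is about 262.8 minutes (roughly 4 hours and 23 minutes), so a 221-minute recovery leaves about 42 minutes of margin. A second incident of similar size in the same year would exceed the objective.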
The exercise was still highly relevant! We identified poorly documented Lambda functions, an outdated SQS queue, and several other opportunities for improvement, which we will address in the coming weeks.