What if the most important pillar of system resilience isn't your architecture, but your culture? Learn to build systems that withstand any software storm.
#1 about 3 minutes
The business necessity of system resilience
An e-commerce site failure during a Black Friday sale illustrates how downtime leads to financial loss and why resilience is essential.
#2 about 5 minutes
Understanding faults, failures, and tolerance mechanisms
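A fault is a latent defect in the code, while a failure is the observable malfunction that the fault causes at runtime. Fault tolerance keeps the system running despite a fault, and fail-safe design ensures that when a failure does occur, the system ends up in a safe state.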
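The distinction can be made concrete with a minimal Python sketch. The function names and the default value are hypothetical and do not come from the talk; the point is only that the fault sits unnoticed in the code until a particular input turns it into a failure, and that a tolerant wrapper can contain it.

```python
def parse_quantity(raw: str) -> int:
    # Latent fault: int() raises ValueError for non-numeric input,
    # but nothing goes wrong until such input actually arrives.
    return int(raw)


def parse_quantity_tolerant(raw: str, default: int = 1) -> int:
    # Fault tolerance (hypothetical policy): contain the error and fall
    # back to a safe default instead of letting the request crash.
    try:
        return parse_quantity(raw)
    except ValueError:
        return default
```

Whether a silent default is acceptable depends on the domain; in a payment flow, rejecting the request outright may be the safer fail-safe behavior.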
#3 about 5 minutes
Navigating the challenges of modern software development
Modern systems face challenges from increasing complexity, evolving technology, and high user expectations, requiring a balance to avoid over-engineering.
#4 about 3 minutes
Building resilience across all software stack layers
True resilience requires a holistic approach that addresses the infrastructure, application, and database layers, as well as the crucial human layer of team culture.
#5 about 4 minutes
Core strategies for building resilient systems
Key architectural strategies for resilience include implementing redundancy, failover mechanisms, load balancing, and using availability zones or microservices.
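As a rough illustration of redundancy combined with failover, the sketch below tries a list of redundant replicas in order and returns the first successful response. It assumes each replica can be modeled as a plain callable; `call_with_failover`, `AllReplicasFailed`, and the two demo replicas are hypothetical names, not part of the talk or of any specific library.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when every redundant replica has failed."""


def call_with_failover(replicas: Sequence[Callable[[], str]]) -> str:
    # Try each replica in order and return the first successful result.
    errors = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:  # broad on purpose for this sketch
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed")


def primary() -> str:
    # Hypothetical primary instance that is currently down.
    raise ConnectionError("primary unreachable")


def secondary() -> str:
    # Hypothetical standby instance that is healthy.
    return "ok from secondary"


result = call_with_failover([primary, secondary])
```

A production setup would normally delegate this to a load balancer or service mesh with health checks and timeouts; the loop only shows the principle.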
#6 about 5 minutes
Implementing disaster recovery and secure coding practices
Proactive resilience involves creating a disaster recovery plan through risk assessment and empowering developers to contribute through secure coding practices.
#7 about 7 minutes
Using monitoring and continuous testing for improvement
A continuous improvement cycle is driven by monitoring system health, using automated testing to catch issues early, and analyzing failures to learn from them.
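One simple way to monitor system health is to track the error rate over a sliding window of recent requests. The sketch below assumes that approach; the class name, window size, and threshold are illustrative choices rather than anything prescribed in the talk.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks the outcomes of the most recent requests."""

    def __init__(self, window: int = 100, threshold: float = 0.05) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.results = deque(maxlen=window)
        self.threshold = threshold

    def record(self, ok: bool) -> None:
        self.results.append(ok)

    def error_rate(self) -> float:
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def is_healthy(self) -> bool:
        # Healthy as long as the error rate stays at or below the threshold.
        return self.error_rate() <= self.threshold
```

An alerting rule or an automated test could then assert `is_healthy()` after a deployment, which ties monitoring back into the improvement cycle.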
#8 about 2 minutes
A practical starting point for individual developers
Developers can significantly impact resilience by focusing on core software quality attributes like performance, security, scalability, and maintainability.
#9 about 3 minutes
Adopting a proactive mindset for future resilience
The future of resilience lies in a proactive approach, embracing innovations like AI for predictive failure analysis and fostering a culture of continuous adaptation.
#10 about 4 minutes
Balancing security practices with system performance
Security and performance are not a fixed either-or trade-off; the right balance between them depends on the specific context and priorities of the system.
#11 about 4 minutes
Prioritizing components when designing for resilience
Focus resilience efforts on foundational components like infrastructure and architecture, as these "shearing layers" are the most difficult and costly to change later.
#12 about 5 minutes
Communicating the value of resilience to stakeholders
To get buy-in from decision-makers, present a data-driven business case that clearly documents the financial losses and risks associated with poor resilience.
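A data-driven case can start with a simple estimate of lost revenue per outage. The sketch below assumes revenue accrues evenly over time, which understates peak events such as a Black Friday sale; the function name and the figures in the usage line are hypothetical placeholders, not data from the talk.

```python
def downtime_cost(revenue_per_hour: float, outage_minutes: float) -> float:
    # Direct revenue lost during an outage, assuming revenue is spread
    # evenly across the hour (a simplification).
    return revenue_per_hour * outage_minutes / 60


# Hypothetical numbers: 60,000 per hour in revenue, 30-minute outage.
estimated_loss = downtime_cost(revenue_per_hour=60_000, outage_minutes=30)
```

Indirect costs such as lost customer trust or support effort are not captured here and would need to be documented separately.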