Speaker "Dmitry Chornyi" Details

 

Topic

Reliability and Resilience Patterns

Abstract

Any software engineer who is responsible for operating software in production has war stories about network partitions, failing dependencies, middle-of-the-night pager alerts, and post-mortems. Microservice architectures promise systems that can contain failure, degrade gracefully, and remain available: systems that are prepared for production challenges and do not require constant life support and human intervention. To achieve this, we need to learn to build reliable systems out of unreliable components. In this talk we will explore patterns that help your systems withstand the many challenges a production environment throws at them: latency, timeouts, queuing, bugs, resource contention, load spikes, and dependency failures. I will share practical advice on how we use bulkheads, circuit breakers, load shedding, fallbacks, failure injection testing, and other tools to build reliable and scalable microservices. I will also discuss how culture influences our ability to design for failure and to learn from operations, so that we can build systems and organizations that improve over time rather than merely avoid degrading.

Profile

Dmitry Chornyi is a Principal Software Engineer at OpenTable, where he leads a team responsible for over 20 production microservices.