Speaker "Kolton Andrus" Details

 

Topic

Breaking Things on Purpose

Abstract

Failure testing prepares us, both socially and technically, for how our systems will behave in the face of failure. By testing proactively, we can find and fix problems before they become crises. Practice makes perfect, yet a real calamity is not a good time for training. Knowing how our systems fail is paramount to building a resilient service. At Netflix and Amazon, we ran failure exercises on a regular basis to ensure we were prepared. These experiments helped us find problems and saved us from future incidents. Come and learn how to run an effective “Game Day” and safely test in production. Then sleep peacefully knowing you are ready!

Profile

Kolton Andrus is the founder and CEO of Gremlin Inc., which provides ‘Failure as a Service’ to help companies build more resilient systems. Previously, he was a Chaos Engineer at Netflix, improving streaming reliability and operating the Edge services. He designed and built F.I.T., Netflix’s failure injection service. Before that, he improved the performance and reliability of the Amazon Retail website. At both companies he served as a ‘Call Leader’, managing the resolution of company-wide incidents. Kolton is passionate about building resilient systems, as it lets him break things for fun and profit.