Donny Nadolny is a developer at PagerDuty. He has been using Java for many years, becoming a Sun Certified Java Programmer (for Java 1.4) even before getting his driver's license, and is always interested in talking about distributed systems.
For three years PagerDuty has run "Failure Friday", a weekly exercise that injects simple failures, like killing a process or adding network latency (in our production environment!), to expose problems in our systems and alerting. This talk will share what we've learned in that time: how our fault injection techniques have changed, the best way to get started injecting failures at your company, and how you can use fault injection to improve your software's reliability.