William is the cofounder of Buoyant, a startup that specializes in open source reliability software for cloud-native applications. Prior to Buoyant, he was an engineer at Twitter, where he helped migrate the site from a monolithic Ruby on Rails app to a massively distributed microservice architecture.
Modern application architectures are shifting to the "cloud-native" model: containerized, multi-service, and orchestrated in environments like Kubernetes and Mesos. In this new world, where cross-service communication is a critical part of application behavior, the requirement for resilient applications becomes a requirement for resilient communication.
In this talk, we introduce the notion of a "service mesh": an infrastructure layer for cross-service communication, designed to handle unexpected load, manage tail latencies, and degrade gracefully in the presence of component failure. We describe an open source implementation called linkerd, a lightweight HTTP router and load balancer built on Finagle and Netty, used in production today at banks, AI startups, government labs, and more. We detail linkerd’s multi-layered approach to handling failure (and its pernicious cousin, latency), including latency-aware load balancing, failure accrual, deadline propagation, retry budgets, and nacking. Finally, we describe linkerd’s routing model and show how it can be used for complex traffic shifting strategies, including ad-hoc staging clusters, blue-green deploys, and cross-datacenter failover.
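
To make one of these mechanisms concrete, the idea behind a retry budget can be sketched in a few lines of Python. This is an illustrative sketch only, not linkerd's or Finagle's implementation: the class name `RetryBudget`, its methods, and the 20% figure are all hypothetical. The assumption is that every original request deposits a fraction of a retry into a shared balance and every retry withdraws a whole one, so retries can never exceed a fixed percentage of real traffic.

```python
class RetryBudget:
    """Hypothetical retry budget: retries are capped at a percentage of requests.

    The balance is kept in integer hundredths of a retry so that the
    arithmetic is exact (no floating-point drift).
    """

    COST_PER_RETRY = 100  # one retry costs 100 units

    def __init__(self, retry_percent: int, reserve: int = 0) -> None:
        # retry_percent: retries allowed per 100 original requests (assumed knob)
        # reserve: retries available before any traffic has been seen
        self.retry_percent = retry_percent
        self._balance = reserve * self.COST_PER_RETRY

    def record_request(self) -> None:
        """Call once per original (non-retry) request."""
        self._balance += self.retry_percent

    def try_retry(self) -> bool:
        """Return True and spend from the budget if a retry is allowed."""
        if self._balance >= self.COST_PER_RETRY:
            self._balance -= self.COST_PER_RETRY
            return True
        return False


budget = RetryBudget(retry_percent=20)
for _ in range(10):
    budget.record_request()

# Ten requests at 20% fund exactly two retries; the third is refused.
outcomes = [budget.try_retry() for _ in range(3)]
```

The point of the design is that under a widespread failure, retries stay proportional to real traffic instead of multiplying it, which is what keeps a struggling service from being pushed further into overload. A production budget would typically also expire old deposits over a time window; that is omitted here for brevity.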