The Two Metrics You Need

Andrew Kane · April 30, 2015, 7 a.m.
When interviewing candidates for Instacart’s first site reliability engineer, I volunteered to cover monitoring as one of my topics. I’d start by asking, “What metrics should we be monitoring?” One candidate gave an answer that astounded me. He said,

“There are only two things I care about: errors and latency.”

More specifically:

- the sum of 5xx status codes
- latency across all requests - average or 95th percentile

Both must be measured at the load balancer. Errors include those generated by the ...
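To make the two metrics concrete, here is a minimal Python sketch of how they might be computed from load balancer request records. The `Request` type, its field names, and the nearest-rank percentile method are assumptions made for illustration; the post does not say how the numbers were actually collected or calculated.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """One request as seen by the load balancer (hypothetical record shape)."""
    status: int        # HTTP status code returned to the client
    latency_ms: float  # time taken to serve the request, in milliseconds


def error_count(requests):
    """Count responses with a 5xx status code."""
    return sum(1 for r in requests if 500 <= r.status <= 599)


def average_latency(requests):
    """Mean latency across all requests, successful or not."""
    if not requests:
        raise ValueError("no requests to measure")
    return sum(r.latency_ms for r in requests) / len(requests)


def percentile_latency(requests, pct=95):
    """Latency at the given percentile, using the nearest-rank method."""
    if not requests:
        raise ValueError("no requests to measure")
    ordered = sorted(r.latency_ms for r in requests)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Made-up sample data, only to exercise the functions above.
sample = [
    Request(200, 10.0),
    Request(200, 20.0),
    Request(500, 30.0),
    Request(503, 40.0),
    Request(404, 100.0),
]
```

On this sample, `error_count(sample)` counts only the 500 and 503 responses; the 404 is a client error and is left out. The 95th percentile is pulled toward the slowest request, which is why it can tell a different story from the average.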