Pattern: use SLA to determine when to start load shedding
August 16th, 2022
Turn away load we can't complete in time.
Where? Load balancers
How? Health check + SLA
A good health check on the first tier of services can tell the load balancer when response times are too high, in other words, higher than the SLA allows.
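A minimal sketch of that idea, assuming a 500 ms SLA and a rolling window of recent response times. The names `SLA_SECONDS`, `LatencyWindow`, and `health_status` are made up for illustration; the health check answers 200 while the 95th percentile stays within the SLA and 503 once it does not.

```python
from collections import deque
from statistics import quantiles

SLA_SECONDS = 0.5  # assumed SLA, chosen only for this example


class LatencyWindow:
    """Keeps the most recent response times, in seconds."""

    def __init__(self, size=100):
        # Old samples fall off automatically once the window is full.
        self.samples = deque(maxlen=size)

    def record(self, seconds):
        self.samples.append(seconds)

    def p95(self):
        if len(self.samples) < 2:
            # Too little data for a percentile; use what we have.
            return self.samples[0] if self.samples else 0.0
        # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
        return quantiles(self.samples, n=20)[-1]


def health_status(window, sla=SLA_SECONDS):
    """Return the HTTP status the health check endpoint should answer with."""
    return 200 if window.p95() <= sla else 503
```

Using a percentile rather than the mean is a design choice here: a handful of very slow requests should not be hidden by many fast ones.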
Own response time
Services can measure their own response time to help with this.
They can also check their own operational state to see if requests will be answered in a timely fashion.
For instance, monitoring the degree of contention for a connection pool allows a service to estimate wait times.
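As a rough sketch of the connection pool case: if every connection is checked out, a new request waits behind the callers already queued, and connections come free at roughly `pool_size / avg_hold_seconds` per second. The function names and the formula are a back-of-the-envelope assumption, not a property of any particular pool library.

```python
def estimated_wait(in_use, pool_size, waiting, avg_hold_seconds):
    """Rough estimate of how long a new request waits for a connection."""
    if in_use < pool_size:
        # A connection is idle, so there is no wait.
        return 0.0
    # The new request sits behind `waiting` callers; the pool releases
    # about pool_size connections every avg_hold_seconds.
    return (waiting + 1) * avg_hold_seconds / pool_size


def can_meet_sla(wait_seconds, expected_work_seconds, sla_seconds):
    """True if queueing plus the work itself still fits inside the SLA."""
    return wait_seconds + expected_work_seconds <= sla_seconds
```

A service could feed `can_meet_sla` into its health check, or use it per request to reject work early instead of queueing it.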
Dependency response times
Likewise, a service can check response times on its own dependencies. If those dependencies are too slow and are required, then the health check should show that this service is unavailable.
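One way to express that rule, as a sketch: each dependency carries a flag for whether it is required and a latency budget, and only slow required dependencies make the service unavailable. The `Dependency` fields and the budget values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    name: str
    required: bool             # can we answer requests without it?
    recent_p95_seconds: float  # measured by the calling service
    budget_seconds: float      # share of the SLA this call may use


def slow_required(deps):
    """Names of required dependencies that are over their budget."""
    return [d.name for d in deps
            if d.required and d.recent_p95_seconds > d.budget_seconds]


def service_available(deps):
    """The service is available only if no required dependency is too slow."""
    return not slow_required(deps)
```

Optional dependencies are ignored on purpose: if the service can degrade gracefully without them, their slowness is not a reason to shed load.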
What? HTTP 503 (Service Unavailable)
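For illustration, a health endpoint that answers 503 could look like the WSGI sketch below. The `overloaded` flag stands in for whatever check the service actually runs, and the `Retry-After` value of 5 seconds is an arbitrary assumption.

```python
# Hypothetical flag; a real service would set it from its own measurements.
overloaded = {"value": False}


def health_app(environ, start_response):
    """WSGI health check: 200 when healthy, 503 when shedding load."""
    if overloaded["value"]:
        start_response(
            "503 Service Unavailable",
            [("Content-Type", "text/plain"), ("Retry-After", "5")],
        )
        return [b"overloaded\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]

# To try it locally with the standard library:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, health_app).serve_forever()
```

The load balancer only needs the status code; the body and `Retry-After` header are there for humans and well-behaved clients.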