Lightning Talk: Incidents are new normal - Kasia Balcerzak
blameless incident review
(1) gather facts
facts
events
everything related to incident
- logs
how our org behaved before, during, after the incident
not: When did you discover…
yes: How did you discover…
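
A minimal sketch of how the gathered facts could be recorded as a timeline; this is not from the talk. The names `TimelineEntry`, `build_timeline`, `by_phase` and all fields are hypothetical, chosen only to mirror the notes above (facts, events, logs, before/during/after, "how did you discover").

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class TimelineEntry:
    """One fact or event gathered for the review (hypothetical structure)."""
    at: datetime
    phase: str                # "before", "during", or "after" the incident
    what_happened: str        # the fact or event itself, stated without blame
    how_discovered: str = ""  # answers "how did you discover", not only "when"
    sources: List[str] = field(default_factory=list)  # e.g. log excerpts


def build_timeline(entries: List[TimelineEntry]) -> List[TimelineEntry]:
    """Return the gathered entries in chronological order."""
    return sorted(entries, key=lambda e: e.at)


def by_phase(entries: List[TimelineEntry]) -> Dict[str, List[TimelineEntry]]:
    """Group entries by phase: how the org behaved before, during, after."""
    grouped: Dict[str, List[TimelineEntry]] = {"before": [], "during": [], "after": []}
    for entry in build_timeline(entries):
        # Unknown phase labels get their own bucket instead of raising.
        grouped.setdefault(entry.phase, []).append(entry)
    return grouped
```

Keeping `how_discovered` as free text leaves room for the story of the discovery, which is what the "How did you discover…" question is after.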
causal factors: whatever led to the incident
or made the incident worse
causal factors by themselves are not a problem
incident = when causal factors are combined
e.g. “not enough time for testing”
Root Causes
anything that allowed a causal factor to happen + be ignored
serious incidents are a combination of factors
Don’t prevent things from going wrong!
Try to make things go right!
risk management: “Can we handle this when it fails?”
building a learning org is the only way to be proactive
failing things are normal
broken production is not normal
never waste a good incident to improve your org/product