Model: Blameless Postmortem
November 22nd, 2020
postmortem has 3 important jobs
explain what happened
commit to improvement
anomalies and root cause
we look for causes, and any anomaly either gets labeled as a root cause or a contributing factor
many times these anomalies are present during "ordinary" operations, too
we give them more weight than they deserve
anomalies are present all the time
learn from "near misses"
e.g. typing an incorrect command, but catching it before executing
how was it caught?
what safety net could have helped?
prevent it from doing harm
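
One possible safety net for the mistyped-command near miss is a confirmation step in front of risky commands. The sketch below is only an illustration, not something from the original notes: the DESTRUCTIVE set and the names needs_confirmation and guarded_run are hypothetical, and the confirm and run callbacks stand in for whatever prompt and executor are actually in use.

```python
import shlex

# Hypothetical set of command names treated as destructive (an assumption;
# adjust to the environment being protected).
DESTRUCTIVE = {"rm", "dd", "mkfs", "shutdown"}


def needs_confirmation(command: str) -> bool:
    """Return True if the first word of the command is in DESTRUCTIVE."""
    parts = shlex.split(command)
    return bool(parts) and parts[0] in DESTRUCTIVE


def guarded_run(command, confirm, run):
    """Run `command` via `run`, asking `confirm` first if it looks destructive.

    Returns True if the command ran, False if it was stopped before doing
    harm (a near miss worth writing down).
    """
    if needs_confirmation(command) and not confirm(command):
        return False
    run(command)
    return True
```

With an interactive prompt, confirm could be something like `lambda c: input(f"Run {c!r}? [y/N] ") == "y"`, and run could wrap `subprocess.run`. Recording each False result gives a simple log of near misses to review later.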