a postmortem has 3 important jobs

  1. explain what happened
  2. apologize
  3. commit to improvement

anomalies and root cause

we look for causes, and any anomaly gets labeled as either a root cause or a contributing factor
many times these anomalies are present during "ordinary" operations, too.
we give them more weight than they deserve
anomalies are present all the time

learn from “near misses”, e.g. typing an incorrect command but catching it before executing

  • how was it caught?
  • what safety net could have helped?
    • catch it
    • prevent it from doing harm
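
one way to picture such a safety net, as a rough sketch only: a wrapper that flags risky commands (catch it) and defaults to a dry run (prevent it from doing harm). everything here is an assumption for illustration, not something from the sources: the DESTRUCTIVE list and the names check_command and guarded_run are hypothetical.

```python
import shlex

# Hypothetical list of command names treated as destructive (illustrative only).
DESTRUCTIVE = {"rm", "dd", "mkfs", "shutdown"}


def check_command(command: str) -> str:
    """Return 'confirm' if the first token looks destructive, else 'allow'.

    Only the first token is inspected, so wrappers such as sudo are not
    seen through; this is a sketch, not a complete guard.
    """
    tokens = shlex.split(command)
    if not tokens:
        return "allow"
    return "confirm" if tokens[0] in DESTRUCTIVE else "allow"


def guarded_run(command, runner, confirm, dry_run=True):
    """Catch risky commands, and prevent harm by defaulting to a dry run.

    runner and confirm are caller-supplied callables (assumed here):
    runner executes the command, confirm asks a human and returns a bool.
    """
    # Catch it: risky commands need an explicit yes.
    if check_command(command) == "confirm" and not confirm(command):
        return "blocked"
    # Prevent harm: nothing actually runs unless dry_run is turned off.
    if dry_run:
        return "dry-run: " + command
    return runner(command)
```

a blocked command is itself a near miss worth writing down, together with how it was caught.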

(src: Notes: Beyond the Phoenix Project (src: Book: Release It! - Michael Nygard))