Many major companies hold post-mortem reviews for this kind of thing. Most of the big failures we see are a mix of people being rushed, detection processes failing, and a miscommunication or misunderstanding of the effects of a small change.
One analogy is rounding - a single rounding makes no difference to a transaction, but multiple systems rounding in the same direction can have a large-scale impact. It's not always rounding money - it can be error handling. A stops at the error, B goes on, and it turns out they're no longer in sync.
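To put rough numbers on that, here is a minimal Python sketch. Everything in it is made up for illustration: the amount (19.999), the transaction count, the choice of always rounding down, and the two toy "systems" A and B. It only shows how a sub-cent drift adds up and how two error-handling policies diverge on the same input.

```python
from decimal import Decimal, ROUND_DOWN

CENT = Decimal("0.01")


def round_down_to_cent(amount: Decimal) -> Decimal:
    """Always round toward zero, i.e. every rounding goes the same direction."""
    return amount.quantize(CENT, rounding=ROUND_DOWN)


# Hypothetical workload: one million transactions of 19.999 each.
exact_amount = Decimal("19.999")
transaction_count = 1_000_000

# Less than a cent is lost per transaction...
per_transaction_drift = round_down_to_cent(exact_amount) - exact_amount
# ...but the loss accumulates because it never changes sign.
total_drift = per_transaction_drift * transaction_count


def process_stop_on_error(records):
    """Toy system A: stops at the first bad record (None)."""
    processed = []
    for record in records:
        if record is None:
            break
        processed.append(record)
    return processed


def process_skip_errors(records):
    """Toy system B: skips bad records and keeps going."""
    return [record for record in records if record is not None]


batch = [1, 2, None, 4, 5]
seen_by_a = process_stop_on_error(batch)
seen_by_b = process_skip_errors(batch)

print(per_transaction_drift, total_drift)
print(seen_by_a, seen_by_b)
```

Neither function is wrong on its own. The trouble only shows up when you compare the totals, or compare what A and B each think they processed.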
Which guy is it? The person who pressed the button? The manager who gave that person more than one task that day? The people who didn't sufficiently test the detection process? The people who wrote the specs without sufficient understanding of the full impact? The person who decided, three months ago, to lay off the people who knew the impact?