Reply to comment:
Nick Hilliard •
Daniel, thanks for the candid update. Never nice to have to admit in public that you had 4 discretely identifiable problems which combined to cause a cascading sequence of failures, but it does happen from time to time. It's also unfortunate that they had such far-reaching consequences in this situation. I hope when you're reviewing the engineering build of this system that you can find ways of inserting fail-safe mechanisms so that if there are failures in future, the consequences will be less serious. Most experienced operators have had to deal with backup failures during critical systems failures. The only solution to this is test, test and test again as part of your DR management framework. This is as unimaginably tedious as it is necessary. Difficult to imagine how HR involvement would help things - I've never noticed that HR personnel had much skill in the area of engineering management. However, I have found that exposing the engineering underbelly of a failure like this to the court of peer scrutiny is an acutely honed means of ensuring that problems like this won't happen again. There are a lot of engineers in this community. We like to dissect things, including failure dissections. Peer analysis can be highly procedurally redemptive. I look forward to reading the final report on this incident.