[AR] fault recovery (was Re: shuttle SRBs...)

  • From: Henry Spencer <hspencer@xxxxxxxxxxxxx>
  • To: Arocket List <arocket@xxxxxxxxxxxxx>
  • Date: Thu, 8 Feb 2018 18:28:51 -0500 (EST)

On Thu, 8 Feb 2018, rebel without a job wrote:

> The ideal fault recovery system is one that can bring you immediately to a survivable, but mission sub-optimal, state with as little reference to the systems that failed as possible. As dumb as possible is good here, as the goals of a fault recovery system are radically simpler than those of a primary system.

Exactly. For emergency equipment in particular, "better is the enemy of good enough".

System cleverness that might save lives in unusual circumstances must always be balanced against how many lives might be *lost* because that cleverness misjudged the situation and did the wrong thing in other circumstances. Emergencies are almost by definition unusual cases, where things have gone wrong and seemingly-reasonable assumptions might not actually be valid. Past experience indicates strongly that cleverness here is all too likely to kill more people than it saves.
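
A minimal sketch of that trade-off, assuming a hypothetical vehicle whose only emergency response is a single abort action. Every name here (Action, Telemetry, dumb_recovery, clever_recovery) and the 10 km threshold are invented for illustration and do not come from any real flight system. The "clever" handler tries to choose the best response from primary-system telemetry, which is exactly the data that may be wrong during a fault; the "dumb" handler never looks at it.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    ABORT = "abort"  # hypothetical single safe action, chosen in advance


@dataclass
class Telemetry:
    # Hypothetical readings supplied by the primary system. During a real
    # fault, any of these may be stale or simply wrong.
    guidance_ok: bool
    estimated_altitude_m: float


def dumb_recovery(fault_detected: bool) -> Action:
    """Do one usually-good thing, with no reference to primary-system data."""
    return Action.ABORT if fault_detected else Action.CONTINUE


def clever_recovery(fault_detected: bool, t: Telemetry) -> Action:
    """Try to pick the 'best' response, trusting telemetry that may be wrong."""
    if not fault_detected:
        return Action.CONTINUE
    # Assumption baked in: these readings are still valid after the fault.
    if t.guidance_ok and t.estimated_altitude_m > 10_000:
        return Action.CONTINUE  # "we can probably still complete the mission"
    return Action.ABORT


# A failed sensor reports healthy guidance and a garbage altitude.
bad = Telemetry(guidance_ok=True, estimated_altitude_m=50_000.0)
print(dumb_recovery(True).value)         # abort
print(clever_recovery(True, bad).value)  # continue
```

With the corrupted readings, the clever handler talks itself into continuing, while the dumb one aborts regardless. The dumb version also has far fewer paths to test, which matters for code that runs only when things have already gone wrong.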

(By the way, this applies to humans as well as machinery. One reason why it is important to *train* for likely emergencies, and to use aids like written procedures and checklists, is that quickly doing one usually-good thing, carefully chosen in advance, is almost always better than trying to figure out exactly what's best for the particular situation. Not only does the figuring slow you down, but in an emergency, when all hell's breaking loose and you don't fully understand what's happening and your adrenalin level is through the roof, your snap decisions are often *wrong*.)

Henry
