« Things I Have Learned About Foam From the Columbia Accident Investigation Board Report    Three Men and a Container Full of Bath Toys »
08.29.2003

Physics 2, Business Administration 0

"When a program agrees to spend less money or accelerate a schedule beyond what the engineers and program managers think is reasonable, a small amount of overall risk is added. These little pieces of risk add up until managers are no longer aware of the total program risk, and are, in fact, gambling."
Columbia Accident Investigation Report, p. 139

One of the most sobering conclusions of the Shuttle accident report is that the Columbia was an exact replay of the Challenger - the same false confidence, the same scheduling and funding pressure, the same lack of attention to an intermittent problem whose causes were never understood. There's even the same badly-designed briefing slide, failing to convey the urgency the engineering team feels, and the same old Edward Tufte on hand to point it out, once the investigation gets into full swing.

NASA has no excuses this time. Not only is the organizational behavior identical to Challenger (the normalization of deviance, and an assumption that things are safe unless proven otherwise), but even the mechanics of the accident were familiar territory.

It turns out the doomed STS-107 mission had a Doppelgänger. In 1988, the Space Shuttle Atlantis was hit by debris dislodged from the top of a solid rocket booster, at almost exactly the same time in the launch sequence as Columbia. The debris crashed against the right side of the Orbiter's belly, gouging out over seven hundred gashes, three hundred of them over an inch long. The impact dislodged one tile completely. By sheer luck, the tile that fell off happened to sit on top of an unusually thick aluminum plate, part of an antenna housing. Atlantis made it back safely, suffering structural damage, but with its hull intact.

The difference in response between the 1988 and 2003 incidents tells you everything you need to know about NASA:

After the discovery of the debris strike on Flight Day Two of STS-27R [Atlantis], the crew was immediately directed to inspect the vehicle. More severe thermal damage - perhaps even a burn-through - may have occurred were it not for the aluminum plate at the site of the tile loss. Fourteen years later, when a debris strike was discovered on Flight Day Two of STS-107 [Columbia], Shuttle Program management declined to have the crew inspect the Orbiter for damage, declined to request on-orbit imaging, and ultimately discounted the possibility of a burn-through.
The reasons for this maddening complacency will be familiar to anyone who has worked in an organization where the suits face off against the geeks. Every computer programmer learns early on that it's counterproductive to show a software demo to managers - if the demo fails, the managers will be displeased, and if the demo is wonderful, they'll probably decide to ship it as-is. "Looks great - let's go with it!" Just try to explain that there's no error handling, or an intermittent bug in the code, or that clicking the 'help' button crashes the GUI. Engineers are trained to ask "what could possibly go wrong?" Managers are, too, but they use the phrase with a completely different intonation.

The debris strike on Atlantis was unprecedented, and came hard on the heels of the Challenger explosion. It was easy for managers to see eye-to-eye with the engineers, and recognize the gravity of the situation. But fourteen years and dozens of successful launches later, the memory of Challenger had receded. In its place was a hugely overambitious launch schedule, and all kinds of political pressure to get the International Space Station 'core complete' by an arbitrary 2004 deadline. And there was also the experience of Atlantis and other debris strikes, which by perverse MBA logic had become arguments for the harmlessness of foam impacts. It had happened so many times before, why start worrying about it now?

In an organization like NASA, there have to be safeguards to make sure managers can't overrule engineering decisions, or people will die. The board report cites the US Navy's Naval Reactors program and the Air Force's Aerospace program as good role models in this respect - both have managed to take the teeth out of hideously risky operations (shipboard nuclear reactors and military satellite launches, respectively) by rigorously separating oversight from management.

They've done this by turning their organization into a kind of geek sandwich.
There is a lower level of engineers to do the design and construction, a middle layer of management to handle budgets and administration, and a top level of oversight geeks with veto power, overseeing the whole enterprise. This setup is terrifying to managers - after all, the oversight geeks get a separate budget, and it's impossible (by design) for the managers to exert any pressure on the engineers. Naturally, it's frightfully expensive (unless you factor in the costs of a major disaster every few years), and a serious blow to the ego of your average suit, who believes that God made managers to have ultimate authority. Jack Welch would not approve. It will be interesting to see if NASA can get the funding and muster the self-discipline to pull such a transformation off. If they do, it will be a nice irony, since the Shuttle is the very embodiment of managers promising something engineers can't deliver.



Your Host

Maciej Cegłowski


Threat

Please ask permission before reprinting full-text posts or I will crush you.