Internal NASA Emails Reveal Atlantis Safety Debate

By Craig Covault
December 10, 2007

The following story appears online at aviationnow.com.

Internal NASA emails from the director of Shuttle Safety at the Johnson Space Center and the Shuttle Program Manager show how they struggled with the potential risk to astronauts’ lives in assessing how to proceed with the launch of Atlantis in the wake of engine cutoff (ECO) sensor malfunctions.

Aviation Week & Space Technology obtained copies of these emails, which are reproduced largely in full here on AviationWeek.com to retain the context intended by their authors.

The emails were sent to the team by astronaut Bill McArthur, Jr., who heads the Space Shuttle Safety and Mission Assurance Office at the Johnson Space Center, and Wayne Hale, shuttle program manager.

McArthur has launched into space three times on the shuttle and a fourth time atop a Russian Soyuz, and he also commanded the International Space Station.

The emails were sent Dec. 7. This was a day after the Atlantis countdown was scrubbed Dec. 6 when two ECO sensors failed and managers debated whether they could proceed Dec. 8 or needed more time.

They ultimately slipped the next attempt to Dec. 9 after setting new flight rules that 4 good ECO sensors would be required.

This was a change to the original Atlantis launch rule that said the vehicle could fly with 3 of 4.

The Dec. 9 attempt was then scrubbed when one ECO sensor failed in the countdown, a situation that, two days before, would have been “go” for launch.

The emails indicate that some of NASA’s highest managers believe that the design of the total system would enable safe launch of the shuttle, without use of any ECO sensor system at all.

The ECO sensor system is designed to protect against an oxygen-rich engine shutdown that would cause a catastrophic explosion.

Hale cites past flight data that indicates that could be very risky and that it is “likely that this system has been unreliable all along.”

He also says in the emails that “post flight reconstructions point to a few flights where it is possible that we were close” to using the ECO system.

“It seems to me likely that we have been flying the entire history of the program with a false sense of security and that we have never had reliable protection from LH2 (liquid hydrogen) low level cutoff,” says Hale. “That is a really sobering thought,” he says.

Hale adds at the end of his email that he is considering ordering other major reviews of space shuttle safety systems that could be subject to “smart failures” missed earlier by engineering reviews.

McArthur told the team that “to respond to yet more difficulties with the system by ‘giving up’ and declaring the system valueless appears inconsistent with NASA tradition.” “We fix things that don’t work,” he said emphatically.

The thoughts aired in the emails obtained by Aviation Week reveal some of the thinking that will guide how the program proceeds in what again is a new chapter in NASA’s review of shuttle flight safety philosophy after two accidents killed 14 astronauts.

This time there has been no accident.

The first step in the new process will be the review of ECO sensor test options by a previously scheduled shuttle Program Requirements Change Board (PRCB) at Kennedy Space Center Dec. 11. The board is likely to raise the option of a tanking test, with Atlantis fueled on Pad 39A and equipped with more instrumentation.

The email from McArthur to the team Dec. 7 reads:

“A preface stating what we all know: we expend significant effort developing LCC’s [Launch Commit Criteria] and FR’s [Flight Rules] in a dispassionate (sometimes), low pressure setting.

“The MMT [Mission Management Team] has clear authority to change these and has proven to be extremely effective in making good decisions. With a vehicle fueled on the pad, however, the standard for overriding consensus reached during prior deliberations must be extremely high,” says McArthur.

“I am skeptical that we will reach a consensus supporting flight with 0/4 ECO sensors within the next two days. Most at JSC heard direction today to “find rationale for flight” versus “is the risk of flying without the ECO sensor system acceptable?”. To me, this seems to be a huge leap.

“The 3/4 criteria has been extensively discussed and analyzed. To respond to yet more difficulties with the system by “giving up” and declaring the system valueless appears inconsistent with NASA tradition,” says McArthur.

“We fix things that don’t work,” he said.

“Perhaps this system doesn’t add value commensurate with the effort to make it work reliably. But we may be on a slippery slope pushing to resolve that question in such a short period of time.

“That we’ve never had an LH2 low-level cut-off is valuable data but not compelling in itself. Have we ever come close? We’ve never used [a] Crew Escape System, but there are no proposals to trade this equipment for upmass.

“Right now I’m far from being comfortable with flying in our current condition,” McArthur tells the team Dec. 7.

After seeing McArthur’s email, Hale circulated both McArthur’s and his own comments to NASA managers.

Hale first addresses two broader issues, including what he perceives as pressure by the news media. He then moves into technical tradeoffs:

1. “It is good to have strong advocates on both sides of complex issues to ensure that all factors are thoroughly examined. So whether today, next week, next month, or next year, I expect that there will still be many folks who will disagree with the decision — whatever that may be — on this topic,” says Hale.

2. “I am extraordinarily aware of the effects of the phenomenon termed launch fever. We are here at KSC or our home centers with a job to do, fly the shuttle safely and successfully, which thousands of folks have labored long and hard to get ready. Extraordinary effort has been made to get the vehicle to the launch pad on time. The payload customer has been waiting years to fly this complex and expensive laboratory. The media is clamoring for us to launch (most launch pressure comes from the media) with brazen headlines in the papers & etc.,” says Hale.

“If we had decided to override the LCC yesterday and press to launch, or if we had come to a hasty conclusion at the end of a long and tiring day yesterday, then I would agree that our judgment had been overly influenced by launch fever or schedule pressure.

“The fact that we stood down for 48 hours to consider our options, to cool down, to gather our facts, to discuss, debate, troubleshoot, review, etc., etc., etc., all of this is clear indication that the team is not allowing launch fever to overly influence us.”

“The proposal that the LH2 cutoff system is unreliable and we should disable it and fly without it is not new,” says Hale.

“There were discussions along these lines in some small parts of the community before the Columbia accident. This discussion came to a serious discussion at the first RTF [return to flight] tanking test in March of 05. At that time, I believed that such a decision would be premature. There is no doubt that a properly functioning LH2 cutoff system protects for some failure cases that would otherwise be catastrophic,” says Hale.

“It behooved the program to expend considerable resources to troubleshoot and perhaps restore functionality to this system, even though the likelihood of needing the LH2 cutoff system was statistically unlikely.

“Every time the ECO sensors have caused a launch scrub, the discussion of the usefulness or not of the LH2 system has been debated, and every time to date there was more work that could be done, so the decision to fly without them was always postponed.

“This is not a new idea. The question, really, is whether the time has come to consider it more thoroughly,” he says.

“So we have spent nearly 3 years doing everything we could think of.

“The sensors have been through extensive rework, retest, evaluation, etc., etc. Other than a redesign from scratch, we have now done everything to the sensors themselves that anybody can think of. We have run the sensor/swage improvement trail to its complete conclusion.

“Other than a major redesign effort that would probably require the rest of the program duration to test and qualify, there is nothing left to do to improve the sensors themselves.

“The Point Sensor boxes have been thoroughly examined, reworked, retested, reevaluated, and are in the best possible shape. Other than a complete redesign and requalification we have done everything there is to do to make the PSB work correctly. There is no credible suggestion that we have not implemented. There is no credible troubleshooting that we have not done. And in fact, the PSB is exonerated in today’s anomaly by the new instrumentation that we have put on the vehicle.

“The connectors and wiring have been exhaustively evaluated. Extensive work on the wiring has been done to demonstrate that we have good connections all the way through the system at ambient temperatures. All our ambient temperature troubleshooting has yielded nothing.

“Troubleshooting at cryo temperatures is by definition hazardous and we have limited ways to accomplish that. If there is a credible story that a new tanking test, perhaps with new drag on instrumentation for voltages, resistance, or temperature, would help, we probably will execute that test,” Hale says.

“However, I remain skeptical at this time since the case has not been presented in a coherent manner of what we would test, what data we would find, and how that might be useful,” says Hale.

“Much attention has centered on the LH2 tank feed through [electrical] connector. An extensive series of tests and evaluations is done on each connector.

“After the ET-120 tanking tests, the LH2 feed through connector was removed and subjected to many harsh tests. Nothing was found. The design is relatively simple and robust. What could be done about LH2 feed through connectors or how to better test them to find problems before actual tanking remains a mystery. Again, every credible suggestion has been exhausted.

“So, out of having completed everything that we can think of to do, we must come to the place where we have to consider what it means to fly with an unreliable system. Because it is unreliable. And because there are no more credible suggestions out there of how to make it reliable, either in the short term, or the longer term (less than 2.5 years).

“So why do we need an LH2 cutoff system. Simply put, if you need it and the LH2 tank runs dry with the engines at full power with LO2 still coming in, inevitably a catastrophe will occur,” says Hale.

“Liquid oxygen rich shutoffs are ugly in the extreme. This is a crit 1 situation. And it occurs so rapidly that human intervention is not practical,” he says.

“The system is biased toward a LOX shutdown. There have been three LOX shutdowns in the history of the program, the abort to orbit on STS-51-F caused by faulty SSME redline sensors that erroneously shut down a perfectly good engine, and two cases STS-78 and STS-93, where failures in the system caused off nominal performance in the engines.

“There has never been an LH2 shutdown, although post flight reconstructions point to a few flights where it is possible that we were close, and much is made of the LH2 cutoff sensors showing dry on a couple of flights post MECO [main engine cutoff]. I personally find this last evidence not compelling since the fluid remaining in the tank will slosh or rebound at MECO and it is likely that flashing dry would normally occur.

“For many years in the MCC [Mission Control Center], I was part of the team that practiced “engine performance cases”. That must be part of the review. However, it is clear that there are some cases that would result in subtle engine off nominal operations that would eat through the reserves (Flight Performance Reserve and Fuel Bias plus any benefit from Ascent Performance Margin and launching at the in-plane or optimum time), and would not be detectable by the MCC.

“These cases would be catastrophic,” says Hale.

“Part of the review we need to have before agreeing to launch without a functioning LH2 cutoff system is an examination of those cases, and their likelihood. If the likelihood of encountering one of these invariably catastrophic cases is of the same order as other accepted risks, say 1 in 300-ish level, then the program may accept that. But we need to review it,” says Hale.

“So, in my opinion, the program has avoided having the discussion of launching without LH2 cutoff system because until now there has always been a credible improvement/troubleshooting path that held hope of restoring the reliability of this system.

“In fact, the last several flights had lulled me and others into thinking we had accomplished our goal. Yesterday’s events proved that the system is unreliable. Having exhausted all the ways we can think of to make it reliable, it is now time to consider whether we can live without it,” he said.

“Final note. It is likely that this system has been unreliable all along,” Hale concluded.

“It seems to me likely that we have been flying the entire history of the program with a false sense of security and that we have never had reliable protection from LH2 low level cutoff. That is a really sobering thought,” he says.

“I am considering issuing actions in other areas where we have safety systems that have never been exercised to see if we can better test and make sure they are functional to prove that we have not been fooled by some smart failures,” says Hale.

“Let’s take our time and consider this “old” proposal one more time,” he concludes.

Copyright 2007 Aviation Week and Space Technology.
