Letter to the NASA Space Shuttle Team From Wayne Hale on Risk
This week, while most of you will be taking a day to study and think of ways to be safer, I will be in Cleveland attending the NASA Risk Management Conference. It is my intention to understand this process better and apply it in program decision making as my personal contribution to improved safety in the Space Shuttle Program.
Many of us are struggling with the concept of accepting risk – how much is too much, how much is an inherent part of what we do – and I would ask that each of you give this topic some consideration during your Safety Day activities. The following are some of my thoughts on this subject – not direction nor even an exhaustive process on how to come to the best answer – but just some thoughts based on my 26 years of shuttle experience. I can only write to you about what I know, and you must know by now that when I write to you it is from the heart.
Much of my experience comes from the 15 years I spent as a Flight Director. The Entry Flight Director is assigned the deorbit decision. After the payload bay doors are closed, the burn targets loaded in the computer, all the systems checked, and the last weather report is in, the clock counts down to ignition and the crew waits to hear the Flight Director’s decision: Go or No-Go. The Flight Control Room is always silent. Usually the orbiter is in flawless condition, rarely there is a minor systems problem, nothing significant. But there is always the weather forecast. The weather forecast – for a precise place at a precise time just about 2 hours in the future. The orbiter flies like a brick with handling qualities that would make a Mack truck proud. The commander has one shot at landing; there are no do-overs. It’s the Flight Director’s duty to make sure that one shot is a good shot.
Funny thing, it is never a black or white decision; it is always gray. There are always concerns, the chance for an adverse change, indicators of what might go wrong. When you look at weather under a microscope, it is never perfect. In fact, the harder you look the more little counter-indicators you see. The real question is not whether the weather will be perfect, but will it be good enough. The indicators may be gray but the decision is black and white. Binary. Go or No-Go. And if the decision is No-Go, everybody knows the shuttle can’t go around forever. We will be back to make the same decision tomorrow. So the managers in the viewing room watch and wait and second guess. The flight controllers strain to listen for an indication of what the answer will be. The convoy team fidgets. One person must make a decision, give the word for the record, live with the consequences.
I have given the Go 28 times. Every time was the toughest thing I have ever done. And I have never ever been 100% certain, it has always been gray, never a sure thing. But the team needs to have confidence that the decision was good. It is almost a requirement to speak the words much bolder than you feel, like it is an easy call. Then you pray that you were right.
We have done everything practical to mitigate the risk. The meteorologists train every day making real forecasts at the landing sites and checking two hours later to see if they were right. Their statistics are impressive: if they forecast a Go, it turns out to be Go about 97% of the time. But there is that 3% to consider. The weather rules are reasonably conservative, based on conditions that the astronauts have been able to handle in the simulators, in the shuttle training aircraft. But we cannot simulate the transition from zero gravity to one-g; and the stress of doing it for real with no chance to do it again, and with the whole world watching; all that is tough to get past. Even in the aerodynamics, proving that the machine is controllable, and in structural loads, how well the vehicle will hold together, there are limits to the uncertainties that are acceptable. If we have analyzed wrong, or the aerodynamics exceeds what we expect, . . . well.
When our predecessors invented the shuttle, based on their aircraft test experience and previous space programs, they set up a standard that everything should work properly in the face of 3 sigma environmental deviations and 3 sigma systems dispersions. Any basic statistics course will tell you that mean + 3 sigma covers 99.7% of the cases. But there are 3 chances in 1000 that aren’t covered. Why not? Because to try to cover everything – worst on worst on worst – would require a vehicle design that probably is too heavy to get off the ground, and would require a set of proof testing that would take a lifetime to accomplish, and would cost, well, way more than we can afford. So inherently there is risk in using this system. And don’t forget, that is the risk that we understand, that we have designed against, that we have good numbers for. What we don’t recognize or cannot quantify is out there as an “unknown unknown”.
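As a rough check on the 3 sigma figure, here is a minimal sketch. It assumes each dispersion follows a normal (Gaussian) distribution and that the band is two-sided, meaning within 3 sigma on either side of the mean. Neither assumption is spelled out in this letter, and the function name is only illustrative.

```python
import math


def normal_coverage(k: float) -> float:
    """Probability that a normally distributed value lies within k sigma of its mean."""
    # For a standard normal Z, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2.0))


coverage_3_sigma = normal_coverage(3.0)
uncovered_per_1000 = (1.0 - coverage_3_sigma) * 1000.0

print(f"within 3 sigma: {coverage_3_sigma:.4%}")
print(f"outside 3 sigma: about {uncovered_per_1000:.1f} cases per 1000")
```

Under those assumptions the uncovered share comes to a little under 3 cases in 1000, which is the residual risk described above. It says nothing about the unknown unknowns.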
There is nobody in Shuttle Program management or Agency management who has any delusion that we can reach perfection. Our collective job is to understand the risk, mitigate it as much as possible, communicate accurately all round about the risk remaining, and then decide if we can go on with that risk.
Right now we are working through many issues that are not black and white. There are many options, many shades of gray. There is always a debate about whether we have done enough, whether we have done too much, whether it is good enough. It has always been part of the engineer’s job to determine when enough has been done; not to overdo or to make it so conservative that it takes forever or is impractically uneconomical, or too heavy to get off the ground. Knowing when we have done enough is the art of engineering.
I have heard a well respected, retired senior NASA official speak on several occasions lately. One of his themes is that we are in an inherently risky business, but accepting risk does not mean not testing or not doing analysis. That is not risk acceptance, that is gambling. The real art is knowing when the testing is adequate and it is time to decide and move on. The newly reinvigorated engineering culture at NASA is to return to our roots and make decisions based on knowing — not perfectly knowing, but adequately knowing — what we face.
Every day people make little Go/No-Go decisions. The chief of the wind tunnel team signs off on the latest test data as complete and accurate. He has to decide if there have been enough runs to validate the data. Once he does, we use that data to make critical decisions. An engineer, knowing that there is never unlimited time nor unlimited resources, makes a decision about how many tests it takes to prove a new design. Another Go/No-Go decision is made.
When the OPF technician stamps the work document that the bolt has been tightened to its torque specification, a Go has been given. When the torque specification was written into the work document and the tech writer signed the document, the signature says Go. When the engineer who designed and tested the part calculates the proper torque value and signs the drawing, there is another Go. The engineer, the writer, and the tech may never meet face to face, but they have to trust each other that each one has done his job right. The Shuttle program is a big organization; there are over 20,000 of us. These days the whole agency is working to help us return to flight. It is impossible to know everyone, but we have to trust that everyone is doing their job right, making sure no mistakes have been made anywhere. Your signature or stamp will be rolled up into your manager’s signature on the Certification of Flight Readiness. Next spring, Bill Readdy, the chairman of the Flight Readiness Review, will read the poll, and all the managers will say Go. Their Go is based entirely on what decisions you have made.
There are jobs in the world where the calls don’t have much consequence. Nobody in this agency has one of those inconsequential jobs. It may not seem that a financial call on a budget line item could be a Go/No-Go decision. But frequently that becomes the critical decision that determines the difference between success and failure. It may not seem that a personnel action could be a Go/No-Go decision. But having the right person in the right place with the right training and experience is paramount. These are critical decisions in our profession, equal to the more obvious engineering decisions.
And behind every decision, everyone knows that we have neither the time nor the resources to do anything that is not absolutely critical to the safe return to flight. We have no time for fluff, no resources for nice-to-haves. Choices have to be made daily, between what must be done, what has to be done, and what can be eliminated because it is not required to be done.
The Flight Directors and CapComs always visit the crew in quarantine just after the last sim, just before the crew flies to the Cape. We have trained together, sat in hours of meetings next to each other, laughed at each other, gotten angry with each other, and have undergone the great testing of simulated flight with each other. At the meeting, we cover the last minute changes and reminders, check on the wakeup calls and when to send the morning mail, and make bad jokes. At the end, we have the Ritual of The Handshake. Everybody has to shake everybody else’s hand before we leave. We look each other in the eye and say ‘Good luck’. They always say ‘We’re looking forward to a great flight’. Nobody ever talks about . . . you know.
But we all know.
There is risk about to be taken, serious risk that can have ultimate consequences. Humankind collectively does not know enough to scientifically drive the risk of space flight to zero. A hundred years would not provide enough time for all of us working together to positively eliminate any risk. Ten thousand small decisions throughout the preparation for the flight have been made, each with underlying risk calculations, and that total risk has accumulated and been communicated upward. Everybody has done their best to make it perfect, but there is a limit to what can be done. That is what we know. And we also know that the risk of not going is infinitely worse; the consequences would be worse if we didn’t try than if we try and fail.
Sometimes it is exquisitely clear when you are having a Significant Conversation. After we close the quarantine door and walk to the parking lot there is never any conversation. It is always a silent walk. I’ve had that walk 40 times. Flight Directors know too much about the risks.
Senator McCain has written a book entitled “Why Courage Matters”. You may not agree with his politics, but the senator’s credentials concerning courage are beyond dispute. He says that we have watered down the meaning of courage. An athlete’s prowess on the field of play is not courage, he says. Suffering an illness or injury without complaint is not courage. Being outspoken in a culture of silent acquiescence to certain wrongs is not courage. These are all evidence of virtue, the senator argues, but they are not examples of courage. The former POW defines courage as acts that risk life and limb to uphold a virtue. And he quotes Martin Luther King, Jr.: “If a man hasn’t discovered something he will die for, he isn’t fit to live.”
Everybody knows that there are ultimate risks in space flight. Some among us believe so strongly in the benefits that they put their lives on the line. Others of us believe so strongly that we do something harder to live with: we send our colleagues into danger. Why should we do it? Because the consequences of not taking the risk are unthinkable. The choice of turning back and giving up would affect the rest of history in ways that are immeasurable. Somebody recently said that what we are engaged in is like high stakes poker. That comment trivializes space flight to a parlor game where the only risk is money or pride or career or other cheap consideration. To push back the frontier incurs a price that sometimes must be paid in a currency more dear than mere dollars. It takes courage.
It was Christmas break and the parking lot at JSC was almost deserted. After 15 years as a Flight Director, I was days away from moving to my new job at KSC and had come in to finish up loose ends in my old Building 4 office. It was dusk as I walked out into the nearly empty parking lot. K. C. was leaving work too. She greeted me with that megawatt smile she always had. I asked if she was ready to go fly. Her response was enthusiastic: Yes, a couple of weeks to launch and her crew is trained and anxious to fly after long months of delay. Our brief conversation consisted of only happy words. We didn’t talk about risk or danger, only of the rewards to be expected from a successful flight. I wished her good luck and turned for my car and drove home. I didn’t give the conversation any thought until the first day of February.
Sometimes, you don’t know when you are having a Significant Conversation. I know now that K. C. did not understand all the infinite detail of risk that lay ahead of her; clearly none of us did. But I can say without a doubt that she felt what was to be accomplished outweighed the risk that she understood, outweighed it by a lot.
Recently a reporter asked if it would be difficult for me, as chairman of the MMT, to give Mike Leinbach, the Launch Director, the Go For Launch at T-9 minutes. I told him no, that by launch day our procedures and processes would be well polished, the decision criteria all agreed to and documented, and all the really difficult decisions would be behind us. We would just be executing from the checklist and the final Go would be a matter of making sure all the squares were checked. It would be easy.
After thinking about that for a few days, I realized that that answer is, of course, a lie: under the microscope, nothing looks perfect, and the call will be hard because . . . you know. Life is full of gray choices. Deciding the work completed is good enough because more will not make it perfect. Ten thousand gray choices; doing what we must do, and not a bit more because that would take away from other work that is absolutely critical to be done right. When we have done what we can do, when we have driven the risk to the lowest practical level where it can be driven, then we have to accept the fact that it is time to make a decision and move on. Because history is waiting for us. But history will not wait forever, and it will judge us mercilessly if we fail to face tough choices and move ahead.
During the countdown, Steve Altemus, the launch NTD, will give many folks the challenge: Say Go or No-Go. You need to imagine Steve standing at your elbow each day asking you that question because it all rolls up. Each of us has a part. Nobody can be sloppy or careless. Nobody can take forever trying to get it perfect, blocked by indecision or the fear of making a decision. Nobody in our business gets an easy choice. Yours will be a gray decision, too. We owe it to some courageous people to get it right. Don’t waste your time on things that don’t count. Focus on what must be done and do it right and then move on to the next problem to solve. There will be some risk that we cannot control, that we cannot solve, that we cannot eliminate. That risk, we will have to accept. If we have done our job right, it will be worth it.
At the end of the countdown, Mike Leinbach will wish Eileen and her crew ‘Good Luck.’
You will know what that conversation means. That will be significant.