Safety Messages

Lessons From Challenger

January 04, 2021

This Jan. 28 marks the 35th anniversary of the Challenger accident. The loss of the crew was a tragedy felt by their families, friends and coworkers at the agency, as well as people throughout the world.

The Challenger accident taught us tough lessons and brought forward what have become recognizable phrases: normalization of deviance, organizational silence and silent safety program. Sadly, we learned these lessons again in 2003 with the loss of Columbia and her crew. This shows how vital it is that we pause to revisit these lessons and never let them be forgotten. We cannot become complacent.

In this month's Safety Message, Harmony Myers, director of the NASA Safety Center, discusses the Challenger accident and the lessons it continues to teach us today.

Presentation

Reminders to Keep You Safe

Welcome to the Office of Safety and Mission Assurance Safety Message archive. This page contains Safety Message presentations and related media. While some of these presentations are not NASA related, all of them have certain aspects that are applicable to NASA. I encourage you to disseminate these to your organizations to promote discussion of these issues and possible solutions.

—W. Russ DeLoach, Chief, Safety and Mission Assurance

Deadly Efficiency

American Airlines Flight 191

July 11, 2010

When American Airlines Flight 191 began its takeoff run at the start of Memorial Day weekend 1979, everything seemed normal. As it had done on so many previous flights, the McDonnell Douglas DC-10 roared across Chicago O’Hare’s Runway 32, bound for Los Angeles. This time, however, just as the plane became airborne, disaster struck. The left wing-mounted engine tore away from the aircraft and hurtled to the ground, rupturing hydraulic and electrical lines in the process. For a few brief seconds, the aircraft seemed to climb normally despite the damage, but then, to the horror of hundreds of onlookers, the plane entered an uncontrollable roll to the left. Seconds later, with wings perpendicular to the horizon, the aircraft plummeted into a field less than a mile from the runway. This tragedy occurred because a change to a manufacturer’s maintenance procedure, made to improve cost-effectiveness, damaged the aircraft’s structure; a design choice elsewhere then left the aircraft uncontrollable in this unlikely but real scenario. Two hundred seventy-three people lost their lives that day; their memory has been honored through improved maintenance standards, exhaustive design processes and strong communications across the industry.

Case Study Presentation

Tragic Tangle

Soyuz 1

June 01, 2010

Facing extreme political pressure to regain dominance in the Space Race, the Soviet Union launched the Soyuz-1 spacecraft in April of 1967 as the initial phase of an elaborate spacewalk demonstration. Tragically, rather than bringing prominence to the Soviet space regime, the mission became a sequence of failures. Manufacturing flaws doomed the vehicle from the outset, and Soyuz-1 ended in the first in-flight fatality of space exploration. The circumstances from which this mission originated, where schedule pressures loomed so large that mission success superseded crew safety, provide many lessons applicable years after the Space Race. Although times and environments have changed since the Soyuz-1 incident, the constants of external pressure, uncertainty and risk live on. A historical perspective of Soyuz-1 shows that our assumptions when managing risk can mean the difference between mission success and failure.

Case Study Presentation

Head-On Collision

Large Hadron Collider

May 02, 2010

Geneva, Switzerland, is the home of the Large Hadron Collider (LHC), the world’s most powerful particle accelerator. More than 10 billion dollars were spent on its design and construction in the hope that data from the LHC experiments would forge new pathways in our understanding of physics. On March 27, 2007, 12 years after it was approved for construction, scientists put the LHC through the final stages of pressure testing and encountered a serious failure: a support structure tore loose and lifted one of the 35-ton magnets from its base, spewing helium gas into the LHC tunnel. Investigators found that this costly mishap was due to a mere calculation error, a consequence of disparate development approaches, unfocused training programs and delayed performance specifications. This incident highlights the fact that standards in the design process and well-conducted, documented reviews are critical to an organization’s success.

Case Study Presentation

Mission to Mars

Mars Observer

April 01, 2010

With the successful launch of Mars Observer, NASA once again set its sights on the Red Planet, a decade and a half after the successful Viking program. Mars Observer, designed to study the geosciences and climate of Mars, was the first mission of the Planetary Observer series, which was envisioned as a series of low-cost missions to the inner solar system. Unfortunately, it also turned out to be the last. After an eleven-month journey through space, NASA lost contact with the spacecraft only three days before the scheduled orbital insertion. With no physical or telemetry evidence to investigate, the Mission Failure Investigation Board faced a significant challenge. What they eventually learned were valuable lessons in the importance of testing, the potential consequences of making tradeoff decisions and the absolute necessity of a functional risk management process.

Case Study Presentation

Island Fever

Three Mile Island

March 07, 2010

On March 28, 1979, a nuclear power plant reacted in a manner that was incomprehensible to plant operators. Widely considered to be the most significant nuclear power plant accident in the United States, the Three Mile Island accident began as a simple mechanical failure: a failed water pump. How could such a seemingly insignificant event lead to a near meltdown? After a number of state and federal investigations, the answer became clear: the simple mechanical failure was only the first of a series of failures, which were made worse by poor decisions based on misinformation. This month’s focus is that an accident of this kind should not have come as a surprise. As you read, note the similarities between the complex systems of a nuclear power plant and the tightly coupled systems we use at NASA.

Case Study Presentation

Wire to Wire

Swissair 111 Crash

February 07, 2010

On Sept. 2, 1998, the cockpit crew of Swissair 111 smelled smoke during a seemingly normal trans-oceanic flight from New York to Geneva. Over the course of just 21 minutes, an inaccessible onboard fire intensified to cause system failures that were non-recoverable, and ultimately caused the plane to crash into the Atlantic Ocean, just south of Nova Scotia. All 229 people on board lost their lives. Five years later, Canada's Transportation Safety Board found that engineering defenses of materials selection, cabin design and wiring placement had lacked sufficient testing. Further, administrative defenses such as government oversight of standards and cockpit procedures had not prevented fire hazards from appearing in a location thought to pose minimal fire risk. Safety-by-design faces challenges when new subsystems with new functions are added later (in this case a complex inflight entertainment system), and adherence to safety requirements must be entrusted to a far-flung network of people. Today, NASA faces real challenges with respect to gathering and implementing human rating requirements for new space hardware systems built in-house and by commercial vendors. This story just hints at the problems to be faced.

Case Study Presentation

Down to the Wire

Freedom Star SRB Recovery

January 01, 2010

During a Solid Rocket Booster recovery mission, a mishap occurred on the retrieval ship MV Freedom Star after a tow wire jumped out of the tow chute. Although a crew member was seriously injured, the outcome of this incident could have been much worse. As you will see, this case study highlights the importance of identifying and adhering to safety controls, and also looks at the unintended consequences of failing to manage change within a high-energy system. As is the case in most accident investigations, it was fairly easy to determine how this incident happened. As an organization, though, we’re more interested in the conditions that allowed the incident to happen. As you read, notice how a thorough safety analysis and hazard identification might have changed this outcome. Additionally, look at the effects of taking on multiple monitoring, operational and decision-making tasks, and how cumulative fatigue (something we all may suffer from, from time to time) may degrade human performance.

Case Study Presentation

Fire From Ice

Valero Refinery Fire

December 07, 2009

"If you don’t use it for a year, toss it out." Forgotten items stored in closets, basements, attics can be easily recycled over time. But what if you don’t realize that a dormant system in the house itself hides a safety threat? Valero’s McKee Refinery Propane Deasphalting (PDA) Unit posed a hazard that no one saw coming. A control station in the PDA unit was shut down in 1992, and rather than remove or purge the idle subsection, the refinery simply closed the valves around the section and left it in place for 15 years. Cold February 2007 temperatures froze trapped water in the unused propane piping, cracking it. Leaking propane quickly ignited and set the refinery ablaze. Responders battled the intense fire but had to evacuate when flames engulfed local propane shut-off valves, defeating efforts to isolate the fire. Aging NASA facilities bear silent witness to past science and engineering achievements spanning two centuries; some infrastructure awaits renewal, some awaits demolition. Conversely, new designs for ambitious missions far from Earth must be able to withstand and adapt to unplanned events. In both cases, the capacity to detect and correct for emerging threats will spell the difference between loss or sustainment, failure or achievement. Read this month’s Case Study and consider latent sources of violent energy release, and our control over them.

Case Study Presentation

The Tour Not Taken

Comet Nucleus Tour (CONTOUR)

November 01, 2009

No one knows for certain why the signal from the Comet Nucleus Tour (CONTOUR) spacecraft disappeared on Aug. 15, 2002. CONTOUR was a Discovery Program mission that was left almost entirely in the hands of NASA contractors and academic partners. It was a mission lost because of poor communication, misguided assumptions and models that were not faithful to reality. The project team had no experience with the Solid Rocket Motor (SRM) they used, and they relied on the SRM manufacturer and a consultant for expertise. The team also vested much in the strong success record of the particular model they used, but previous designs did not match CONTOUR's specifications. In this study of CONTOUR, consider whether your project is leaning on insufficient supports or depending too heavily on past performance.

Case Study Presentation

A Half-Inch to Failure

Minneapolis Bridge Collapse

September 01, 2009

When a major highway bridge in Minneapolis, Minnesota, fell on Aug. 1, 2007, the nation was shocked. Speculation about the cause of the collapse was widespread in the following days. Was it metal fatigue? Corroded steel roller bearings? A terrorist attack? Fifteen months after the disaster, the National Transportation Safety Board released an extensive investigation report. The proximate cause (gusset plate failure) surprised engineers and a public familiar with other bridge failure modes. The riveted metal plates connecting structural beams were assumed to be stronger than the beams themselves, but were in fact too thin to withstand years of ever-increasing loads. NASA may not build many bridges, but the I-35 story teaches us that hidden hazards can lurk within aging structures.

Case Study Presentation
