Safety Messages

Monthly Reminders to Keep You Safe

Welcome to the Office of Safety and Mission Assurance (OSMA) Safety Message archive. This page contains the monthly Safety Message presentations and related media presented by OSMA. While some of these presentations are not NASA related, all of them have certain aspects that are applicable to NASA. I encourage you to disseminate these to your organizations to promote discussion of these issues and possible solutions.

—Terry Wilcutt, Chief, Safety and Mission Assurance

Smoking Is Cool, Right?

October 02, 2017

Even before NACA became NASA, scientists had confirmed and published evidence of the lung cancer risks associated with cigarette smoking. Yet nicotine's powerful, addictive chemistry continues to fuel a multibillion-dollar American habit that costs lives.

It is up to each individual: No one else can supply the necessary willpower to quit. Each of us can choose to summon the willpower, or cheer on others, to achieve freedom from the life-shortening effects of smoking.

This month, Dr. Vince Michaud refreshes our recollection of the risks while also providing insight into the effectiveness of various approaches to stop smoking. 

Skin Cancer Prevention and Screening

A Life-Saving Survey

June 05, 2017

Each year in the U.S., over 5.4 million cases of non-melanoma skin cancer are treated in more than 3.3 million people. There are more new cases of skin cancer than breast, prostate, lung and colon cancer combined. One in five Americans will develop skin cancer over the course of a lifetime. Melanoma, the most lethal form of skin cancer, can develop from less dangerous types of skin cancer. Although simple screening exams can catch this process before it begins, one person dies of melanoma every hour. We are all vulnerable.

As a member of the National Council on Skin Cancer Prevention, NASA has been working to lower these statistics. In its first year on the council, NASA was awarded the American Academy of Dermatology's Golden Triangle Award for our agencywide skin cancer prevention efforts.

Please take time to view this message from NASA's Chief Health and Medical Officer Dr. J.D. Polk and remember to set up a screening exam during your next physical. It’s quick and simple and could possibly save your life.

Human Factors and Risk Management

Safety Message

May 01, 2017

Often, when stories of accidents appear in the news, speculation begins that the hapless operator “made a mistake.” With this assumption, many move on, satisfied the cause has been found — operator error.

A surprising number of industry and government safety programs focus on carrot-and-stick efforts to transform people into reliable performers. However, systems thinking tells us that an error is a symptom of a system that needs change. People generally strive to do the right thing and get the task done efficiently; in hindsight, people apply the error label to mishap-related actions and decisions since they have gained precise knowledge of the bad outcome. What was the context for causal actions or decisions — was it unsafe or confusing conditions?

Finding or compensating for errors is just the beginning of pursuing a change. Why did the error occur? Why did the decision seem to be a good idea at the time? Discovering the factors that shape performance and improving the system or interface will reliably reduce safety risk, compared to counseling employees and placing them back into an unsafe system.

The Dangers of Distracted Driving

March 06, 2017

As a society, we have become reliant on technology both at work and at home. Many of us keep our cell phones on-hand around the clock, believing an instantaneous response is not only wanted, but expected from friends, family and colleagues. Although we all feel inclined to meet these social expectations, it’s crucial that we adjust our behavior when driving and put down or turn off our cell phones until we reach our destination.

NASA policy prohibits use of hand-held devices while driving on NASA property or operating a NASA vehicle. In an attempt to honor this policy and be safer drivers in general, we often turn to hands-free devices; however, research shows that hands-free doesn’t mean risk-free.

I urge you to not use your cell phone while driving, even with hands-free technology. That phone call or text message can wait — your life is more important. 

Apollo 1: Lessons and Legacies

February 06, 2017

This year marks the 50th anniversary of the Apollo 1 fire and the tragic loss of Gus Grissom, Ed White and Roger Chaffee.

I strongly believe that we need to regularly look back at our mishaps and revisit their lessons. Your review of the organizational causes that were common in all three of our major mishaps is our best insurance against repeating those painful mistakes.

I encourage you to take time, on this anniversary, to remember the lives we lost and recall what we learned as we moved forward with human spaceflight. Using 30 minutes of your next staff meeting or program control board to discuss this would be a fitting tribute to those who gave their lives to our nation’s space program.

Lessons Learned From Apollo, Challenger and Columbia

January 09, 2017

As we approach our Day of Remembrance for the Apollo, Challenger and Columbia mishaps, it's important to recognize that we fell into persistent, systemic behaviors over the decades separating each tragedy.

How do we set precedents today upon which to base better decisions in the months and years to come?

These hard-earned lessons, distilled into rules by Space Shuttle Program Manager Wayne Hale, point us in the right direction moving forward.

Safety of Small Unmanned Aircraft Systems Operations at NASA

Coordinating Requirements for Operation within the National Airspace System

December 01, 2016

It should come as no surprise that small Unmanned Aircraft Systems (sUAS) or “drones” are becoming more popular as NASA research platforms. Their use also has been on the rise in other arenas such as in facilities, protective services and construction. The potential applications of these vehicles are endless.

In August, the Federal Aviation Administration released new regulations that address the intense national interest in civil sUAS operations and provide guidelines for safe operation. Within NASA, NPR 7900.3C governs sUAS usage and recently was augmented to address the heavy increase in sUAS operations within the agency.

If your organization is considering applications that involve sUAS, it is a good idea to partner with your center’s Aircraft Flight Operations Office for advice and guidance during planning and procurement actions. Please take the time to review the information in this month’s Safety Message, provided by the Aircraft Management Division, which oversees all sUAS applications at NASA.

Mining Your Federal Employee Viewpoint Survey for Weak Signals

November 22, 2016

In the U.S. Marine Corps, all leaders are asked to do two things: 1) accomplish the mission and 2) take care of your people. Usually, this is followed with “If you do No. 2, your people will take care of No. 1.”

There are a lot of things that fall under “taking care of your people.” Some of the more obvious ones are building unit cohesiveness, providing training and development at all levels, ensuring safe and adequate working spaces, and ensuring your people have the tools and equipment necessary for mission accomplishment.

Another critical part of taking care of your people is establishing a positive “command climate.” One definition of command climate is what life is like within the organization. It is the culture of the unit, the way it conducts its business. The leader of the organization is solely responsible for its command climate. This responsibility includes ensuring capable and competent management exists at all levels within the organization.

The Federal Employee Viewpoint Survey (FEVS) offers senior leadership insight into both the performance of individual managers within the organization and the unit’s command climate. Question 17 in the FEVS asks if employees feel they can report an issue without fear of retribution. For NASA, the best place to work in the federal government, the percentage of positive responses to that question is approximately 80 percent. We can interpret that as one out of every five employees telling senior management, through this survey, that the climate in his or her unit needs to be improved.

This particular issue is critically important to NASA because of the difficult and challenging nature of our missions. It is vital that managers are aware of any issues so they can evaluate the associated risk to people and missions. A command climate that didn’t encourage or tolerate people bringing up issues played a role in both shuttle mishaps.

I encourage you to mine your FEVS for information on the command climate throughout your organization. Question 17 is a good place to start. As the Aerospace Safety Advisory Panel once reminded us, “It shouldn’t take an act of courage to raise an issue.”
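As an illustration of what such mining might look like in practice, here is a minimal sketch that flags units whose negative-response rate on Question 17 crosses a threshold. The unit names, response counts and 20 percent threshold are all invented for the example; they are not drawn from actual FEVS results.

```python
# Hypothetical sketch: flag organizations whose negative-response rate
# on FEVS Question 17 may signal a command-climate concern.
# All names and numbers below are invented for illustration.

def q17_negative_rate(positive, total):
    """Fraction of respondents who did NOT answer Question 17 positively."""
    return 1 - positive / total

# (organization, positive responses, total responses) -- example data only
responses = [
    ("Unit A", 180, 200),   # 10% negative
    ("Unit B", 150, 200),   # 25% negative
    ("Unit C", 164, 205),   # 20% negative (not above threshold)
]

THRESHOLD = 0.20  # roughly the agencywide "one in five" figure cited above

# Units exceeding the threshold warrant a closer, human look -- the
# number is a weak signal to investigate, not a verdict on the unit.
flagged = [org for org, pos, tot in responses
           if q17_negative_rate(pos, tot) > THRESHOLD]

print(flagged)
```

The point of the sketch is the interpretation step: an 80 percent positive rate is the same fact as a 20 percent negative rate, and it is the negative fifth that carries the weak signal.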

Common Threads Among Catastrophic Mishaps

Lessons Not Learned, Workmanship Shortcomings, Process Control Failures, Failures to Control Critical Material Items and Fraud

August 01, 2016

When we look back at major mishaps in government and industry, it's natural to look for common themes. These themes can be extremely revealing and relevant in assessing the health of NASA’s current operations.

For this month's message, Office of Safety and Mission Assurance Technical Fellow for Quality Engineering Brian Hughitt evaluated catastrophic mishaps from various domains in order to identify common Quality themes. Four predominant themes calling for heightened attention emerged: ineffective corrective action in response to precursor events, fraud and unethical behavior, workmanship shortcomings, and material control inadequacies.

The Balance Zone

Preventing Slips, Trips and Falls

July 11, 2016

Ever since centers began recording slips, trips and falls, these incidents have remained the single most frequent cause of injury at NASA. Across the United States, slips, trips and falls are the leading cause of emergency room visits with more than 8 million cases per year, and, perhaps more surprisingly, they are the second leading cause of accidental deaths. Injuries sustained from a slip, trip or fall can lead to a lifetime of pain, not to mention the medical costs and lost productivity.

Kennedy Space Center (KSC) has developed a novel, highly effective approach to prevent injuries and close calls caused by slips, trips and falls, regardless of external hazards. The approach augments traditional efforts to maintain safe work areas and walking surfaces with specially designed balance workout areas called “Balance Zones.” Backed by classroom training and guest speakers, Balance Zones offer a new and interesting way for KSC employees to improve their overall balance and reaction to hazards.

Sharing these types of innovative initiatives across all NASA centers is another way we can work together to reduce injuries from slips, trips and falls.

The NASA Safety Center (NSC) has produced two campaigns to help raise agency awareness of slips, trips and falls, including an article and videos highlighting KSC's outstanding efforts. See these materials on the NSC's Slips, Trips and Falls page.

Skin Cancer Prevention and Screening

A Life-Saving Survey

June 06, 2016

Each year in the U.S., over 5.4 million cases of non-melanoma skin cancer are treated in more than 3.3 million people. There are more new cases of skin cancer than breast, prostate, lung and colon cancer combined. One in five Americans will develop skin cancer over the course of a lifetime. Melanoma, the most lethal form of skin cancer, can develop from less dangerous types of skin cancer. Although simple screening exams can catch this process before it begins, one person dies of melanoma every hour. We are all vulnerable.

As a member of the National Council on Skin Cancer Prevention, NASA has been working to lower these statistics. In its first year on the council, NASA was awarded the American Academy of Dermatology’s Golden Triangle Award for our agencywide skin cancer prevention efforts.

Please take time to view this message from Dr. J.D. Polk on behalf of the Office of the Chief Health and Medical Officer and remember to set up a screening exam during your next physical. It’s quick and simple and could possibly save your life.

MMS Transporter Fire

Importance of IRT Training

May 02, 2016

What would your project team do if your NASA-owned flight hardware suffered mishap-level damage during transport? Would someone be prepared to assume the duties of an incident commander or NASA Interim Response Team (IRT) lead?

That’s exactly what happened when one of Goddard Space Flight Center’s four Magnetospheric Multiscale Mission (MMS) spacecraft was nearly damaged in a close call. After an internal fire that threatened the spacecraft was extinguished, the project safety manager who was traveling with the spacecraft secured the flight hardware, took photos, recorded witness statements and impounded data. Proper safety training and quick-acting personnel can mean the difference between a delay in a testing schedule and the total destruction of flight hardware. In this case, there was minimal impact to the schedule and no flight hardware damage.

Ground Effect

Gulfstream G650 Test Flight Crash

April 04, 2016

On April 2, 2011, four highly experienced, proficient flight test personnel were killed when a Gulfstream G650 crashed during certification testing. Gulfstream was focused on obtaining Federal Aviation Administration type certification by the third quarter of 2011. Among other factors, the crash was traced to incorrect calculation of takeoff and stall speeds and a superficial review of two previous similar near-crashes.

NASA is doing more development testing now than any time since Apollo. We must preserve essential testing against time and budget constraints. Let’s harness this Gulfstream event to make our tests sound and representative of environment, system and human interfaces. 

Lightning Safety

March 07, 2016

Lightning strikes the U.S. about 25 million times each year, killing an average of 49 people. Many more are struck and suffer severe injuries. Many of us may carry inaccurate, preconceived notions concerning what to do when lightning strikes. This month, Steve Cash, director of Safety and Mission Assurance at Marshall Space Flight Center, debunks the myths and shares the hard facts of lightning strikes. He also addresses what we can do to protect ourselves.

The Role of "Heart" in Heart Disease

February 01, 2016

As leaders, we often say, "Take care of your people." Some of the most serious risks we face pay no attention to workplace boundaries. This month, Grant Watson, director of Safety and Mission Assurance at Langley Research Center, shares his personal message of how his parents’ heart disease changed how he regarded his own heart health. May his message encourage you to reflect on your own story and on lowering heart disease risk.

Dissenting Opinions

January 04, 2016

After a mishap or major disaster, it’s natural to ask what we could have done better had we only known about a defect or flaw sooner. Sometimes those who see something before the test begins or the vehicle launches speak up. Sometimes they’re heard. NASA has experienced mishaps and tragedies where individuals within and outside of our agency had technically sound differing views that were never heard by decision-makers.

Although NASA’s process for submitting a dissenting opinion is outlined in NPD 1000.0B and NPR 7120.5E, which is supplemented by the NASA Spaceflight Program and Project Management Handbook, Program and Project Managers should be actively seeking out dissenting opinions and addressing them in a clear, open and timely manner. This presentation focuses on the dissenting opinion process, identifying what a dissent is and how the process should unfold. Please take some time to thoughtfully reflect on this presentation; after all, it could be one of us whose choice to speak up saves lives someday.

Administrative Controls for Fire Safety Hazards

December 07, 2015

In the last half of 2015, three separate fire incidents occurred at NASA's Glenn Research Center. All three fires were immediately detected by operators or fire alarm systems and extinguished. Damage was assessed and regular operations resumed within a few days of each incident. Two fires were classified as Close Calls and one was classified as a Type D Mishap.

While response to each fire was excellent, such incidents give us the chance to refine preventive administrative controls by establishing pre-fire plans that address pre-operation checks, maintenance services and rapid detection of incipient fires. Changing weather and holiday leave periods can heighten risk of fire occurrence, while lowering the odds of on-scene employee detection. Some administrative fire prevention controls can even be applied informally to increase home safety. Engineering preventive barriers and firefighting controls are of course essential in the workplace, but administrative controls are important as well.

NASA Aviation Safety

Procurement Quality Assurance

November 02, 2015

Procurement of aircraft parts without specific knowledge and expertise is a significant risk. Each center that operates aircraft currently handles parts acquisition differently, without a standard set of agency-wide processes or procedures. By consolidating aircraft parts purchases at the NASA Shared Services Center (NSSC), we have the ability to standardize the acquisition of quality aircraft parts and services.

However, this is not without its own challenges. While some flying centers will see no impact since they acquire aircraft parts via existing maintenance contracts, other centers face greater potential issues. Centers no longer have closed-loop systems to ensure that the aircraft parts they are purchasing are the parts that they receive. Non-flying centers may acquire Unmanned Aerial Systems subject to the same risks. If this situation is not addressed properly, we could be increasing risk to our aircraft fleet. Combining the knowledge of aircraft maintenance experts from each center with the NSSC may be the solution to providing parts and supplier assurance.

Workplace Safety on ISS

October 05, 2015

Like any other NASA facility, the International Space Station (ISS) requires regular maintenance and upkeep. Inside the ISS’s dynamic environment, regular cleaning and routine inspections prevent both health and hardware problems. The crew cleans essential systems, work stations and emergency equipment to ensure readiness for use at a moment’s notice. Although the crew is constantly attentive, every six months a crewmember films the entire cabin interior and egress path of the U.S. Orbital Segment so that engineers on the ground can evaluate conditions from a fresh point of view.

If a new perspective is beneficial for the ISS crew, imagine how helpful it can be for you.

Immediately Dangerous to Life and Health

The Cost of Failing To Identify and Mitigate IDLH Hazards

September 08, 2015

On Nov. 15, 2014, four workers died and a fifth was hospitalized after exposure to a 24,000-pound methyl mercaptan leak at a DuPont plant in La Porte, Texas. The leak occurred in a building that was positioned over chemical plant piping, which included a failed valve. The Chemical Safety Board investigation later found that the valve had no documented function and served no manufacturing purpose. The enclosed office space was not designed to be a confined space, and hazard assessments did not identify Immediately Dangerous to Life and Health (IDLH) risks within the space. However, the workers were overcome by toxic gas while doing normal work there.

This month we look at the DuPont La Porte incident and three other incidents where workers were overcome by toxic or oxygen-displacing gases while performing routine work. In each of these situations, the risk of personal exposure to IDLH atmospheres was either not identified or underestimated by management or the workers operating in those environments.

Testing Flight Hardware

Director's Safety Message

August 03, 2015

On May 7, 2007, the composite reflector for the Aquarius spacecraft underwent acoustic testing at the Jet Propulsion Laboratory Environmental Test Chamber. The reflector was damaged during a test run that deviated from the normal test procedure. Additionally, the test control system software was not up-to-date, and no acoustic subject matter expert was present during the test. Although Aquarius launched and successfully completed its mission, the test deviation and lack of diligence resulted in a Class A Mishap.

Due diligence is necessary when engineering, operating and maintaining state-of-the-art flight hardware. We test as much as possible in order to assure high rates of success across all of our programs and projects. Testing itself may be viewed as a measure of diligence — perhaps even a luxury during periods of low funding. However, it is necessary that we are thorough and conscientious in our testing procedures as well.

The Value of a Sustained Maintenance Program

A Lesson Learned the Hard Way

July 06, 2015

Maintenance of infrastructure has been a popular topic for news media this past year, with outlets reporting on the degradation of dams, bridges and even the U.S. highway system. At NASA, the challenge of balancing rising maintenance costs and renovating, replacing or repurposing decades-old infrastructure grows with the end of each major program and with flat or declining budgets.

In 2014, the cost of not performing maintenance on a low-risk system became apparent when Langley Research Center’s Transonic Dynamics Tunnel suffered a cooling coil breach and subsequent water intrusion. Due to the unique operating parameters of the tunnel, mitigating the leak was a lengthy and challenging process. Moreover, the inoperative tunnel cost Langley upwards of $2 million in potential testing revenue. Although a series of fiscally sound decisions may prevent systematic maintenance in the short run, we must be aware of the long-term risk involved with every system.

NASA and Unmanned Aircraft Systems

Know the Rules

June 01, 2015

NASA projects are flying Unmanned Aircraft Systems (UAS) at an ever-growing rate to complete scientific research, assist other government agencies with emergency response and learn how to safely navigate the National Airspace System along with crewed aircraft. While many weigh just a few pounds, measure flight time in minutes and are limited to line-of-sight control, some weigh thousands of pounds, have international range and utilize satellite-link control. A wide range of potential issues exist.

What are the rules? Who helps researchers and operators understand and follow them? This month's message focuses on SMA requirements for UAS operations in a world of change.

What Role Does NASA Leadership Play in NASA Safety?

Watching for Signals, Keeping the Hunger, Setting the Tone

May 04, 2015

Any time there’s a lull in mishaps or high-visibility close calls, we have a natural tendency to shift focus to other demanding areas (cost, schedule, program risks). This can distract us from noticing the subtle clues that indicate that the next serious incident is about to occur.

Weak signals of danger are always evident, even if we do not notice them over the demands of daily activities. Attentiveness is the single most valuable habit to develop during these lulls. Everyone in the organization can and should stay vigilant to spot these signals and know what to do when they encounter one. In addition, when an indicator is identified, the organization needs to determine if it constitutes an acceptable risk, or if action is necessary.

This month's Safety Message examines situations when signals appeared and were missed, or when a signal was communicated but the risk was not understood.

Backover Accidents

Preventable Tragedies

April 06, 2015

Every year, thousands of children — about 50 a week in the U.S. — are harmed because drivers who were backing up did not see them. These incidents often take place in residential driveways or parking lots. Most of them involve large vehicles, like trucks, vans and SUVs. Most of them involve a parent or close loved one behind the wheel. Please take some time out of your day to reflect on these figures and what you can do to keep your own family and those around you safe.

Return to Flight

The Seven Elements of Flight Rationale

March 02, 2015

The seven-elements approach to flight rationale systematically identifies weaknesses within a given "prove it's safe" argument for flight approval, allowing mitigation options to be discussed. Thus, those with the power to say "yes" to residual flight risk can better understand what is being accepted.

The Cost of Silence

Normalization of Deviance and Groupthink

November 03, 2014

The Challenger and Columbia disasters, the Costa Concordia running aground and the recent ISS EVA-23 event all involved normalization of deviance and groupthink. Understanding the symptoms of these two conditions can help us plan for them, avoid them and stop them.

Student and Intern Safety

Lessons Learned

September 16, 2014

It doesn't have to be spectacular to hurt. A lot of people get hurt doing unspectacular things, such as preparing food, cutting across lawns, tripping on floor mats and lifting and moving objects. People also get hurt when they are unfamiliar with their workplace and the associated hazards.

What NASA does is risky. When we bring new people on board, it is imperative that we train them in general safety practices and specific hazards of the job and environment. Managers need to be sure that all employees are aware of what they need to do to work safely. All NASA employees need to know what to do in an emergency and how to report safety concerns. Preventing unspectacular mishaps will help prevent larger and more serious mishaps.

The Importance of Technical Authority

A Message from Administrator Charles Bolden

August 04, 2014

Technical Authorities in engineering, safety and mission assurance, and health and medical all play an important role in NASA's decision-making processes. Hear Administrator Charles Bolden deliver this month's Safety Message on the importance of Technical Authority.

Fall Prevention in Construction

Fall Prevention

June 30, 2014

As NASA builds the future of U.S. space flight, we will not just be building mission hardware and software. We will also be constructing new facilities and modifying existing facilities.

Falls are the leading cause of fatalities and injuries on construction sites. Every year, more than 200 people are killed and more than 10,000 seriously injured in falls. We need to make sure that we understand and effectively implement fall protection requirements for our entire workforce. Proper use of fall protection programs saves lives.

Creating a Strong Safety Culture

Lessons Learned

May 27, 2014

Effective in-line checks and balances, healthy tension between responsible organizations, and value-added independent assessment are necessary for safe and successful programs. But in order to create a strong safety culture, we need to move beyond robust technical authority and effective program management.

Creating the right environment is essential. We need to look not just at the way we manage programs, but also at the way we manage people. Establishing trust, creating diverse teams, focusing on engineering excellence, sharing knowledge and recognizing commitment to safety are all necessary for creating a strong safety culture.

Multi-Billion Dollar Deception

Counterfeit Parts

May 01, 2014

Counterfeit parts pose a serious and growing threat to our nation and to NASA. Counterfeiters understand the strengths and weaknesses of the aerospace supply chain process, and they know how to exploit this process for profit. They know who to target to get the counterfeit part entered into the supply system, and how to cover their tracks. To counteract this, NASA has implemented a suite of defenses to guard against the threat of counterfeit parts. By using trusted supply sources, screening parts through the Government-Industry Data Exchange Program, developing rigorous authentication testing methods, educating the NASA workforce and championing legislative/regulatory changes, NASA is working to ensure that counterfeit parts do not impact the safety and success of NASA missions and operations.

ISS EVA 23 Suit Water Intrusion

Lessons Learned

April 07, 2014

Roughly 44 minutes into EVA 23, astronaut Luca Parmitano reported water inside his helmet on the back of his head. The EVA ground team and Luca were unable to identify the water’s source. As Luca continued to work, the amount of water in the helmet increased and eventually migrated from the back of his head onto his face. EVA 23 was terminated early and the crew safely ingressed the airlock.

Close calls like this are a gift. We must use them wisely. Each close call is an opportunity to examine our work with fresh eyes and a renewed sense of urgency. They urge us to think outside the box. They tell us that the obvious answer may not always be correct. They force us to stay hungry and shake off the complacency that comes from past successes. 

The Legacy of Orbital Debris

Orbital Debris and Space Safety

March 03, 2014

The Space Age has opened the frontier of space to humanity, but it has also left a troubling legacy of orbital debris. The problem has grown to the point that even the general public is now keenly aware of it. In all areas of space, NASA has always led the way, and it is no different for the field of orbital debris. What is NASA doing about the situation? How is NASA taking the lead in being part of the solution?

Apollo 1-Challenger-Columbia

Lessons Learned

January 23, 2014

Every year as we get back in the saddle, our Day of Remembrance signals us to think back to the tragic events of Apollo 1, Challenger and Columbia as well as the context surrounding them. We’re accountable for learning not just from what went wrong, but from how we recovered. As an example, read these rules recorded by Wayne Hale, former NASA Flight Director and Space Shuttle Program Manager. By applying such lessons from the past to our current work, we can honor our lost crews beyond silent reflection.

A Loaded Magazine

The Honolulu Fireworks Disposal Explosion

July 01, 2013

Following a NASA Office of Inspector General (IG) review of NASA's Explosive Safety Program, we look to learn from an industry example where poor oversight, a lack of training and a void of industry regulation led to a deadly energetic material accident.

With tight U.S. regulations on fireworks manufacturing, it comes as no surprise that the majority of fireworks consumed in the U.S. are foreign imports, and with these imports come mislabeled or questionable materials that can prove dangerous to the public. Once inspected and seized by law enforcement, such contraband fireworks must be destroyed. In this case, the subcontractor tasked with disposing of the fireworks attempted to destroy them without the proper training and was permitted to do so because safety regulations at multiple levels failed to address disposal of hazardous materials like fireworks.

As the agency implements its plan to improve its explosive safety program, we can reflect on the potential consequences of insufficient barriers and controls.

The Case for Safety

The North Sea Piper Alpha Disaster

May 06, 2013

System safety engineering is embraced at NASA from the beginning of the program/project life cycle to the end. Historically, an assurance model has been the paradigm, expressed at each life cycle stage via oversight or insight into requirements development and compliance.

Assumptions are made to identify critical areas of risk so that advanced analytical tools such as Probabilistic Risk Assessment (PRA) can be reasonably and efficiently applied. This has proven to be a successful technical approach, except when the assumptions themselves miss scenarios driven more by complex social interactions.

We can learn from a sentinel 1988 event in the petroleum industry: the loss of 167 personnel and $3.4 billion in damage following fire and explosions on the Piper Alpha offshore oil platform. Design flaws hindering communications, emergency procedures and evacuation conspired with an unfortunate configuration change and a deficient work permit process to doom workers. The North Sea oil drilling industry changed dramatically as a result, with new regulations calling for a "safety case": a compelling set of documents that could prove a drilling system was safe to an acceptable degree. With the safety case concept in place, the entering assumptions for safe system development and operation can be covered completely and systematically.

The NASA System Safety Handbook, Volume 1 is your source to discover how the NASA safety case, called a Risk-Informed Safety Case (RISC), should be constructed.

Through a New Lens

Apollo, Challenger and Columbia Through the Lens of NASA's Safety Culture Five-Factor Model

March 29, 2013

This past February 1 (Remembrance Day) is not the only time we can reflect upon the three NASA human spaceflight mishaps of the last four decades. Looking at these tragic events, Apollo 1, Challenger and Columbia, through the lens of NASA Safety Culture can inspire us to further examine current programs and projects on any day.

Do we continue to enhance our reporting culture while remaining flexible enough to meet new demands? Are we just in rewarding this reporting or do we "shoot the messenger?" Do we learn enough from our close calls to prevent deadly mishaps from occurring? And finally, are we constantly engaged in positively affecting the agency’s approach to safety?

Read, then try looking around you at your organization with an eye toward Reporting Culture, Flexible Culture, Just Culture, Learning Culture and Engaged Culture. What do you see? What’s being done about it?


Survival and the United States’ Most Advanced Fighter Jet

March 04, 2013

A routine training mission came to a tragic end when an F-22A pilot was killed in a crash during a return to base. The frigid night mission, flown out of Joint Base Elmendorf-Richardson near Anchorage, Alaska, demanded the use of night vision goggles and bulky cold weather flight suits and gloves. Although the United States Air Force legal investigation board for this accident deemed the crash to be caused by channelized attention, it also suggested that personal protective equipment (PPE), intended to protect the pilot, may have obstructed movement as he tried to activate an emergency oxygen supply. This tragic loss prompts us to consider the usability of our own PPE and the effectiveness of emergency training under real-world, off-nominal conditions.

Kiloton Killer

The Collision of the SS Mont-Blanc and the Halifax Explosion

January 03, 2013

When NASA activities are planned, our first priority must remain to protect the public and uphold public trust. This trust is achieved by communication between government and the people it serves — a task not without challenge. Secrecy bred of competition and proprietary technology can threaten communication between industry and government points of contact. This creates a barrier to sharing essential safety information, hidden against some other perceived kind of risk. Such information, known to some but not all of those with a need to know, can be termed an "unknown known." This is the story of a great disaster, the Halifax Harbour explosion of 1917, in which a dangerous munitions cargo entered a busy port, unplanned for, known to a few but unknown to key risk owners. An outbound ship struck the explosives-laden French freighter, sparking the largest man-made explosion the world had yet seen. The sheer devastation made casualty counts difficult: approximately 2,000 were dead and 9,000 injured. Modern emergency planning and relief efforts sprang from this tragic event, the first disaster given extensive investigative treatment, which can help us plan better nearly a century later.

Vapor Trap

The Xcel Energy Confined Space Penstock Fire

December 03, 2012

When critical safety requirements for hazardous work are not clearly identified in a project, the risks of prioritizing schedule over safety become invisible to the real risk owners (project managers and operators physically exposed to hazards). If the all-important discussion fails to occur at the risk-owner level, going forward by cutting technical margins can be seen as efficient from a cost/schedule/quality risk viewpoint. It may also result in a tragedy like the Xcel confined space tunnel fire, where risk owners became blind to latent hazards awaiting nine industrial painters recoating the inside of a hydroelectric station’s penstock tunnel. Failure to mitigate the dangers of flammable thinners in confined spaces resulted in the needless deaths of five of those nine painters.

From Rockets to Ruins

The PEPCON Ammonium Perchlorate Plant Explosion

November 05, 2012

Although the 1988 PEPCON disaster in Clark County, Nevada, killed two employees and had the potential to kill many hundreds more, time and the remote location have distanced us from its lessons. One of the ammonium perchlorate (AP) explosions that day matched the explosive yield of a one-kiloton nuclear airblast and registered on seismic instruments in other states. Awaiting NASA’s return to flight following the Challenger mishap, stockpiled AP quietly accumulated in storage containers unsuited for the chemical’s massive energy potential. Hot work maintenance was scheduled and performed without understanding the potential risk; when a spark from hot work ignited material covered in AP residue, the inadequate fire response systems and procedures were utterly incapable of intervening. Consider whether this case motivates checks for accumulating hazardous material at your center, especially if high-energy systems exist or activities will occur in close proximity to potentially dangerous material.

Driving Safely Is Everyone's Mission

NASA Vehicle Safety

October 01, 2012

This month we’ll be discussing in a new way an issue that continues to affect our agency: transportation safety. Motor vehicle accidents accounted for nearly 40 percent of all damage mishaps at NASA locations between 2009 and 2011. Transportation mishaps can range from the unanticipated financial burden of repairs to serious injuries or even death. While the latter end of that spectrum seldom occurs, it is vital that we educate ourselves on the most common situations in which motor vehicle accidents transpire at NASA in order to reduce risk. The entirety of the NASA Safety Center’s informative Transportation Safety campaign is available at (NASA only). Please take a look, because driving safely is everyone’s mission.

"What's Happening?"

The Loss of Air France Flight 447

August 13, 2012

Asked the copilot, as the Airbus A330 that was Air France Flight 447 dropped like a stone toward the dark Atlantic on the night of May 31, 2009. The copilot who was struggling to fly the jet and the captain who returned from rest utterly failed to comprehend the many alerts, tones and instrument cues of an aircraft in a fully stalled state. Erratic airspeed indications from sensors clogged with ice crystals triggered a complex chain of events and conditions that baffled the crew during an agonizing 125-mile-per-hour descent from 37,000 feet that lasted over four minutes. Others had experienced Airbus airspeed problems and lived; how could this tragedy, costing 228 lives, occur? What can we learn about the design and operator training of our own complex systems? More than three years later, we have the final investigation report from the French BEA to examine.

Don't Mess With Excess

Texas Tech University Laboratory Explosion

July 09, 2012

In 2010, a Texas Tech University chemistry graduate student was seriously injured after an energetic compound he was working with detonated. The student lost three fingers, received burns to his hands and face, and suffered an eye injury. Almost three years prior to this incident, two close calls occurred in the same department; one was even in the same research project. Laboratory research by students is ongoing at NASA centers and at college campuses supporting NASA research and education activities. At any given time, hundreds of students participate in NASA on-site activities through education outreach, intern and cooperative education programs. Hundreds of other students and faculty members participate in NASA research grants across the country. These students are often exposed to many of the same potentially hazardous environments as our regular full-time employees. The lessons learned from these events provide NASA with an important opportunity to reflect on and scrutinize our own policies and practices (e.g., comprehensive chemical hygiene plans, hazard communications and lessons learned programs) and the barriers to safety that existed at TTU leading up to the incident. Even with the attention that goes into preparing our grant provisions, it is the NASA and contractor veterans working directly with these young, talented employees and faculty members who most influence their safety and health.

Balloon Mishap in the Outback

Nuclear Compton Telescope Balloon Mishap

June 04, 2012

The Balloon Program conducts frequent flights globally for NASA's scientific and technology development investigations, while also serving as an important training ground for tomorrow's scientists and engineers. NASA's scientific balloon activities date from the earliest days of the agency, with over 2,500 balloon missions conducted, and have enabled discoveries about our Earth, the Sun and the universe. The aborted launch of the Nuclear Compton Telescope from Alice Springs, Australia, in April of 2010 called into question the methods used for decades in conducting safe balloon launches. The investigation team concluded the mishap stemmed from the failure of the launch mechanism, combined with insufficient risk planning, training and safety oversight. NASA program and safety leadership conducted an extensive evaluation of balloon safety processes following the mishap, developed a corrective action plan to address the recommendations from the mishap review, and was given approval to resume flights in December 2010. Since then, the Balloon Program has safely and successfully conducted balloon launches from Antarctica, Australia, Sweden and the United States. Many of the conditions that led to the mishap could have been prevented by better risk analysis, contingency planning, personnel training, government oversight and public safety accommodations. This mishap shows us the impact of safety procedures, training and communications on mission success.

Counterfeit Electronic Parts

Situation Report

May 14, 2012

This month we're stepping back from large-scale system failures to look at an emerging issue for NASA and the aerospace industry: counterfeit components. Counterfeit components have created an increasing hazard at NASA, impacting project costs, performance and schedules and increasing the potential for mission failures. The attached view graphs present facts surrounding the counterfeiting of electronic components (i.e., circuit boards and computer chips) and inform us on the proliferation of illegal offshore production and international brokerage of non-compliant parts. As the new era of commercial spaceflight begins, NASA must use all available defenses to prevent acquisition or use of counterfeit components. This requires strict enforcement of compliance standards, verification testing, vigilant reporting and awareness of authorized suppliers. Although a challenging task in a globalized business environment, NASA must endeavor to assure quality while fostering technological innovation and set the example for commercial spaceflight programs that will undoubtedly encounter similar challenges.

The Poldercrash

Turkish Airlines Flight 1951

April 15, 2012

According to Charles Perrow, interfaces, whether they be hardware or software, may enable unintentional sequences not immediately visible or comprehensible, presenting complex interactions. Much the same can be said for tight system coupling, where no buffers exist to prevent one input from having an immediate and direct impact on an entire system. Either of these situations presents innate hazards; combined, a catastrophic outcome can become expected, even normal. In the Feb. 25, 2009, crash of Turkish Airlines Flight 1951, complex interactions inherent in outdated automated flight controls were combined with yet another ingredient: social complexity. As Flight TK1951 approached the Polderbaan runway at Amsterdam-Schiphol International Airport that day, shifting interfaces met uncoordinated organizational policy. These effects manifested themselves with disastrous consequences, claiming the lives of nine people and injuring dozens more aboard Flight 1951.

Out of Line

San Bruno Pipeline Explosion

March 05, 2012

The Sept. 9, 2010, PG&E pipeline explosion that reduced San Bruno, California, to smoke and rubble will remain vivid in the hearts and minds of the community's residents. The incident left eight people dead and 58 injured, and affected 108 homes. The earth-scarring explosion that took so much from those residents provides a hard lesson in identifying latent hazards and making organizational choices. As much as we can point out negligence and the folly of "run-to-failure" attitudes in this tragic accident, we must feed our motivation to discover and act on those latent conditions, known and unknown, that lead to errors and unsafe situations. This is the learning culture we are trying to further foster within NASA.

Good Design, Built Right

Critical Software

February 06, 2012

Ten stories of loss, from lost communication with aircraft and spacecraft to catastrophic launch failure to the death of hundreds of passengers when an airliner strikes a hill on final approach. What do these stories have in common? Structured, rigorous validation and verification of software requirements, implementation and change management could have prevented them all. Here we will learn hard-won lessons about the kinds of software defects, and the defect-defeating options available to project managers, system and software engineers, software assurance engineers and other key players in the complex, unforgiving environment where software is needed to control hardware and human risk decisions are always involved.

Safe Anyway

RAF Nimrod XV230 Crash Over Afghanistan

January 09, 2012

In September 2006, 14 members of the Royal Air Force lost their lives during a NATO-led offensive in southern Afghanistan. Though all wartime casualties leave indelible scars on surviving friends and families, the loss of these servicemen bears particular tragedy because they died in a preventable accident. That accident took place because misplaced confidence and false assumptions projected an illusion of safety that care and vigilance might have shattered. This is a story that unfolds in every industry, in every sector. It is the story of an organization that allowed complacency to trump watchfulness and transition to undermine safeguards. NASA faces significant changes now. What steps will we take to sustain a safety culture that will carry us through the upheaval such transitions can bring?

Porthole to Failure

The Sinking of the Ocean Ranger

December 12, 2011

In 1982, the Ocean Ranger was the largest, most advanced mobile offshore drilling unit in the world. After six years of operation, the massive oil rig had proven it could withstand the North Atlantic's most severe storms, and many people described it with one word: unsinkable. All of that would change on Feb. 14, when one overlooked detail — failure to cover a window before a storm — weakened the oil rig's defenses. The Ocean Ranger's 84 crew members died that night, and the rig itself capsized at around 3 a.m. One of the most tragic details of this disaster is that the crew could have stopped the devastating chain of events had they possessed a more comprehensive understanding of the system's design and intent. We must view the lessons of the Ocean Ranger as a somber reminder that we must strive to exhibit all characteristics of a safety culture no matter how infallible our modern machines may seem.

Trial by Fire

Space Station Mir: On-Board Fire

November 07, 2011

On Feb. 23, 1997, the six crew members on board Space Station Mir were enjoying a rare moment of relaxation when a fire suddenly erupted from the spacecraft’s supplemental oxygen generator. The fire cut off access to one of two Soyuz escape vehicles and filled the space station with thick black smoke. Swift action and teamwork saved the crew, but the incident brought to light several shortcomings in emergency preparation, communication and safety drills. This story provides a platform from which we can discuss the importance of the fundamental issues related to the Mir crisis, which also apply to NASA and other international or commercial aerospace organizations. While many differences separate these entities, all of them should rely on a common culture of safety that places its emphasis upon mission success.

Tough Transitions

STS-1 Pre-Launch Accident

October 03, 2011

Communication lapses lie at the root of many mishaps and close calls and can set off complicated event chains that lead to disaster. One such string of mistakes took place in the weeks preceding Space Shuttle Columbia's maiden voyage in March 1981. A countdown demonstration test had just concluded, and controllers opened the pad area for normal work. The controllers did not know that a hazardous condition — an atmosphere of pure nitrogen — still existed in the shuttle's aft compartment. Without warning signals or other indications of the oxygen-deficient space, technicians entered the area and collapsed just seconds later. Over the course of 15 minutes, six technicians were exposed to the nitrogen atmosphere, and three of them eventually died because of it. This was the third successive time that tragedy struck the inaugural mission of a human spaceflight program. This story illustrates the prevalent and far-reaching effects of systemic safety issues and reminds us of the vigilance required to keep those failure modes at bay.

Loss of Detection

D.C. Metro Railway Collision

August 01, 2011

On June 22, 2009, a shocking accident rocked routine rush hour traffic, taking the lives of nine commuters and injuring dozens of others on the Washington Metropolitan Area Transit Authority (WMATA) railway. That day, the automatic train control system, which determines train speed and spacing, failed to detect the presence of inbound Train 214. As a result, Train 214 ground to a halt not far from the Fort Totten station. Meanwhile, the following train — number 112 — coasted across the rails at the maximum speed of 55 mph. A bend in the track eliminated any opportunity for Train 112's operator to observe the stopped Train 214 in time, and only seconds after what must have been a horrific realization for its operator and passengers, Train 112 barreled into Train 214 at significant speed. In the wake of this disaster, we find a reverberating lesson that must not fade with familiarity: commitment to safety must be demonstrated at the highest levels, and it must impact every facet of an organization to foster a safety culture that is truly effective.

Communication Aberration

HST Optical Systems Failure

July 10, 2011

After spending 20 years orbiting Earth, Hubble Space Telescope (HST) has recorded more than 570,000 images of 30,000 celestial objects. This data has changed the face of astronomy by helping scientists around the world gain a deeper understanding of the universe, and Hubble is now regarded as one of the most important observatories ever built. Hubble’s origins, however, were far from auspicious. When the finished project launched in April of 1990, Hubble began transmitting severely blurred images of the cosmos, crushing the general expectation that HST would outperform any earthbound observatory. Investigation teams later discovered that Hubble could only record blurred images because its primary mirror had been polished into the wrong shape. NASA was able to correct this error during a service mission three years later, and since then, HST has surpassed its performance specifications. Many of the events leading to the misshapen mirror could have been prevented by better managerial practices, better risk identification and better enforcement of Quality Assurance procedures. Ultimately, however, the HST optical systems failure resulted because managers disregarded evidence of threats to mission success while facing significant schedule and budget pressures. This month, we discuss the importance of assigning clear responsibility, ensuring rigorous documentation and remembering the mission during times of crisis.

Clear the Way!

NASA Slips, Trips and Falls

June 25, 2011

This month we're stepping back from large-scale system failures to look at a pattern of incidents that seem trivial on their own, but taken together, injure many of our employees: slips, trips and falls. In 2010, such incidents accounted for more than 40 percent of NASA's lost time injury mishaps. Of those injuries, 75 percent were falls on the same level, or slips and trips that injured but did not result in a fall. Most injuries took place during normal activities rather than high-risk operations. To tackle this systemic problem, we need to find and mitigate trip hazards when feasible, while keeping ourselves mindful and vigilant of changing conditions, be they our physical limitations or changing environments.

Strayed Spears

Unauthorized Nuclear Weapons Transfer

May 02, 2011

During a missile decommissioning procedure on Aug. 29-30, 2007, airmen at Minot Air Force Base, North Dakota, mistakenly loaded six nuclear warheads onto a B-52 bomber destined for Barksdale Air Force Base, Louisiana. Airmen failed to handle the warheads in accordance with U.S. Air Force nuclear weapons regulations, allowing the warheads to bypass five safety nets and resulting in the unauthorized transfer. This event is considered one of the most serious breaches in the Air Force's positive control of nuclear weapons. After studying this incident, we find that a slow evolution of expedient processes eroded firmly established safety protocols over time. This study reminds us that in applying NASA's broad range of procedures and processes to high-energy systems, it is not enough to comprehend procedural steps — operators and managers alike must understand the rationales behind the procedures to avoid losing sight of safety goals and of the consequences of mission failure.

Got Any Ideas?

Miracle on the Hudson

April 11, 2011

On Jan. 15, 2009, residents and tourists near midtown Manhattan witnessed history. Close to 3:30 that afternoon, a silent airliner glided onto the frigid Hudson River, coming improbably to rest intact and on the surface. Within minutes, amazed passengers scrambled onto both wings and the inflated emergency slides, waving to commuter ferries and boats rushing to rescue them. Across the world, television viewers gaped at the unfolding story: bird strikes to both engines caused them to shut down, turning the U.S. Airways Airbus A320 into an 85-ton glider. Without sufficient altitude or airspeed to land on any nearby runway, the skilled flight crew successfully ditched the plane in the river. All had escaped fatal trauma, hypothermia and drowning. The "Miracle on the Hudson," as the event came to be known, is a testament to how solid leadership, systems knowledge and comprehensive preparation enable correct time-critical decisions to adapt and survive.

Vicious Cycle

X-15 In-Flight Breakup

March 07, 2011

North American Aviation developed the X-15 for a program seeking to investigate winged flight and human performance at the edge of space. Three X-15s made 199 flights, pioneering research that would benefit every subsequent U.S. human spaceflight program. On Nov. 15, 1967, U.S. Air Force Major Mike Adams was scheduled to fly the X-15 on its 191st flight. Major Adams was a skilled and experienced test pilot, and the team expected another successful mission. But when an electrical disturbance coursed through the aircraft, the flight control system was degraded at an unforgiving instant. Major Adams entered history’s first hypersonic spin, followed by an inverted dive. Massive g forces from these events incapacitated him, leaving him unable to eject before the aircraft broke apart high above the desert. A lack of component qualification testing, a degraded flight control system, possible pilot vertigo and misreading a single, deceptive flight instrument all led to departure from controlled flight. This month, we honor Major Adams’ memory by considering this story, especially as designers conceive new commercial vehicles to fly passengers to the edge of space and back again.

Dust to Dust

Imperial Sugar Company Dust Explosion

February 07, 2011

When night-shift employees of Imperial Sugar Company’s refinery in Port Wentworth, Georgia, reported for work on Feb. 7, 2008, they had no reason to suspect that a disaster was about to occur. Visitors to the plant on that night, or any night prior, would have found its interior encased in sugar and sugar dust. The residue rested on conduits, covered machinery and coated the floor. The white particulate — inches deep in several places — looked innocuous enough, but it posed an insidious hazard of which many employees were unaware: the dust was highly combustible. The refinery had operated for more than eighty years without a major incident, but that night, everything changed when an explosion near a conveyor belt triggered a chain reaction of violent explosions that devastated the facility and took the lives of 14 workers. As is too often the case in events such as this, inadequate training and incomplete emergency preparation were among the factors leading to the tragedy. The Chemical Safety Board, which investigated the accident, also cited normalization of deviance as a direct cause. Analyzing this case emphasizes the importance of guarding against complacency, maintaining strict safety standards, and cultivating a culture of preparedness.

Fire in the Sky

TWA 800 In-Flight Breakup

January 09, 2011

On a hot July day in 1996, a Boeing 747 carrying 230 people departed New York’s John F. Kennedy International Airport on a flight to Paris, France. The aircraft experienced an uneventful takeoff and initial ascent, but only 12 minutes into the flight, a sudden and catastrophic explosion in the center wing fuel tank tore the fuselage apart, raining debris into the Atlantic. All passengers and crew members lost their lives. National Transportation Safety Board (NTSB) investigators needed four years to retrieve the wreckage, reconstruct the aircraft and determine the probable cause. In its official report, the NTSB concluded that excess energy entered the center fuel tank through a short circuit in external wiring. Then, a latent fault on probes inside the tank most likely caused an electrical arc that ignited the flammable fuel/air mixture, leading to the explosion and structural failure. The 230 passengers and crew on Flight 800 paid the ultimate price when the accident exposed flawed assumptions regarding aircraft design practice. This study details those assumptions and emphasizes the need to continually re-evaluate our projects and equip our systems with additional layers of safety to protect against wrong assumptions and unanticipated failure modes.

Spektr of Failure

Mir-Progress Collision

November 08, 2010

1997 marked the third year of a collaborative space project between the United States and Russia known as the Shuttle-Mir partnership. This program sent U.S. astronauts to Space Station Mir, where they worked with Russian cosmonauts on life science, microgravity and environmental research. Automated supply vehicles called Progress visited the station every four months to deliver fresh supplies and to collect accumulated rubbish. These spacecraft normally docked with Mir using a Ukrainian docking system called Kurs. However, Russia's financial difficulties and Ukraine's rising prices made the Kurs system unaffordable, and the Russians began implementing an existing manual docking system, called TORU, to dock Progress with Mir. In June of 1997, Russian Mission Control instructed the crew on board Mir to test this docking system on the Progress M-34 freighter. The test ended in disaster when the Progress vehicle sailed past the docking node, slammed into a solar array and bounced into Mir's Spektr Module. The impact punctured the hull and caused the first-ever decompression on an orbiting spacecraft. The lessons of this incident remind us that communicating and understanding the technicalities behind a system are crucial to making rational, informed decisions when off-nominal situations arise. It also emphasizes the importance of analyzing failure modes introduced by new systems, accounting for such possibilities and formulating backup plans.

Brace for Impact

MV Bright Field Allision

October 03, 2010

Flag-of-Convenience vessels are ships registered in a nation other than the country where their owners reside. Motivating factors for such an arrangement include inexpensive taxes, cheap labor and low maintenance standards. The Motor Vessel (MV) Bright Field was one such freighter: it was operated by a Chinese crew and registered under the Liberian flag. On Dec. 14, 1996, the Bright Field departed the U.S. for Japan while carrying a cargo of American grain. Its voyage ended only hours after it began, when an engine trip caused by low oil pressure left the crew powerless to navigate the massive freighter. Within minutes, the Bright Field veered toward the Mississippi riverbank and crashed into the New Orleans Riverwalk. Dozens of passengers on neighboring entertainment vessels and on the Riverwalk itself were injured in their attempts to escape, but remarkably, no one was killed. The impact destroyed many riverside facilities, and the incident points to incomplete risk management by riverfront stakeholders as well as by Bright Field operators. Comprehensive risk assessments are cornerstones of any mission, and this case emphasizes the importance of formulating plans to mitigate high-consequence scenarios.

Descent into the Void

Soyuz-11 Depressurization

September 09, 2010

When a crew of three cosmonauts concluded a pioneering 24-day mission aboard Earth’s first space station, an entire nation waited to welcome them. But joy turned to grief after recovery teams opened the descent capsule and discovered all three cosmonauts dead in their seats. A valve meant to equalize cabin pressure with Earth’s atmosphere moments prior to landing had instead been forced open prematurely, while the capsule was still in the vacuum of space, when the descent module separated explosively from the rest of Soyuz-11, suffocating the crew. The events leading to the depressurization show how unplanned mechanical shock led to the single-point failure of a critical assembly, and how complex systems can defeat attempts to ensure comprehensive human understanding of a project’s design from concept development to operation.

Hit the Bricks

STS 124 Flame Trench Mishap

August 02, 2010

On May 31, 2008, the Space Shuttle Discovery launched from Kennedy Space Center's Pad 39A. Its mission was to deliver "Kibo" (or Hope, the centerpiece of the Japanese Experiment Module) to the International Space Station. After the launch, NASA Safing Teams set out to inspect the launch facility and were surprised to find the entire area littered with debris. Discovery's liftoff had produced dynamic loads strong enough to tear thousands of bricks from their anchors in the flame trench wall. The flying bricks, shown by radar to travel at speeds up to 680 miles per hour, damaged the opposite wall and a nearby security fence. Some bricks traveled distances exceeding 1,800 feet. Although the debris did not impact the Space Shuttle or compromise its mission, damage to the flame trench was estimated at $2.5 million. An aging infrastructure, an incomplete maintenance plan and oversights in the transition from Apollo to Space Shuttle conspired to weaken the structure of the flame trench until finally, it failed. This incident teaches us that decades-long vigilance over systems and infrastructure is crucial to identifying and rectifying hazards before they become mishaps.

Deadly Efficiency

American Airlines Flight 191

July 11, 2010

When American Airlines Flight 191 began its takeoff run at the start of Memorial Day weekend 1979, everything seemed normal. As it had done on so many previous flights, the McDonnell Douglas DC-10 roared down Chicago O’Hare’s Runway 32, bound for Los Angeles. This time, however, just as the plane became airborne, disaster struck. The left wing-mounted engine tore away from the aircraft and hurtled to the ground, rupturing hydraulic and electrical lines in the process. For a few brief seconds, the aircraft seemed to climb normally despite the damage, but then, to the horror of hundreds of onlookers, the plane entered an uncontrollable roll to the left. Seconds later, with wings perpendicular to the horizon, the aircraft plummeted into a field less than a mile from the runway. This tragedy occurred because a change to the manufacturer’s maintenance procedure, made to improve cost-effectiveness, damaged the engine pylon structure, allowing a design choice elsewhere to leave the aircraft uncontrollable in this unlikely but real scenario. Two hundred seventy-three people lost their lives that day; their memory has been honored through improved maintenance standards, exhaustive design processes and strong communication across the industry.

Tragic Tangle

Soyuz 1

June 01, 2010

Facing extreme political pressure to regain dominance in the Space Race, the Soviet Union launched the Soyuz-1 spacecraft in April of 1967 as the initial phase of an elaborate spacewalk demonstration. Tragically, rather than bringing prominence to the Soviet space program, the mission became a sequence of failures. Manufacturing flaws doomed the vehicle from the outset, and Soyuz-1 became the first in-flight fatality of space exploration. The circumstances from which this mission originated — where schedule pressures loomed so large that mission success superseded crew safety — provide many lessons applicable years after the Space Race. Although times and environments have changed since the Soyuz-1 incident, the constants of external pressure, uncertainty and risk live on. A historical perspective of Soyuz-1 shows that our assumptions when managing risk can mean the difference between mission success and failure.

Head-On Collision

Large Hadron Collider

May 02, 2010

Geneva, Switzerland, is home to the Large Hadron Collider (LHC), the world’s most powerful particle accelerator. More than $10 billion was spent on its design and construction in the hope that data from LHC experiments would forge new pathways in our understanding of physics. On March 27, 2007, 12 years after the project was approved for construction, scientists put the LHC through the final stages of pressure testing and encountered a serious failure: a support structure tore loose and lifted one of the 35-ton magnets from its base, spewing helium gas into the LHC tunnel. Investigators found that this costly mishap was due to a mere calculation error — a consequence of disparate development approaches, unfocused training programs and delayed performance specifications. This incident highlights the fact that standards in the design process and well-conducted, documented reviews are critical to an organization’s success.

Mission to Mars

Mars Observer

April 01, 2010

With the successful launch of Mars Observer, NASA once again set its sights on the Red Planet, a decade and a half after the Viking program. Mars Observer, designed to study the geosciences and climate of Mars, was the first mission of the Planetary Observer series, envisioned as a line of low-cost missions to the inner solar system. Unfortunately, it also turned out to be the last. After an eleven-month journey through space, NASA lost contact with the spacecraft only three days before the scheduled orbital insertion. With no physical or telemetry evidence to investigate, the Mission Failure Investigation Board faced a significant challenge. What it eventually learned were valuable lessons in the importance of testing, the potential consequences of making tradeoff decisions and the absolute necessity of a functional risk management process.

Island Fever

Three Mile Island

March 07, 2010

On March 28, 1979, a nuclear power plant reacted in a manner that was incomprehensible to plant operators. Widely considered to be the most significant nuclear power plant accident in the United States, the Three Mile Island accident began as a simple mechanical failure: a failed water pump. How could such a seemingly insignificant event lead to a near meltdown? After a number of state and federal investigations, the answer became clear: the simple mechanical failure was only the first in a series of failures, made worse by poor decisions based on misinformation. Our focus this month is that the possibility of this accident should not have come as a surprise. As you read, note the similarities between the complex systems of a nuclear power plant and the tightly coupled systems we use at NASA.

Wire to Wire

Swissair 111 Crash

February 07, 2010

Sept. 2, 1998 — On a seemingly normal trans-oceanic flight from New York to Geneva, the cockpit crew of Swissair 111 smelled smoke. Over the course of just 21 minutes, an inaccessible onboard fire intensified, causing unrecoverable system failures and ultimately sending the plane into the Atlantic Ocean just south of Nova Scotia. All 229 people on board lost their lives. Five years later, Canada's Transportation Safety Board found that engineering defenses of materials selection, cabin design and wiring placement had lacked sufficient testing. Further, administrative defenses such as government oversight of standards and cockpit procedures had not prevented fire hazards from appearing in a location thought to pose minimal fire risk. Safety-by-design faces challenges when new subsystems with new functions are added later (in this case, a complex inflight entertainment system) and adherence to safety requirements must be entrusted to a far-flung network of people. Today, NASA faces real challenges with respect to gathering and implementing human rating requirements for new space hardware systems built in-house and by commercial vendors. This story just hints at the problems to be faced.

Down to the Wire

Freedom Star SRB Recovery

January 01, 2010

During a Solid Rocket Booster recovery mission, a mishap occurred on the retrieval ship MV Freedom Star, after a tow wire jumped out of the tow chute. Although a crew member was seriously injured, the outcome of this incident could have been much worse. As you will see, this case study highlights the importance of identifying and adhering to safety controls, and also looks at the unintended consequences of failing to manage change within a high-energy system. As is the case in most accident investigations, it was fairly easy to determine how this incident happened. As an organization, though, we’re more interested in taking a look at the conditions that were present to allow the incident to happen. As you read, notice how a thorough safety analysis and hazard identification may have changed this outcome. Additionally, look at the effects of taking on multiple monitoring, operational and decision-making tasks, and how cumulative fatigue — something we all may suffer from, from time to time — may degrade human performance.

Fire From Ice

Valero Refinery Fire

December 07, 2009

"If you don’t use it for a year, toss it out." Forgotten items stored in closets, basements, attics can be easily recycled over time. But what if you don’t realize that a dormant system in the house itself hides a safety threat? Valero’s McKee Refinery Propane Deasphalting (PDA) Unit posed a hazard that no one saw coming. A control station in the PDA unit was shut down in 1992, and rather than remove or purge the idle subsection, the refinery simply closed the valves around the section and left it in place for 15 years. Cold February 2007 temperatures froze trapped water in the unused propane piping, cracking it. Leaking propane quickly ignited and set the refinery ablaze. Responders battled the intense fire but had to evacuate when flames engulfed local propane shut-off valves, defeating efforts to isolate the fire. Aging NASA facilities bear silent witness to past science and engineering achievements spanning two centuries; some infrastructure awaits renewal, some awaits demolition. Conversely, new designs for ambitious missions far from Earth must be able to withstand and adapt to unplanned events. In both cases, the capacity to detect and correct for emerging threats will spell the difference between loss or sustainment, failure or achievement. Read this month’s Case Study and consider latent sources of violent energy release, and our control over them.

The Tour Not Taken

Comet Nucleus Tour (CONTOUR)

November 01, 2009

No one knows for certain why the signal from the Comet Nucleus Tour (CONTOUR) spacecraft disappeared on Aug. 15, 2002. CONTOUR was a Discovery Program mission that was left almost entirely in the hands of NASA contractors and academic partners. It was a mission lost because of poor communication, misguided assumptions and models that were not faithful to reality. The project team had no experience with the Solid Rocket Motor (SRM) they used, and they relied on the SRM manufacturer and a consultant for expertise. The team also placed great weight on the strong success record of the particular model they used, but previous designs did not match CONTOUR's specifications. In this study of CONTOUR, look for ways to assess whether your project is leaning on insufficient supports or depending too heavily on past performance.

A Half-Inch to Failure

Minneapolis Bridge Collapse

September 01, 2009

When a major highway bridge in Minneapolis, Minnesota, fell on Aug. 1, 2007, the nation was shocked. Speculation about the cause of the collapse was widespread in the following days. Was it metal fatigue? Corroded steel roller bearings? A terrorist attack? Fifteen months after the disaster, the National Transportation Safety Board released an extensive investigation report. The proximate cause (gusset plate failure) surprised engineers and a public familiar with other bridge failure modes. The riveted metal plates connecting structural beams were assumed to be stronger than the beams themselves, but were in fact too thin to withstand years of ever-increasing loads. NASA may not build many bridges, but the I-35 story teaches us that hidden hazards can lurk within aging structures.

Lost in Translation

The Mars Climate Orbiter Mishap

August 01, 2009

More than any other mishap we have studied recently, the loss of the Mars Climate Orbiter highlights the need for comprehensive verification and validation. The Mars Climate Orbiter team did not ensure that software matched requirements. Because of this oversight, the team used software that reported the spacecraft's trajectory in English units instead of metric units, a discrepancy that should have been caught by rigorous verification and validation. This problem was compounded by miscommunication, invalid assumptions and rushed decisions. On its journey to Mars, the spacecraft drifted away from the flight path its navigators were following. When the Mars Climate Orbiter reached its destination, it entered the Martian atmosphere well below its intended altitude and disappeared. As we review the Mars Climate Orbiter this month, consider the progress we have made since this 1998 mission failure, but also look for parallel situations in the programs and projects you are working on today.

Triple Threat

Honeywell Chemical Releases

June 01, 2009

They say bad luck comes in threes. During the summer of 2003, that adage appeared to be true for one chemical plant in Baton Rouge, Louisiana. The plant, which makes refrigerant, inadvertently released three hazardous chemicals in one month. The incidents injured eight people, killed one person and exposed the surrounding community to chlorine gas. Despite the adage, more than bad luck produced this difficult month at the plant; a web of organizational issues contributed to the incidents. Although we do not manufacture refrigerants at NASA, we look to this story this month to learn about findings that are applicable to any pressure system containing hazardous liquids or gases. These incidents remind us that one unforeseen event can cascade into multiple unintended consequences. A key to combating this situation is thorough planning, which looks at process steps in a context of system knowledge. Carefully conducted hazard analyses, training employees for non-routine situations and respect for written operating procedures are all lessons we can learn from these chemical release incidents.

Shuttle Software Anomaly


May 01, 2009

This month we're looking at a recent close call that you're probably not familiar with unless you're part of the shuttle program: a software anomaly that surfaced on Endeavour's mission last November (STS-126, 11/2008). The anomaly did not endanger the mission — or the astronauts aboard — but it caught my attention because the software assurance process for the shuttle is so rigorous that we almost never experience software problems in flight. The root of the problem we're looking at lies in the evolution of software development conventions and practices. The way software programmers code has evolved over the last twenty years, and a recent software change caused old code that depended on old conventions to fail. This incident points to the dangers of using heritage resources and highlights the key activities that must accompany any modifications to heritage hardware or software. Thorough verification and validation, well-developed processes backed by careful training and obsessive anomaly investigation will help us successfully continue to use the resources we have developed over the last fifty years.

Red Light

Train Collision at Ladbroke Grove

April 01, 2009

With the development of machines and automation to manage nearly everything in our lives, reliance on human initiative and decision-making is quickly becoming a thing of the past. However, as a result of this change, there is an ever-growing need for humans to interface effectively with machines. This month's mishap addresses the importance of prudent consideration in the design of the human-machine interface. During morning rush hour in London on Oct. 5, 1999, a commuter train passed a red signal into the path of an oncoming high-speed train at Ladbroke Grove Junction, killing 31 people and injuring many others. The mishap investigation pointed to several problems related to how the driver and signalers interfaced with the equipment and displays around them. At NASA, when we rely on human action, we must be careful to design for human capability and limitations. We must design systems that consider human expectations and logic. To ensure success, we must supplement these designs with effective training and sufficient experience to enhance the likelihood that the proper actions are taken.

Cover Blown

WIRE Spacecraft Mishap

February 01, 2009

Times of transition often carry additional risk. Spacecraft launch, mission phase transition, system startup — these can be tense moments for NASA projects. This month's mishap illustrates the importance of considering every sequence of mission activities during the design and review process. Just moments after NASA's Wide-Field Infrared Explorer (WIRE) powered on to begin its infrared survey of the sky, a transient signal from one of its components compromised the mission. The mishap investigation concluded that the team could have anticipated the signal if the review had thoroughly considered the start-up characteristics of WIRE's components. Instead, the design did not account for the components' variable start-up times or their dependence on how long the components had been powered off. The problem was compounded by a low-fidelity test set-up that led the team to dismiss anomalies during start-up. Testing and design focused on the mission objective but neglected WIRE's crucial transition from powered off to fully operational. At NASA, we need to focus closely on these moments of transition.

The Unknown Known

USAF B-2 Mishap

January 01, 2009

In NASA's climate of daring enterprise and unparalleled innovation, our efforts can be foiled by the challenge of simple communication. In this month's case study, the Department of Defense lost a $1.4 billion aircraft because one maintenance technician working on the aircraft was not aware of a workaround developed in the field. The technique was only informally communicated with local personnel and never incorporated into standard procedures. The Air Force investigation concluded that if personnel had had a better understanding of how critical specific systems were to the overall performance of the aircraft, they would have insisted on formally communicating the technique. The only people who truly understood the system interfaces were former B-2 engineers who had designed the aircraft ten years prior to the mishap. The operating organization lacked profound systems knowledge when a new environment required it. At NASA, we must continue to improve our strategies for capturing and transferring knowledge from personnel before they retire or move on to a new project. We must aggressively document changes and workarounds developed in the field. Lastly, we must strive to develop a broader understanding of the systems and programs we work with so we can recognize — and share — critical information. Even brilliant engineering and design will fail without effective communication in the face of change.

Under Pressure

Sonat Explosion

December 01, 2008

In any team with years of experience on long-term projects, complacency can slowly undermine critical task execution. This month's case study illustrates how group acceptance of a system that lacked design documents precluded hazard identification and elimination. Further, informal operational processes transformed hidden design flaws into deadly high-energy conditions. A team comfortable with work processes that rely on tribal knowledge and verbal instruction will foster errors of omission and commission. Our objective is to encourage use of every opportunity to identify risks, hazards and other elements that could impact safety and quality. We need to develop designs and processes that meet specifications and mission requirements without compromise in safety. Hazards must be relentlessly evaluated and measured not only for their likelihood, but also for their impact on the mission, should exposure to that hazard occur. We must hold each other accountable for effective and continuous communication through our documented processes. The risk of failure increases when we grow comfortable that a mission is routine and expensive controls or compliance can be stripped away because "we have done this before." Rigor in following formal process, including reviews, audits and evaluations that identify hazards, is a proven individual and team behavior that increases a system's margin of safety.

The Million Mile Rescue


November 01, 2008

The Solar and Heliospheric Observatory (SOHO) spacecraft completed its primary mission in 1997 and was such a success that its mission has been extended multiple times, currently through 2009. But in its first extension in 1998, we almost lost SOHO due to errors in code modifications that were intended to prolong the lifetime of its attitude control gyroscopes. The command sequence to deactivate a gyroscope did not contain the code to reactivate it. Due to a prioritization of science tasking, an aggressive schedule and limited staffing, these code modifications were not thoroughly tested before implementation. When SOHO started experiencing complications, standard operating procedures were circumvented in order to return SOHO to operational status as quickly as possible. The results were the failure to detect that one of the gyros was inactive, the progressive destabilization of attitude and the complete loss of communications with SOHO. It took three months of labor-intensive collaboration with the European Space Agency to miraculously recover SOHO. This month's case study shows us how ignored review processes and circumvented operating procedures can severely jeopardize mission success.

That Sinking Feeling

Loss of Petrobras P-36

October 01, 2008

Budget cuts and downsizing are a reality in every industry, but it is critical to maintain the integrity of safe operations through these times, especially in human spaceflight. While cost-cutting can drive many innovative solutions, a blind focus on financial performance can compromise safety considerations, forfeit thorough testing and result in poor decisions with catastrophic outcomes. Petrobras management had taken a clear stance to eliminate many standard engineering practices, redefine safety requirements and reduce inspections, openly with the goal of improving financial performance. As part of efforts to save cost and space on its P-36 platform, the Emergency Drain Tanks (EDTs) were placed inside the two aft support columns adjacent to the seawater service pipes. When one of the EDTs burst from over-pressurization, it set off a violent chain of events, including the rupture of the seawater pipe, massive flooding of the column and an explosion on the upper levels, ultimately resulting in the total loss of the world's largest offshore oil production platform and the lives of 11 crew members. The investigation found that no design testing or analysis had ever been performed on the EDT configuration. At NASA, we must apply a consistently high level of rigor to all of our testing plans and treat even common reconfigurations the same as we would a new design.

Fender Bender

DART’s Automated Collision

September 01, 2008

In April of 2005, multiple errors in the navigation software code, overlooked during rushed testing phases, caused the Demonstration of Autonomous Rendezvous Technology (DART) spacecraft to crash into the target satellite with which it was attempting to rendezvous. The spacecraft was operating entirely on pre-programmed software with no real-time human intervention. The same navigational errors resulted in the premature expenditure of fuel and an early end to the mission, without the $110 million program having completed any of its close-range technical objectives. The DART team did not adequately validate flight-critical software requirements, including late changes to the code that proved critical in this mishap. The program used an inappropriate heritage software platform and a team that lacked the levels of experience needed to operate with such little oversight. As NASA continues to push the envelope with new cutting-edge autonomous technologies, we must keep in mind the basic principles that make any technology program successful. Validation, verification and peer review cannot be sacrificed for schedule, and we must fully utilize our past experiences and expertise on current and new projects. We must be careful to ensure that we are not simply automating failure.

No Left Turns

United Airlines Flight 232 Crash

August 01, 2008

Detailed inspection throughout the lifetime of a safety-critical part is absolutely essential. The tail-mounted engine on the DC-10 aircraft for United Airlines Flight 232 had left the manufacturing foundry with undetected microscopic defects. However, when establishing the safe operational lifetime, it was assumed that all parts were defect-free. After 15 years of operation, numerous inspection teams failed to detect the growth of cracks from these defects, and the initial defect-free assumptions were never re-evaluated. On July 19, 1989, that engine exploded well before the end of its set operational lifetime, severing all three hydraulic fluid lines. The pilots of Flight 232 had never trained for a complete loss of hydraulic controls, nor were there any operating procedures for handling such a scenario. Still, because they thoroughly understood the DC-10 system, they were able to regain just enough control to crash-land the plane using only the remaining engine throttles. While it is impossible to predict and then train for every conceivable situation, even some known scenarios are so complex and dependent on other variables that official documented procedures can be ineffective. Therefore, it is critical that NASA operators have a thorough understanding of our systems and operations, so that they are able to successfully navigate situations for which they were not explicitly trained.

Tunnel of Terror

"The Big Dig" Ceiling Tile Collapse

June 01, 2008

Understanding the limitations and failure modes of the materials that we use is critical in maintaining a safe operating environment. The improper selection of one key component for "The Big Dig" ceiling tiles resulted in 24,000 lbs of concrete crashing down on a passing car below, killing one of the passengers on July 10, 2006. The post-accident inspection was the first in the seven years since the initial installation inspection, and the same epoxy adhesive that had failed on July 10 was found to be in the process of failing on thousands of anchor bolts used to secure other suspended concrete ceiling panels. The particular epoxy chosen was simply too weak for the tunnel application. Reduced margins, forfeited review cycles, and ignored warnings all contributed to this disaster. Given the extreme environments and new technologies employed during NASA space operations, we must ensure that all critical materials are identified and thoroughly understood through robust testing, hazard analysis, and clear documentation.

Two Rods Don't Make a Right

Hyatt Regency Walkway Collapse

May 01, 2008

NASA projects customarily include contractors, subcontractors, third-party vendors, as well as international partners and other government agencies. Delegation of roles, tasks and authorities among team members is common and desired in these teaming arrangements. However, clarity in these delegations, along with a universal understanding of their limits, is crucial in any complex undertaking. This case study addresses a number of key issues associated with managing an engineering project with a high degree of multi-organizational cooperation and dependency. Of the many things that are naturally delegated amongst the parties involved, overall responsibility for safety and performance cannot be one of them.


Northeast Blackout of 2003

March 01, 2008

On Aug. 14, 2003, the United States and Canada experienced the largest electrical power blackout in North American history, a massive outage affecting parts of the northeastern U.S. and eastern Canada. Approximately 40 million people in eight U.S. states (about one-seventh of the population of the U.S.) and 10 million people in the Canadian province of Ontario (about one-third of the population of Canada) were impacted. Financial losses related to the outage were estimated at $4 billion to $10 billion. The shutdown was the result of a monitoring and diagnostic systems failure coupled with communications problems between operations and support staffs, and a lack of systems understanding and planning by utility operators.

Fire in the Cockpit

The Apollo 1 Tragedy

February 01, 2008

A seminal event in the history of human spaceflight occurred on the evening of Jan. 27, 1967, at Kennedy Space Center (KSC) when a fire ignited inside the Apollo 204 spacecraft during ground test activities. The 100 percent oxygen atmosphere, flammable materials and a suspected electrical short created a fire that quickly became an inferno. Virgil Grissom, Edward White II and Roger Chaffee (the prime crewmembers for Apollo mission AS-204 — later designated Apollo 1) perished in the flames before the hatch could be opened.

Forrestal in Flames

Explosions Aboard USS Forrestal

December 01, 2007

On July 29, 1967, a tragic string of events culminated in disaster on the flight deck of the USS Forrestal, resulting in the deaths of 134 sailors. As 27 fully armed combat aircraft sat on deck in preparation for a bombing mission over North Vietnam, a wing-mounted Zuni rocket was inadvertently launched from an F-4 Phantom. The rocket flew across the flight deck and penetrated an externally mounted fuel tank of an A-4 Skyhawk, flooding the deck with hundreds of gallons of jet fuel, which quickly ignited. The fire engulfed the aircraft and spread quickly, fanned by 32-knot winds. One minute and 34 seconds later, one of that same Skyhawk's 1,000-pound bombs "cooked off," with an explosion that sent shrapnel, flame and destruction across the flight deck, wiping out the firefighting crew and wreaking havoc below deck. Over the next hour, eight more 1,000-pound bombs exploded, each taking the lives of another valiant team of sailors fighting the blaze. The ship was able to return to Subic Bay, Philippines, but fires continued below deck for over 24 hours.

Lewis Spins Out of Control

Loss of the Lewis Spacecraft

November 01, 2007

The Lewis Spacecraft Mission was conceived as a demonstration of NASA’s Faster, Better, Cheaper (FBC) paradigm. Lewis was successfully launched on Aug. 23, 1997, from Vandenberg Air Force Base, California, on a Lockheed Martin Launch Vehicle (LMLV-1). Over the next three days, a series of on-orbit failures occurred, including a serious malfunction of the attitude control system (ACS). The ACS issues led to improper vehicle attitude, inability to charge the solar array, discharge of the batteries and loss of command and control. Last contact was on Aug. 26, 1997. The spacecraft re-entered the atmosphere and was destroyed 33 days later. This mission may have been faster and cheaper, but in retrospect, it was at the expense of better.


SL-1 Nuclear Reactor Explosion

September 01, 2007

In the early years of nuclear power development, the first small-scale boiling water reactor exploded catastrophically, claiming the lives of three engineering technicians. This nuclear accident occurred in January of 1961 at the U.S. National Reactor Testing Station near Idaho Falls, Idaho, and remains the only nuclear accident in the United States to result in loss of life. The accident, called a "prompt criticality," resulted from a variety of factors, including inadequate design, inadequate materials testing, and poor procedures and training.

Rocky Mountain Death Trap

The Mann Gulch Fire

July 01, 2007

Fifteen smokejumpers leapt from a C-47 aircraft on a hot, dry August afternoon in 1949 to engage what was believed to be a routine forest fire burning along the south ridge of Mann Gulch, a steep, narrow valley situated directly east of the Missouri River. Over the next 90 minutes, a complex, confusing and heroic struggle ensued as the fire, fanned by high winds and downdrafts, spread in unexpected ways, cutting off firefighters from their planned river escape path and roaring up the gulch with a wall of flame, superheated air and black boiling smoke. In the end, 13 of the firefighters lost their lives. This tragic event dealt a devastating blow to the Smokejumper program and drastically changed the way the Forest Service analyzes hazards and how its firefighters are trained, equipped, led and deployed.

Innovation Pushed Too Far Too Fast

The Destruction of the R101

June 01, 2007

The R101 airship story is one of political leadership spurring investment in new technology while also driving that technology to premature implementation and subsequent disaster. The maiden voyage of the British-built airship R101, in October 1930, ended in a fiery crash that killed 48 people when bad weather forced the massive airship down over Beauvais, France.


Eschede Train Disaster

May 01, 2007

In June 1998, one of Germany's Inter-City Express (ICE) trains slammed into an overpass, killing 101 people. The failure was traced back to a damaged wheel that disintegrated just before the train passed over a switch-track, causing cars to derail and strike the bridge's supports. Further investigation uncovered evidence of misuse of a heritage wheel design, insufficient design verification testing, poor bridge construction and ineffective emergency procedures. As a result of this accident, major engineering changes and safety improvements were implemented on all ICE trains.

Atlas Centaur (AC-67)

Lightning Strike Mishap 1987

March 01, 2007

On March 26, 1987, NASA launched Atlas/Centaur AC-67 from the Eastern Test Range. The vehicle triggered lightning 49 seconds after launch, resulting in guidance, navigation and control (GN&C) failure and structural breakup. Both the launch vehicle and its Navy communications satellite were lost.

Safety Critical Software Control Errors

Radiation Cancer Therapy Machine Mishaps

December 01, 2006

In a 20-month period between June 1985 and January 1987, Therac-25 radiation therapy machines used in both the U.S. and Canada administered massive overdoses of electron beam radiation to at least six cancer patients, with at least three deaths attributed to radiation overdose.

Mishap at an Explosives R&D Laboratory

ATK Thiokol Explosives Lab

October 01, 2006

At 10:45 p.m. on Feb. 14, 2005, an explosion and fire occurred in R&D laboratory building M-590 at the ATK Thiokol Promontory campus. Two laboratory technicians were transferring a chemical compound known as TETNB from a filter tray into 5-gallon plastic buckets. One of the technicians was killed; the other was severely burned.

Ames Arc Jet DC Power Supply Fire

Ames Arc Jet DC Power Supply Fire

September 01, 2006

During arc jet operations, a DC power supply module was discovered burning. Internal components were severely damaged. The event was classified as a Type C mishap ($25,000 to $250,000 in property damage). There were no injuries to personnel.

Davis-Besse Close Call

Davis-Besse Power Plant

September 01, 2006

On Feb. 16, 2002, personnel at the Davis-Besse nuclear plant (Oak Harbor, Ohio) were repairing cracks in the vessel head penetration (VHP) nozzles. While being machined, one of the nozzles, which should have been firmly embedded in the vessel head, tipped over. Further inspection identified a large cavity of 20 to 30 square inches that penetrated completely through the 6.63 inches of carbon steel to the thin stainless steel cladding liner. The liner (0.38 inch thick) was all that was preventing a large loss-of-coolant accident with potentially catastrophic consequences.

Air Force Atlas Mishap

Unintended Mixing of LOX and Hydrocarbons (1975)

August 01, 2006

On April 12, 1975, an Atlas 71F vehicle suffered extensive engine damage from an explosion at liftoff at Vandenberg Air Force Base. Range Safety destroyed the vehicle at 303 seconds.

Submarine Down

USS Thresher Lessons Learned

June 01, 2006

The USS Thresher was launched in 1960. The first ship of her class of nuclear-powered attack submarines and the leading edge of U.S. submarine technology, she was fast, quiet and deep-diving. On April 10, 1963, while engaged in a deep test dive 220 miles east of Cape Cod, Massachusetts, the USS Thresher, SSN 593, was lost at sea, settling at a depth of 8,400 feet with all aboard. All 129 aboard died: 112 Navy officers and enlisted personnel and 17 civilians.

Are We Prepared?

Hurricane Season

May 01, 2006

2005 set the record with 28 named storms; 15 became full hurricanes, and four major hurricanes hit the U.S. Katrina set the record as the costliest, with $80 billion in damage, and the deadliest, with over 1,300 people killed. Yet Katrina was only the sixth-strongest hurricane on record. The 2006 hurricane season begins June 1. The National Oceanic and Atmospheric Administration predicts 2006 activity will remain well above normal.

A Gift

Lessons from the STS-3 Close Call

May 01, 2006

Touchdown was 25 knots too fast (220 knots vs. 195 knots equivalent airspeed). The commander started the nose down prematurely and then, realizing the error, immediately applied aft stick to stop the nose-down pitch. He needed excessive aft stick to stop the nose and was then surprised by a pitch-rate reversal (a pitch gain glitch). A second pitch reversal resulted in a harder-than-desired nose gear slap-down. It was a close call: nothing broken, nobody injured, but STS-3 came close to being two flights.

Fatal Mishap in Pressure System Operation

Pressure System Operation in a Government Lab

April 01, 2006

In early 2006, a pressure system failed during initial use in a government laboratory. One worker was killed, and the accident created significant programmatic disruptions and possible personal legal consequences.

New York Chemical Waste-Mixing Incident

Chemical Safety Board Findings

March 01, 2006

April 25, 2002: An explosion in a mixed-use commercial building in downtown Manhattan injured 36 people: 16 workers, six firefighters and 14 bystanders. Thirty-one of the injured were treated in hospitals, including four in intensive care. The street was closed for two weeks.

And Some Have Said, "Software Isn't Critical"

Ariane 5

February 01, 2006

On June 4, 1996, the European Space Agency (ESA) launched an Ariane 5 rocket from Kourou, French Guiana. The rocket was destroyed 40 seconds after liftoff. According to the report written by the Inquiry Board (published July 19, 1996), the proximate cause of the loss of Ariane 501 was the complete loss of guidance and attitude information 37 seconds after the main engine ignition sequence started (about 30 seconds after liftoff). The launch was the Ariane 5's first, after a decade of development costing over $7 billion. The destroyed rocket and its cargo were valued at $500 million.

Refinery Ablaze

BP Texas City Refinery Fire

January 27, 2006

On March 23, 2005, a BP Texas City Refinery distillation tower experienced an overpressure event that caused a geyser-like release of highly flammable liquids and gases from a blowdown vent stack. Vapor clouds ignited, killing 15 workers and injuring 170 others. The accident also resulted in significant economic losses and was one of the most serious workplace disasters in the past two decades. The total cost of deaths and injuries, damage to refinery equipment and lost production was estimated to be over $2 billion.

A Tale of Two Failures

Delta II 7925 (1997) and Chinese Long March CZ-3B (1996)

December 01, 2005

"It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way — in short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only."
— A Tale of Two Cities, Charles Dickens

Death on the Steppes

Nedelin Rocket Disaster

November 01, 2005

Where: Baikonur Cosmodrome (also known as Tyuratam), Soviet Union
When: Oct. 24, 1960
What Happened: Fuel valves in the second stage of the Soviets' R-16 ICBM prototype were inadvertently opened; hypergolic propellants mixed and burned into the first stage, causing a massive explosion at the launch pad. The number of personnel and visitors near the launch pad exceeded safe limits, given that technicians were performing repairs on a fully fueled rocket. At least 74 people died from the fireball and toxic gases, and approximately 50 more died later from injuries received that day. Marshal of Artillery Mitrofan Nedelin, who was personally in charge of the R-16 program, was at the launch pad and was one of the 74 killed.

Failures, Mishaps and Root Cause Analysis

Hurricane Katrina

October 01, 2005

When looking at failures, such as those that contributed to the significant losses in New Orleans, it is necessary to look beyond the immediately visible, proximate cause.

Steam Locomotive Firebox Explosion

Gettysburg Railroad

September 01, 2005

The Accident: Steam locomotive 1278, with six passenger cars, had completed two excursions and was preparing for its third and final excursion of the day. During a slow climb up a moderate grade, the boiler exploded, seriously burning the engineer and two firemen.

Equilon Refinery Accident

Equilon Refinery Accident, Anacortes, Washington

August 01, 2005

A foreman and operators reviewed the drum temperature sensors and concluded that the drum contents had cooled sufficiently to un-head. The top head of Drum A was removed without incident and without any further indication of the temperature of the coke at the bottom of the drum. Using hydraulic controls, employees lowered the bottom head of the vessel. Within six seconds, 46,000 gallons of coke, still at auto-ignition temperature, spewed out in all directions from the bottom of the drum. The coke ignited, engulfing six observers and standby workers in flames.

MGM Grand Hotel Fire Disaster

A Turning Point for Fire Protection Codes

July 01, 2005

Sparks from a short circuit in a hotel deli started a major fire at the MGM Grand Hotel in Las Vegas at 7:10 a.m. on Nov. 21, 1980. The fire engulfed the world’s largest gambling hall in smoke and flames and was concentrated near the casino on the upper entertainment level. Thick black smoke filled the air ducts and escape stairwells of the 21-story guest tower, causing panic and death. Eighty-five people died and more than 600 were injured, primarily from smoke inhalation.

Bhopal: When Hazard Controls Aren't

Bhopal Gas Leak

June 01, 2005

During the night of Dec. 2 and 3, 1984, a Union Carbide plant in Bhopal, India, leaked 27 tons of a deadly gas called methyl isocyanate (MIC). Not one of the six safety systems designed to contain such a leak was operational, allowing the gas to spread throughout the city of Bhopal. Half a million people were exposed to the gas. About 8,000 died within the first week, and 20,000 have died to date. More than 120,000 people still suffer from ailments caused by the accident and the subsequent pollution of the plant site, including blindness, extreme difficulty breathing and gynecological disorders.

A Set-Up for Failure

USS Iwo Jima

April 01, 2005

October 1990: The amphibious assault ship USS Iwo Jima deployed to the Persian Gulf for Operation Desert Shield and docked at a Bahrain shipyard for emergent repairs. As the ship was leaving port, one hour after the propulsion plant was brought online, the bonnet fasteners of a 4-inch valve supplying steam to the ship's service turbine generator failed catastrophically, and 850-degree superheated steam at 600 psi escaped into the manned compartment. Nine sailors were killed instantly; a tenth was fatally injured.

Listen to the Hardware: Missed Opportunities

DC-10 Crash (1974)

March 01, 2005

DC-10 Certification: The draft Failure Mode and Effects Analysis was modified by Douglas to minimize a design deficiency. A ground test failure in May 1970 was blamed on human error; in retrospect, poor design was downplayed as a root cause. In November 1970, internal memos between Convair and Douglas discussed proposed fixes to the cargo door problem, but none was implemented. The Federal Aviation Administration certified the DC-10 on July 29, 1971, with an unsolved design deficiency. On March 3, 1974, a Turkish Airlines DC-10 crashed in France, killing all 346 people aboard. The cause of the accident was faulty latches on the cargo door, which allowed the cabin's differential pressure at 11,500 feet altitude to force the door to swing open outward, where it was ripped off its hinges by the airstream. After this accident, the entire DC-10 fleet was finally grounded, the cargo door locking system was redesigned, and the problem was eliminated.

A Deadly Mixture


February 27, 2005

On April 26, 1986, two huge explosions blew apart Unit 4 of the Chernobyl Nuclear Power Plant in the Ukrainian SSR. At least 31 workers and emergency personnel were killed immediately or died of radiation sickness soon after the accident. The nearby village of Pripyat, where most Chernobyl plant workers lived, had to be evacuated and sealed; some 200,000 residents of the area were evacuated. Radioactive debris was carried by clouds over most of northern Europe. The long-term effects are still being debated, but increased childhood thyroid cancer in Belarus and Ukraine is tied to the accident. It was the worst nuclear accident in history.

Need for Scenario-Based Accident Modeling

British Airways' Concorde

January 27, 2005

Experience has shown that multiple, unrelated and sometimes benign perturbations have challenged our systems in complex ways we would never have expected. High-consequence scenarios can emerge from the occurrence of multiple unrelated events. Traditional system safety evaluations (e.g., Failure Modes and Effects Analysis) often model the response of the system to a single perturbation (a failure or process deviation), so the accident scenarios predicted by these models tend to be incomplete. From a risk management point of view, relying solely on such analyses may cause relatively unimportant issues to receive excessive attention while other important issues go unidentified.

Save a Life, Pass It On


January 01, 2005

Strokes are a major cause of death in our society today. They are difficult to identify and manifest differently in different people. Early identification of the symptoms of a stroke often makes the difference between life and death.
