The Actors of Resilience

I am going to shift gears a bit and move into a discussion of resilience engineering. I was able to attend the recent symposium of the Resilience Engineering Association (http://www.resilience-engineering-association.org/), of which I am a member. A common thread of my articles is increasing knowledge so that we can increase the ability to respond to unexpected scenarios. Organizations can also exhibit resilient behavior, and can create conditions that enable their “sharp end” personnel to behave in a resilient way as well. So what is resilience?

There continues to be a challenge in balancing compliance and flexibility in order to create an adaptable organization with adaptable components. A large part of that challenge is fully understanding the goal: how a resilient organization acts, or even what resilience is. After digesting a lot of material, a good metaphor occurred to me while watching my 14-year-old perform a leading role in the play “The Lion King”.

In a play the baseline is the story, which is winding towards a known end point (at least known to the players and the director). The story is turned into a script and further broken down into scenes. This is akin to our design of the system, be it the aircraft, a hub operation or any other system we are trying to manage. The aircraft, building, etc., become “the set”. Even a computer fills that role, as it is relatively fixed in how it responds. The script is the set of procedures we follow. The actors work to follow the script, and if all goes right everyone and every component does exactly what we expect. It is “work as imagined”, literally. The director is, of course, the manager(s) or management.

However, in the real world, things do not go right all the time, or even most of the time. A prop or set might not work quite right, or an actor might forget their lines. What do the other actors do? They improvise. They adapt to move the story towards where all the players and the director know it needs to go. They may fill in complete sections or re-route around obstacles, whatever it takes to bring the story to the ending that they all know. This is despite a portion of the set “getting stuck” and remaining on stage, or perhaps not being there when it should be. This is despite another actor completely missing their lines or their cues, or perhaps not even showing up at all!

I think that is an example of the adaptability we want. It requires being ok with things going off script sometimes, as long as we reach the end of the story where things turn out the way they were intended. It also requires actors who are knowledgeable enough and have the skill set to improvise when they need to, pulling the divergence back into the story line of the script. This is true graceful extensibility, described by David Woods as “how a system extends performance, or brings extra adaptive capacity to bear, when surprise events challenge its boundaries”. If done right it is seamless and, literally, graceful.

The way to get there is by creating an atmosphere where people know that they are not judged negatively if something pushes them off script, but are instead judged on how well they keep the story moving towards the end we desire. Nor should we create barriers so rigid that once we break through them we cannot get back. If the control (to use Nancy Leveson’s term) is so coupled into the operation that its loss makes it impossible to control through other means, then we have not done our job. An example might be envelope protection features on an airplane that are so good that pilots no longer have the skill sets to operate without them. Pilots should not be judged in a simulator on how closely they followed the procedures, but rather on how well the entire play was able to reach the proper conclusion.

Unfortunately, this is diametrically opposed to much of the current push in aviation, towards more compliance, specifically “procedural compliance”.  This is setting us up for failure.  We are emphasizing staying on script, not improvising as required to get the story to end where we need it to.

Posted in Safety | Leave a comment

Appearance on AirplaneGeeks.com

I am honored to have been a guest on the 401st podcast for airplanegeeks.com. They are a great group of folks who are working hard to get accurate aviation information and news out to the world. I discussed how an aircraft accident is an emergent property that arises out of a complex system, which makes accidents generally impossible to predict using traditional methods. This is also related to the work I am doing with Roger Rapoport on our book “Angle of Attack”, still in work.

Complexity is an interesting topic, and not one that many people understand, even those involved in risk analysis or in the business of using data to make predictions. It was somewhat entertaining when I spoke on this topic in Pasadena a couple of weeks ago that one could pick out the physicists and other scientists in the audience, as their faces lit up when I mentioned complexity. It is not a new topic for physicists! An interesting aspect is that the issues that lead to the inability to predict accidents are, at their core, the same reasons that virtually no political pundits were able to predict that Trump would win the Republican nomination. People laughed but agreed when I pointed out that Trump is an emergent property of a complex system!

Coming back to the podcast, here it is and I encourage a listen!  Captain Shem Malmquist on the airplanegeeks.com podcast.

Are pilots going to be eliminated?

We have become accustomed to the idea that we can do more with less, and flying is no exception. Improvements in technology first allowed airplanes to fly without navigators, as long range radio navigation, then inertial systems and most recently GPS, coupled with simple computers, simplified the task. The navigator’s job consisted of taking measurements off the stars and the sun, combined with basic calculations of the projected aircraft path considering speed, wind drift and the like. All of this was easily solved with the advent of calculators that could do trigonometry and systems that could determine the current position, and it was thus a simple matter to combine the two. While this removes humans somewhat from the process, the pilots were trained well enough in the principles of navigation that they could see if the calculations were not as they should be, much like a person using a calculator should be able to discern if the answer is far off the mark.

Independently, improvements in basic electronic systems and the simplification of system design allowed for the elimination of the flight engineer. While many people may view this as “automation”, most of the jet transport designs currently operational do not do much in terms of having computers run the systems, the only notable exception being the McDonnell-Douglas (now Boeing) MD-11. Improvements have been made in alerting systems to notify pilots of problems, although most of these systems are not particularly “smart”: they list problems in the temporal order in which they are triggered, with the exception of the MD-11, which does (to an extent) rank the most critical items at the top of the list. The systems have not so much been automated as improved, such that the engineering design is so simple that they require almost no human action most of the time.
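The difference between the two alerting styles can be shown with a small sketch. This is only an illustration and not how any actual aircraft alerting system is implemented; the alert names, the `severity` scale and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    message: str         # hypothetical alert text, not from any real aircraft
    triggered_at: float  # seconds since the start of the flight (assumed unit)
    severity: int        # assumed scale: 3 = warning, 2 = caution, 1 = advisory


def temporal_order(alerts):
    """List alerts in the order in which they were triggered."""
    return sorted(alerts, key=lambda a: a.triggered_at)


def ranked_order(alerts):
    """List the most severe alerts first; ties keep their trigger order."""
    return sorted(alerts, key=lambda a: (-a.severity, a.triggered_at))


alerts = [
    Alert("GEN 1 OFF", 10.0, 1),
    Alert("HYD 2 OVERHEAT", 12.0, 2),
    Alert("CARGO FIRE", 15.0, 3),
]

print([a.message for a in temporal_order(alerts)])
print([a.message for a in ranked_order(alerts)])
```

In the temporal list the most serious item arrives last and sits at the bottom; in the ranked list it moves to the top, which is the behavior described for the MD-11.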

The MD-11 does include features that the other airliners do not, such as automatically reconfiguring systems to work around inoperative components or isolate problems, but it will still reach a point where it defers to the human operator to make a decision. An example is that it will shut down one hydraulic system due to overheat, but not two. The designers considered that the decision to take a system beyond a certain point is too dynamic and dependent on circumstances. What one might need to do while mid-ocean is far different than near an airport, for example. Still, as most routine systems are automated, the pilot is put into a monitoring mode. If the system takes an action, it is designed to notify the flight crew of that action. In most cases the pilot does not have to take additional action but need only consider the effect of the inoperative components on flight planning. However, like the person with the calculator, it is very important that the pilot has a clear understanding of how the system works and how it should be operated. The automation is not doing anything that the pilot would not be required to do absent the automation, but the reliability of the system could lead a person to not do the mental work needed to understand the system.

Further, as the automation is able to remove the workload of operating a complex system, there was less incentive for the engineers to simplify the system, as was done on some of the other advanced designs such as Airbus or Boeing models. The MD-11 systems are thus relatively more complex. This allows the system to do beneficial things that the others do not. For example, the MD-11 will sense fuel temperature, and if it reaches a certain point it will move fuel from warmer tanks to colder tanks in order to prevent the fuel getting so cold that it will no longer flow smoothly. In the B-777, if the fuel temperature does get dangerously cold, the pilots must use a combination of flying faster (increasing the temperature by friction) or descending to a lower (hopefully warmer) altitude. However, the other side of this is that the complexity can make the system more difficult to understand. There have been cases where the MD-11 system was properly reconfiguring fuel pumps and valves to keep the fuel in close balance and the pilot, not understanding the nuances and thinking that it was doing the wrong thing, turned the automatic controllers off, resulting in an engine failing due to lack of fuel. The system was smarter than the pilot, and we are back to the person with the calculator who does not know the process well enough to know if the answer is correct or wildly off.

The autopilot systems have improved somewhat, but are still not much more complex than the cruise control in a car. The system essentially looks at what the pilot has commanded it to do and then adjusts the controls to put the airplane there. The main difference is that the pilot can command it to follow an electronic signal from the ground or a path from the navigation system. This functionality is not new, and while new autopilots do a better job, they are not much different from the systems available in the 1960s and 70s. The addition of autoland in the 1970s was also relatively simple, in terms of just including the radio altitude in the mix so that the autopilot could then adjust the controls to maintain a programmed trajectory. Obviously pilots will monitor this very closely, as any failure or bad signal can put the airplane in a dangerous state. The autopilot is not smart enough to discern that things “do not look right”, although some of the more advanced systems do disconnect when certain parameters are exceeded, leaving it to the pilot to “save” the airplane.

A pilot can also program the entire route, from departure to the approach to landing, prior to taking off. Still, it would not be unlike the ability to program your automobile’s cruise control to drive certain speeds on various portions of your route; it is just that the “cruise control” would also have control over your steering. It is really just following a programmed script. The system contains a database of points, or a new point can be created via its latitude and longitude, and these are just entered in the order the pilot wants the system to follow. Altitudes and airspeeds that the pilot wants the system to follow can also be added to each point in the “flight plan”. Despite the public impression, these systems rely heavily on human input and monitoring, just as a programmable cruise control in your car would. The system cannot anticipate hazards, for example, so a traffic jam, icy road or similar problem would require human intervention.
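To show how little “intelligence” is involved, here is a minimal sketch of such a flight plan as a data structure. It is an illustration only, not the format of any real flight management system; the waypoint names, coordinates and field names are all invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Waypoint:
    name: str
    lat: float                         # degrees, north positive
    lon: float                         # degrees, east positive
    altitude_ft: Optional[int] = None  # optional altitude the pilot attaches
    speed_kt: Optional[int] = None     # optional airspeed the pilot attaches


# Hypothetical stored database of named points (made-up names and positions).
DATABASE = {"ALPHA": (35.0, -90.0), "BRAVO": (36.5, -88.0)}


def from_database(name, altitude_ft=None, speed_kt=None):
    """Build a waypoint from a point already stored in the database."""
    lat, lon = DATABASE[name]
    return Waypoint(name, lat, lon, altitude_ft, speed_kt)


# The "flight plan" is just the points in the order the pilot entered them.
flight_plan = [
    from_database("ALPHA", altitude_ft=10000, speed_kt=250),
    Waypoint("PILOT01", 36.0, -89.0),  # new point created from lat/long
    from_database("BRAVO", altitude_ft=35000),
]

for wp in flight_plan:
    print(wp.name, wp.lat, wp.lon, wp.altitude_ft, wp.speed_kt)
```

Nothing in the list knows about weather, traffic or anything else outside it. The system simply steps through the entries in order, which is why the human input and monitoring remain essential.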

There are currently several research projects looking for ways to further improve designs so that aircraft can be flown with one pilot or no pilots at all. The primary incentive might appear to most to be financial, and perhaps it is, but the promoters of these systems argue that it is to improve safety. They argue that the majority of airplane accidents are the result of human error, and therefore by (eventually) eliminating humans flying airplanes we can achieve safety improvements not possible otherwise. It was with this philosophy that Airbus first started designing limits to what pilots could do in their airplanes. They had found that there was little benefit to allowing pilots to overstress the airplane or exceed certain bank angles or pitch attitudes, and that pilots had not needed to exceed these conditions to prevent a problem but had only done so inadvertently. Thus, the flight controls were designed to not allow a pilot to do so. Boeing had not initially agreed, but the evidence was overwhelming, and so the newer Boeing aircraft, while not entirely preventing such excursions, do make them much more difficult. To be fair, pilots can take measures to exceed the limits on the Airbus designs as well.

None of this is particularly an issue. Designing a system so that one cannot do the wrong thing is far different from designing the decision making out of the system. A simple example is an automobile lock that is designed so that the car door will not lock a person out of the car, or the system that prevents the car being started unless the brake is applied, or even the system that rings a chime if the lights are left on or a seatbelt is not fastened. All of these do prevent errors, but do any of them change the ability of the driver to make a decision? Arguably, they do not. The same is true for the current systems in airplanes, so there is some merit to the concept that better system design can eliminate many types of errors.

The aviation industry has been quite good at this, redesigning airplanes and systems to the point that most of the simple errors can be eliminated. Those errors were often caused by a momentary distraction, an attempt to rush through a procedure or poorly written guidance. Through identifying and then eliminating these possibilities we have created a system where the chance of having an accident is now extremely low. The fatal accident rate has steadily been reduced, but in the past few years it appears to have reached a plateau. As system and equipment design has improved, the failure rates have dropped, and that, the argument goes, has left just one primary cause of fatal accidents: human error.

The problem with this position is quite simply that it is wrong. As pointed out by leading researchers such as Sidney Dekker, Erik Hollnagel and others, it is not that humans are making errors, but that the remaining gaps are so dependent on human resilience to prevent accidents that when humans are not able to do so we view it as an “error”. Think of it again in terms of the automobile. Anti-lock brakes have certainly saved lives, as have improved signage on roadways, better designed highways, grooved pavement for high speed, banks on curves, better traffic signals, automobile designs that eliminate blind spots and improve visibility to other drivers, and a myriad other things, but safe driving still depends on human skill, particularly awareness. We have used technology to eliminate the simpler problems, but the larger ones remain. As British psychologist Lisanne Bainbridge pointed out, “the designer who tries to eliminate the operator still leaves the operator to do the tasks which the designer cannot think how to automate”.

Now imagine we design systems that eliminate crashes and the like. Take for example Google’s self-driving car. Certainly it can drive automatically; motion detectors and a very good navigation system can be supplemented with updates of position as it passes roadways. It can avoid obstacles and the like, so what might create a problem for the current system? Have you ever driven down a road and seen something that was only potentially a problem? Imagine looking down the road and seeing activity that you recognize as two street gangs facing off against each other. The road is otherwise clear, and to the Google car it is just a normal situation. It does not notice the tell-tale signs of a street gang or perhaps an angry mob, so it will carry you right into the middle of it. These are the types of issues that are difficult to solve with current technology. These are the types of situations that humans are still far better at handling, and an airplane has many more of them than a car does.

In an airplane the set of issues that are difficult to program is larger. Take for example the smell of smoke. In the car there is no need to program anything; the occupant pushes the emergency stop and you’re done. In the airplane it is not so simple. First, the system would need some way to detect the smoke, and this too is not so simple. Back in the car, if you smell something a bit odd you can go on driving, waiting to see if it develops into real smoke. In the airplane the first indication might be just a subtle change in odor. As there are large amounts of air moving through various systems, odors change all the time. However, a review of real events has shown that waiting until it was clearly smoke is sometimes too late to prevent catastrophe. Then there is the issue of what the options are. Flying over the North Atlantic at night, is it better to ditch or try to make it to the nearest airport? Is it better to depressurize the airplane and slow the burn rate of the fire (depriving it of oxygen) to be able to survive the flying time to the nearest airport, or is it better to dive to a lower altitude so passengers do not run out of oxygen? Can a computer make such a decision better than a human?

Other things that are difficult to program include subtle aspects such as a very small change in sound or vibration. Is that a serious change or not? It takes a lot of experience to be able to discern the difference. Then there are subjects, such as meteorology, that the engineers who design these systems would need to understand better.

All of this ignores the issue of power. If something interrupts the electrical power of the airplane and its components, how is the computer flying the airplane to be protected? A good example is a fire that is burning through power systems. So until computers reach human level and the power issue is resolved, the concept of replacing human pilots in airplanes with computers is not realistic. Once we reach human level artificial intelligence, there is another set of issues which likely makes most of this entire topic moot, as will be discussed at the end of this article.

So what about the idea then of leaving a single human in the cockpit to capture these issues?

Humans are subject to a number of cognitive limitations, and historically it was considered that having more than one pilot improved safety because the second pilot could catch the first pilot’s errors. True, there are plenty of aircraft that do operate with a single pilot, including everything from military aircraft to very sophisticated private aircraft. Smaller charter type airlines also operate this way in many cases, so clearly the workload issues could be worked out, at least during routine flights. However, the accident rates for these categories are much higher than would be accepted for mass transportation. Partly this is due to people making mistakes, misperceiving things and the like, but as we create systems to capture these we still see accidents. Why is that?

It is the general view that humans are error prone and that led to the idea of removing humans entirely out of the equation in the first place. The generally accepted reason adding a second pilot improves safety is that a second person will notice these errors and speak up about them.  Indeed, that is the basis for the crew resource management (CRM) training that was instituted starting sometime in the 1980s.  The concept was that by training pilots to speak up when they noticed a problem we could solve many issues, and underlying this is the assumption that in many cases the second pilot did notice an issue but did not point it out for various reasons.  These could include fear, power-distance, not wanting to “make waves” or even anger.   There were perhaps some actual cases of this and so CRM training became “the fix” to solve this problem.  Simultaneously, equipment became more reliable, the design of procedures became better and systems were installed that would warn pilots of dangerous conditions, such as approaching terrain, windshear, too steep of a bank angle, approaching a runway on the ground, a collision risk and so on.  In addition, ground based systems were installed so that air traffic controllers would also get alarms and be able to shout a warning for many dangerous situations.  And, accident rates did go down.

So, was it the CRM that led to this improvement? The truth is that the evidence to support such a conclusion is weak at best. That is not to say that CRM is a bad thing, but rather that it may not have made as much impact as its proponents desired. The same can be said of other programs. One example was a program that essentially centered around the concept that the more precise a person attempts to be in all areas of their life, the “smaller the target” will be and the less likely they will be to deviate from what they intended to do. Certainly not a harmful concept, but there is no evidence it has any correlation to accident prevention. The value of other areas of emphasis is easier to see, such as improved diet, hydration and, of course, mitigating fatigue. Fatigue can definitely be a problem and people do make more errors when tired, but the entire approach is again based on the assumption that errors are the problem, when the real issue is the loss of resilience as discussed previously.

That said, obviously a second person can help capture errors, but the real value is that two pilots, if they are properly trained to work together, create a shared mental model and then act, essentially, as one mind with multiple senses. That also has a multiplier effect on the experience level and the ability to cope with unusual situations. This does not just double the resilience found in one person, but rather magnifies it. Humans are able to accommodate variability, and two well trained humans working closely together are better than one. More can be even better, as Captain Al Haynes described after surviving the Sioux City accident.

To make this work the system would need to be smart, in that it would anticipate what the human needs. That is something the other pilot is doing: they are not only able to react, but actually anticipate the needs and actions of the other pilot. Computers are rather limited at this currently, “auto-correct” being a case in point. Would we really want a virtual computer “co-pilot” that reacts as “auto-correct” does?

The proponents of the concept counter by stating that they can have a person serving on the ground as a “virtual co-pilot”, ready to assist if needed.  The person on the ground would attend to multiple flights under the premise that only one might have an issue at a time.  This makes one wonder.  Have you attended a virtual meeting?  Even under the best of circumstances there are limitations and subtle cues are missed.  Hand gestures, facial expressions, many other nuances would be lost. In reality, the person on the ground is working as a dispatcher.  Dispatchers are already part of the decision team for safe flights for all major airlines so we are not adding something particularly new.
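The premise that only one flight might have an issue at a time can be examined with a simple calculation. The sketch below assumes that each of the flights a ground pilot is watching has the same chance of needing help at a given moment and that the flights are independent of one another; both are assumptions made for illustration, and the numbers in the example call are hypothetical, not data.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability that at least k of n independent flights need help at the
    same moment, if each does so with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs: 10 flights per ground pilot, 1% chance each at any moment.
n_flights = 10
p_issue = 0.01
print(prob_at_least(2, n_flights, p_issue))
```

The independence assumption is the weak point. A line of thunderstorms or a ground system outage affects many flights at once, and in that case the chance of simultaneous problems would be higher than this simple model suggests.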

So the disadvantage of this scheme would be the loss of the “shared mental model”, because no matter what sort of data connection there is, the second person on the ground would not actually “be there”. They would not be experiencing the sensations, and they would not have “skin in the game”. Even assuming that there was some way to transmit some of those aspects, they would still need continuity. Not being completely immersed in what was happening until there was a problem would not be unlike the situation of the Air France 447 Captain, who did not come up to the cockpit until after the airplane was stalled. He had almost zero chance of sorting out the issues. If we want to fix that, then the person on the ground would need to be virtually “in the cockpit” for the entire flight, which would mean that they could not be virtually in multiple cockpits simultaneously. Of course, once we have done that we have saved nothing.

Finally, there is, of course, the problem of someone who is suicidal or similar.  The Germanwings case highlights that issue, and there is no known psychological testing that would ferret out that sort of issue reliably. In sum, it is clear that the impediments to both pilot-less and single pilot transport aircraft are larger than most realize.

So what about artificial intelligence? There is certainly no question that once computers reach human level cognition they will be able to fly an airplane as well as a pilot can. They would need appropriate sensors to pick up subtle odors, vibrations and sounds, but that is not a difficult problem. Methods could be devised to ensure they are powered, so that issue is also surmountable.

Some estimate that computers of this level could be operational in the next few years. Others believe it will take longer than that, but the overall consensus is that they will be up and running by the end of the century. Is this something pilots should be concerned about?

The answer is “perhaps”, but the real truth is that once human level intelligence is created the world will be so completely changed that who flies airplanes is not likely to be a large concern, even for those who currently make their living at it. The reason is that this level of artificial intelligence, referred to as “artificial general intelligence” (AGI), is very unlikely to be like a Hollywood movie.

Current technology includes a lot of what is considered “artificial intelligence” or AI. This level includes predicting words on a smart phone or tablet, and numerous other applications. Google’s new car is in this regime. These systems are able to “learn” on their own as they “watch” what you do, and so improve their performance. Cool stuff.

It seems to follow that if you create an intelligent enough computer that is learning on its own, at some point it will be roughly equivalent to human level, and as humans tend to anthropomorphize objects, we think of it as being human-like. Indeed, it would mimic many human traits, as it would be logical to design it to speak, etc. However, as much as it might seem to be human, it would not be. Tim Urban has a very good discussion on this topic, and one illustration considers a spider and a guinea pig. He points out that if one were to make a guinea pig as intelligent as a human it might not be so scary, but a spider with human level intelligence is very unlikely to be friendly or a good thing. A computer is not even a biological organism, so it is actually more different still. Tim points out that a spider is not “good” or “evil” as Hollywood likes to portray things; it is neither, it is just different, and so would a computer be. It reacts to things based on programming, but once it can self-learn at that level, its motivations are based only on what its job is set to be. A short example of how this could go wrong is illustrated in this SMBC comic.

Humans are social animals, primates with the social structure of termites. We survive as a result of that social aspect. Termites have evolved to “know” that any individual will sacrifice itself to prevent the destruction of the hive. Altruism and self-sacrifice are not things we attribute to insects, but termites, bees, ants and other social insects will take actions that in humans we would consider altruistic. By being social animals, humans are able to succeed where non-social animals could not. As a result we have evolved to be social, and our “programming” reflects this. It might be described very roughly as follows:

  1. Prevent harm to yourself unless (in order):
    1. Your family is in danger (protect them first);
    2. Your “tribe” is in danger;
    3. Your nation is in danger;
    4. You can prevent harm to another person outside your group.

The last items might or might not occur; many will protect themselves before helping strangers. The programming varies. Very few would not give their life to protect their children, their spouse and immediate family, and we are programmed to keep it in that order. The entire point of all of this is to ensure our genes survive, and it is better to ensure that even a bit of our DNA makes it (we will most likely have relatives in our tribe, nation, etc.). It seems that the stronger the DNA connection (or, in the case of a spouse, the likelihood of ensuring our genes’ survival), the stronger our will to do anything to protect them, even at our own expense. Of course, intrinsic in all of this is our own procreation.

Obviously, a computer only has the traits we have programmed it to have so as much as it might appear to be human, it really is not.

This leads into the larger concern. Computers today process information millions of times faster than humans but still lack human intelligence. This article is not intended to get into the technical aspects of how our brains are structured, but suffice it to say that the structure of our brains allows for ways of processing information and making connections that computers are not yet capable of. Once they are, though, they will be combining that faster processing with those connections. Connect that to the internet and things happen fast.

Consider a computer with this capability and the ability to learn. It starts as a toddler, but an hour later has the ability and knowledge of Einstein, and an hour after that it has the combined knowledge of all the great thinkers. Unlike us, it has constant access to ALL of that knowledge and much faster processing. The problems that take us years to solve, or that appear to have no resolution, are likely to be trivial for it.

Is there any doubt that it could rapidly find ways to make itself even faster and smarter? Does it have the ability to adjust its pathways and improve its structure based on its knowledge? If we give it that ability, or it figures out how to do it on its own, then its intelligence can increase even faster. Above AGI is artificial super intelligence, or ASI. In this realm we are looking at a computer that is not just a little more intelligent than us, but millions of times more. This is a system that might realize there is a way to manipulate matter, time or space. It would not be limited to our perceptions of reality. The trouble is that it would be farther above us than we are above an ant, and we might be irrelevant to it, or just a nuisance.

All of this is such a tremendous game-changer that who is flying our airplanes becomes somewhat of a trivial issue. ASI could lead to solving all the problems of humanity or to the end of humanity. People like Stephen Hawking and Bill Gates are very concerned. Elon Musk is so concerned he states he spends a third of his waking hours thinking about it. This while running several companies! Hopefully it will turn out well. If it solves all of the problems of humanity, then all of us may be able to live just doing what we want without any real need for work. If it goes badly, then none of it will matter either.

The bottom line here is that we might see a push for single pilot or even no-pilot airplanes, but if we do it is based on a fundamental misunderstanding of what the issues are and where the risks lie.  We might automate the basic functions but that would still leave us vulnerable to the real “corner-point” scenarios that lead to actual accidents.  Contrary to popular opinion, most accidents do not follow a simple linear causal chain.  It would be safe most of the time, true, but not as safe as the public demands today.  It might plateau at the safety levels reached in the 1970s or so.  Reaching the higher safety levels now demanded by the public and regulators would require AGI, and once we reach that point the outcome moves in directions that are beyond the ability to reasonably anticipate.

 

Posted in Safety

A probabilistic world

ovals_pink_and_blue1

As pilots we need to make decisions based on reality.  Anything else can lead to a very bad outcome, but what is reality and can we perceive the difference between reality and perception?

We live in a probabilistic world.  This may seem counterintuitive, and it may be true that there is an objective reality somewhere – a deterministic reality where an input leads to a very clear output – but even if that is true (and it is not clear that it is), we cannot perceive it as such, because what we observe is limited by our own senses.

Is that color depicting the oceanic areas on the map at the top of this page blue?  This perception will depend on the individual’s color perception.  A person who is color blind might perceive that area differently than someone who is not, but what the color blind person perceives is the reality to them.

Each person differs in their individual perceptions, and thus, their concept of reality is tainted by deficiencies in their own sight, hearing, sense of touch or equilibrium.  Their perception is further impacted by biases that they might have (as discussed in my previous post here) and other factors such as fatigue.

Human reaction is based on what we believe is most likely to be true given the combination of what we perceive as filtered through the factors described above.  If we are aware of something that might impact our ability to discern objective reality (as the color blind person described above might be) then we will adjust our estimate of the probability that we are correct accordingly.

As pilots we are even more removed from reality than most, as we perceive a world that is far from what we evolved to be able to perceive.  There are many examples of this.  Accelerations distort our reality and we need instruments to be sure which way is “up”.  On landing in a big airplane we will perceive motion based on what our bit of the airplane is doing, which can be quite different than what is happening at the landing gear.

Pilots are particularly dependent on the probability of the situation but must be constantly aware that what they are perceiving might not be reality.  Are our instruments correct?  Is the sense of acceleration due to our pitch attitude or our forward acceleration?  Is that apparent increase in height due to our local changes or is the airplane actually changing height?  We cannot trust anything that we see but must weigh all of it based on conditional probabilities.  Is X true given Y?  Are there other factors that will make it not true?  As pilots we make these determinations through experience, and it would be the rare pilot that would calculate the probability using something like Bayes’ theorem.  Regardless, we constantly need to second-guess our assumptions and reassess as new information comes in.  Is there evidence to support what we believe to be true?  How reliable is that evidence?  Is that actually evidence or just what we want to believe?  These are the types of issues that can lead a pilot to press on into bad weather or with insufficient fuel, just wanting to believe their own construction of reality.  The same is true of a pilot descending into terrain.
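
For readers who want to see the "Is X true given Y?" reasoning written out, here is a minimal sketch of Bayes' theorem in Python.  The function name and every number in it are invented purely for illustration; they are not taken from any accident data or aircraft documentation.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem: P(H | E) from P(H), P(E | H) and P(E | not H)."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence


# Hypothetical numbers, chosen only for illustration:
# H = "the airspeed indication is wrong"
# E = "it disagrees with another source"
belief = posterior(prior=0.01, p_evidence_if_true=0.9, p_evidence_if_false=0.05)
print(round(belief, 3))

# Reassess as new information comes in: the previous posterior
# becomes the prior for the next piece of evidence.
belief = posterior(prior=belief, p_evidence_if_true=0.8, p_evidence_if_false=0.1)
print(round(belief, 3))
```

The point of the sketch is not the arithmetic but the habit: a belief that starts out unlikely can become the most probable explanation after only a couple of pieces of reasonably reliable evidence, provided we actually update on it.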

Risk assessment.  It is what we do and pilots are quite good at it but we need to constantly train our brains to ensure that we are making decisions based on actual evidence and not what we believe to be true as a consequence of bias and perceptual limitations.

Posted in Safety

“The Fall of Saigon: FedEx Aircraft Mechanic Reflects on Journey from War 40 Years Later” –An amazing story

This post is to honor someone who deserves to be recognized.  Sometimes the job of flying puts us in contact with amazing individuals.  I have been fortunate to meet quite a few in my lifetime and perhaps that will be a topic for another post.  This is a story of one of them. In August 2014 I carried a remarkable individual on my jumpseat.  He did not think his story impressive, but I strongly disagreed!  The very next day I sent the following email to one of the FedEx corporate communication people I had worked with on various projects:

“Last night I had the honor of carrying one of our mechanics, Mr. Do, on the jumpseat.  I invited him to ride up front, and during conversation I found out that he is quite a remarkable man.  It is a long story, but a few details include that he was part of the South Vietnamese military, and after the U.S. pulled out and Saigon fell, he was captured, placed in a concentration camp, from which he eventually escaped, hid in the jungle, built a boat that he used to attempt to sail to Thailand, had their engine fail, drifted eventually into an oil platform, was rescued, and finally found himself in the U.S.  He has gone back to visit (since it opened), and that is quite a story as well.
I think that this would make a wonderful story, so I am sending his contact information to you in hopes you can either do it yourself or get it to the right person.”

I am honored to say that the story has now been published and you can click on the image below to view it and watch the video.  The URL is also listed below.

fall of saigon

http://about.van.fedex.com/blog/saigon-reflections-40-years-later/

Posted in Safety

Indonesia AirAsia Flight 8501

airasia

The final report on AirAsia 8501 has been released.  This was a loss of control accident that contains some lessons that are worth sharing.

There had been an ongoing maintenance item that resulted in the ECAM message AUTO FLT RUD TRV LIM SYS.  While it later turned out to be faulty wiring, their maintenance appeared to be treating it each time as a “one off” type of event rather than a repetitive maintenance item.  The maintenance procedures for a repetitive issue are often different from those for a single event.  Those of us who fly “electric jets” know that having a temporary “nuisance” type of alert is not rare.  We call them “stray electrons”, and they are often the result of a slightly delayed power transfer or similar.  As a result they can often be fixed by just waiting a minute or, if that doesn’t work, by rebooting the system.  These are issues common to any computer, whether it is your iPhone or an airplane.  Of course, in the case of our airplanes there are many systems that are inter-related and so a momentary glitch in one system can lead a second system to not start up correctly, etc.

The point here is that if it is a repetitive issue then it is likely a real fault of some sort, and so when we write up a problem it is important to note that it is repetitive.  Now most company computer systems will also be tracking these, but by putting the words “repeat item” in the maintenance logbook we can reduce the chance that it will be overlooked and increase the chance that proper procedures are applied.

The next issue worth looking at is what happened next.  Apparently the Captain on this flight had seen this maintenance item before and had watched while a mechanic “fixed” the alert by pulling some circuit breakers for the Flight Augmentation Computers (FAC).  During the accident flight the alert appeared several times.  Finally, after the fourth time, the Captain decided to pull those breakers in flight.  This is not a book procedure and resulted in the flight control system reverting to alternate law and the autopilot disengaging.  Procedures published for system problems are carefully thought out, and absent an extreme emergency there is no reason to deviate from the published procedures.

The reversion to alternate law resulted in the roll control on the control stick going from a rate command (you make an input and it sets a desired rate) to direct command (similar to a non-FBW airplane), while the pitch command remained in a mode pretty much similar to normal law, but without protections.  This means that in pitch the command on the stick is a “g” command, in which a neutral stick commands 1 g, so the aircraft will not change its vertical velocity, but unlike normal law it will not prevent the aircraft from stalling.

For reasons suspected to be distraction, the first officer, who was flying, did not immediately notice that the aircraft started to roll after the autopilot disconnected.  When he did recognize it he started to recover with both a roll and an aft stick movement and rolled from a 54 degree left bank to 9 degrees in under 2 seconds.  This might have created some other effects such as a vestibular illusion.  Following this, as with AF 447, it appears that the first officer had a difficult time keeping the wings level.  This might seem surprising, but remember that at FL 320 the aircraft was in a low air density regime in which the first officer had likely never flown the aircraft by hand, and, like AF 447, this was coupled with flying in a control law that was relatively unfamiliar.  In multiple previous events of this nature, pilots have found it a challenge to control the roll.

The pull back is not explained but it may just be the nature of a rapid mechanical input where the hand moves to the right and back at the same time.  In any event, it caused the aircraft to pitch up rapidly.  From the reactions of the crew it appears that they were completely startled.  One wonders if they thought they had some secondary flight control issue going on, as it is clear that there was some confusion.  In any event, the aircraft progressed into a stall and aft stick position was maintained as the angle of attack increased to extreme values.  The communication between them was in English, which was neither pilot’s first language, and that might have contributed to the confusion.  For reasons unknown the Captain did not authoritatively take command of the airplane and it appears that neither pilot recognized the stall.  For discussion on that, see my previous article.

It should also be noted that the aircraft descent rate became extreme, so the load factor was going to be much less than 1 g, let alone the higher g-demand that was being commanded by the aft stick pull.  The aircraft elevators will work to try to maintain the g-demand, so even with the stick neutral, if the aircraft is experiencing less than 1 g then the elevator will try to pitch the aircraft to regain that 1 g.  Also, it should be noted that pushing forward on the controls when at half a g or less requires overcoming a lot of natural human response.
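
The g-demand idea can be shown with a deliberately simplified toy.  This is not the Airbus control law; the function name, the stick scale and the `g_per_unit_stick` gain are all assumptions made up for illustration.  It only captures the sign of the effect: the law compares the g being demanded with the g being measured and pitches to close the gap.

```python
def g_error(stick, measured_g, g_per_unit_stick=1.5):
    """Toy load-factor demand law (illustrative only, not a real control law).

    stick:       -1.0 (full forward) to +1.0 (full aft), hypothetical scale
    measured_g:  load factor currently sensed by the aircraft
    Returns demanded g minus measured g.  A positive value means the law
    would be asking for nose-up elevator; negative means nose-down.
    """
    demanded_g = 1.0 + g_per_unit_stick * stick
    return demanded_g - measured_g


# Neutral stick in steady flight: nothing to correct.
print(g_error(stick=0.0, measured_g=1.0))

# Neutral stick while descending at half a g: the law still wants nose-up.
print(g_error(stick=0.0, measured_g=0.5))

# Aft stick on top of that makes the nose-up demand larger still.
print(g_error(stick=0.5, measured_g=0.5))
```

In this toy the pilot would have to hold the stick forward of neutral just to stop the nose-up demand, which is the opposite of what the seat of the pants is asking for at half a g.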

One more note is that the Captain of this flight had considerable aerobatic and hands-on flying experience as a military pilot, so those who believe that more of that type of flying would prevent such accidents need to reconsider.  Clearly training is necessary and it is also clear that the industry has not addressed these issues satisfactorily at this time.

Posted in Safety

High Altitude Stalls – how well do you understand them?

contrail

By Captain Shem Malmquist

Acknowledgements

Credit for the impetus of this article must be given to my friend, aerodynamicist Clive Leyman, who initiated a discussion on these issues.  He provided the technical foundations, including correcting and clarifying portions of this article.  Portions of this article are based on a paper written by Clive Leyman, which have been revised as necessary for a more general pilot audience.

concorde

Stalls and modern wing designs

There has been much written in the aftermath of the Air France 447 accident regarding aircraft stalls, pilot training and similar aspects.  In a previous article I outlined some of the cognitive aspects that were likely involved with the accident.  Many pilots have wondered why the Air France crew did not recognize the stall itself.  In this article I will explore some of the aerodynamic effects involved in high altitude stalls which can make the problem much more complex than many pilots might realize.

Modern airliner wings have been designed to minimize drag at the design cruising Mach number.  The aerodynamic design of the wing is based on the necessity to reduce wave drag and lift induced drag, and may be modified to reduce wing bending moments and weight.  These compromises mean that the way the air starts to separate as the angle of attack approaches the stall, and what forces are generated as it does so, may be significantly different than what most pilots are expecting.

As outlined in the figure below, on a typical airliner wing, the air will start to separate about 2/3rds of the way from the root to the tip.  It starts on the aft portion, so the forward section of the wing is still developing normal lift.  This results in a gentle pitch up.

stall progression

As the angle of attack is increased, the separation will move forward and across the wingspan, but these separations are still aft of the CG, so the aircraft will continue to have a gentle pitch up.  It is only after the inboard part of the wing stalls that there will be any pitch down at all, but in reality, this will likely just appear as a cessation of the pitch up.

Modern wings are designed to be “supercritical”, meaning that they are designed such that during normal cruise the airflow over a large part of the wing upper surface is supersonic, decelerating through a shock wave lying at about two thirds to three quarters of the wing chord.  It is across this shock wave that the initial airflow separation will most likely begin, and that is near the trailing edge of the wing.  This will be perceptible as buffet.  Additionally, some of the lift is produced by positive pressure on the lower surface near the trailing edge.  This has the effect of increasing the lift on the wing with increased angle of attack for quite some time after the air flow on the upper surface has begun to deteriorate.

The experience most pilots had in primary training is a bit different.  In most trainer aircraft during the stall the airflow separation results in a loss of lift early in the process. As the airflow continues to deteriorate there comes a point where there is a fairly significant pitch downwards known as the “stall break” which is coupled with a simultaneous significant loss of lift.  The buffet is significant, and very obvious.

By contrast, the modern airliner wing will still see the lift increasing somewhat after the point that the pre-stall buffet has occurred.  This may or may not coincide with the AoA at which the stall warning is triggered, with the choice of that point left to the designer.  Beyond that pre-stall buffet the lift goes on increasing very slowly (from the bottom surface flow), but the buffeting gets steadily worse.  At some point there is a change in the character of the buffet (magnitude and frequency) accompanied by a loss of lift.  Taken together these may define “stall”, but it is difficult to identify the exact point without recourse to instrumentation.  This is quite unlike anything met in training.

A word about high speed buffet

In addition to low speed buffet associated with the stall, many pilots have also read or been taught that the aircraft will experience a “high speed” buffet if they fly above the maximum Mach speed.  It is possible that some pilots might be concerned about lowering the nose as they might believe they are entering “coffin corner” and will lose control of the aircraft.  In reality, while this was a factor in early jet transports, it is no longer the case in modern designs due to aerodynamic improvements.  Any buffet experienced is almost certainly going to be due to pre-stall or stall buffet.

Stall identification

The JAR (see Annex) and FAR rules specify that acceptable indications of a stall are:

  1. A nose-down pitch that cannot be readily arrested and which may be accompanied by a rolling motion which is not immediately controllable (provided that the rolling motion complies with JAR 25.203 (b) or (c) as appropriate); or
  2. Severe buffeting of a magnitude and severity that there is a strong and effective deterrent to further speed reduction; or
  3. In the case of dynamic stalls only, a significant roll into or out of the turn which is not immediately controllable.

As previously described on many modern wing designs the airflow separation will slowly spread outwards and forwards from the initial point.  This means that any changes in pitch or roll from the approaching stall can take place over a relatively long period of time (depending on the rate the AoA is increased), and there are no sudden indications.  This can somewhat mask the approaching stall, which can then only be identified through the severe buffeting criteria.  Further complicating this is that the buffeting could be mistaken for turbulence, as appeared to be the case in the other events described in my article on cognitive bias.  It is possible that mountain wave action or turbulence can induce a stall warning and possibly even pre-stall buffet at high altitude, although it is unlikely to cause an actual stall.

Air France 447

AF 447 was in cruise flight at FL 350.  The Captain had chosen to take the “middle” nap, which is typical.  Unless unusually tired, most Captains will take a turn in the middle of the flight so they can be present for the more complex procedures during the first and last portions of the flight.  This can be altered, of course: depending on when they are tired, the Captain might choose the first or last period instead.  The area of weather was still about 80 miles ahead of them when the Captain went back to take his nap.  Thunderstorms are typically somewhat numerous crossing the tropics, and it is probable that the aircraft radar was not depicting anything of significance that far out.

While some have questioned the decision of the Captain to take his rest at that time, it is not so surprising given the information present.  There was weather ahead, but it likely did not look particularly significant on the radar.  Again, there is a good chance it was not depicting much at all, as I have described in previous articles (here and here).  Further, it was likely that there would be more storms over the next several hours as they continued across the tropics.  Waiting was unlikely to improve things.  Personally, I would have chosen the first or second rest period just because I do not sleep well in turbulence and I have a bit more training with radar usage than many, but it is hard to second guess this Captain’s decision.  This left the First Officer in the right seat, as the flying pilot, and the relief First Officer in the left seat, as pilot monitoring.

When the aircraft encountered an area of high altitude ice crystals resulting in the loss of airspeed indications the autopilot and autothrust disconnected, unable to function without airspeed.  With the loss of airspeed came a multitude of various warnings as each system that utilized airspeed in some way indicated a problem.  The flight controls reverted to a degraded state where the pitch mode was still working normally in terms of response, but no longer had any of the stall protection features.  Meanwhile, the roll mode went to “direct law”. In “normal” operation the pilot would have become accustomed to the Airbus FBW mode where stick movement commands roll rate and centralizing the stick gives zero roll rate (holds the bank angle). However in direct law stick movement commands roll acceleration and centralizing the stick leaves the pilot with a residual roll rate. To cancel that one must apply opposite stick motion.

The pilot found himself flying an aircraft at altitude by hand.  Due to the lower air density and higher true airspeeds for the same equivalent airspeed (EAS), there is less damping at that altitude.  So not only was the aircraft flight control system in a degraded state that is not normally seen outside of a demonstration in the simulator during initial training, but the aircraft was also in a flight regime in which, due to RVSM rules, most pilots today have never “hand-flown” an aircraft.
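
To put rough numbers on the damping point, the sketch below uses the International Standard Atmosphere to compare true airspeed and aerodynamic damping at the same EAS at different altitudes.  It is a first-order estimate only: it assumes damping moments scale with air density times true airspeed, ignores compressibility and anything specific to a particular aircraft, and the 250 knot EAS is simply an illustrative value.

```python
import math

# International Standard Atmosphere constants (troposphere only).
T0 = 288.15        # sea-level temperature, K
LAPSE = 0.0065     # temperature lapse rate, K per metre
EXPONENT = 4.2559  # g / (R * lapse) - 1 for dry air
FT_TO_M = 0.3048


def density_ratio(altitude_ft):
    """ISA density ratio sigma = rho / rho0, valid below about 36,000 ft."""
    temperature = T0 - LAPSE * altitude_ft * FT_TO_M
    return (temperature / T0) ** EXPONENT


def tas_from_eas(eas_kt, altitude_ft):
    """True airspeed for a given equivalent airspeed: TAS = EAS / sqrt(sigma)."""
    return eas_kt / math.sqrt(density_ratio(altitude_ft))


def relative_damping(altitude_ft):
    """Damping relative to sea level at the same EAS.

    Assumes damping scales with rho * TAS, which at constant EAS
    reduces to sqrt(sigma).
    """
    return math.sqrt(density_ratio(altitude_ft))


if __name__ == "__main__":
    for altitude in (0, 10000, 35000):
        print(altitude,
              round(tas_from_eas(250, altitude)),
              round(relative_damping(altitude), 2))
```

Under these assumptions the damping at FL 350 comes out at a little over half of the sea-level value for the same EAS, which is consistent with the aircraft feeling noticeably "looser" in roll when hand-flown at cruise altitude.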

With such a sudden change in aircraft dynamics coupled with the low damping at altitude, it is not surprising, then, that the pilot was focused on trying to keep the wings level, which was occupying a good deal of his ability.  It also would not be unusual for a pilot to be subconsciously pulling a bit with each lateral control input.  Furthermore, the flight directors, which were biasing in and out of view, were commanding a pitch up.

The pilot pulled the controls back enough to increase the g-force momentarily.  This led to an increase in angle of attack to the stall warning threshold.  The stall warning responded with a momentary “Stall, Stall”, but discontinued before the “cricket” tone was generated.  The monitoring pilot in the left seat asked “What was that?”, but beyond that, the Air France crew did not discuss this momentary indication.  Did they just attribute it to turbulence?  Based on the lack of any secondary indications, it is very possible that they assumed that the momentary warning was connected with the lack of airspeed indication.  This is a training problem, as modern stall warning systems on transport aircraft utilize angle of attack.  However, the aircraft continued to fly normally.

Although not discussed by the crew or mentioned in the BEA report, flight tests reproducing AF 447 clearly showed buffet.  It is hypothesized that the pilot perception of buffet may be strongly linked to fuselage flexibility, so the g-forces generated by the buffet might not represent what the pilots are actually experiencing.  This might lead to a pilot mistaking buffet for turbulence.  It is also possible that, while in the turbulence, the buffet is somewhat masked, or perhaps not as salient as the turbulence itself.

Many have wondered why the crew might have ignored the stall warning in the first place.  It is likely that they viewed it as a false or spurious warning.  Perhaps they just assumed it was another system failure associated with the loss of airspeed.  Several crews reported that in previous probe icing events they had a single stall warning sound but ignored it as being “a blip”.  Regardless, this likely had an effect on the subsequent stall warnings, as research has shown that when a system warning is perceived to be false once (accurately or not), people will ignore subsequent warnings.  As they continued to slow, the aircraft entered a stall again.  Again, it appears that the warning was still not accompanied by salient secondary indications, or, at the very least, as described previously, not the secondary indications most pilots have been trained to expect.

Regardless, it is clear that the warning was essentially ignored from that point onwards.  No other discussion or mention of it occurred, even when it was repeatedly calling “Stall, Stall..cricket”.  The warning was just noise at that point.  If there was buffet it would be easily masked by the turbulence as they flew through the tops of thunderstorms.  The aircraft would be experiencing a gentle pitch up due to aerodynamic factors discussed earlier, but the A330 FBW system would have just kept the pitch constant.

The stall warning continued for the next two and a half minutes.  Clearly they would have heard it, so why did they not react?  Again, the most probable explanation is that they considered it a false warning.  The aircraft AoA continued to increase and it entered a “deep stall”.  The airspeed became so low and angle of attack so high that the stall warning system stopped, based on the assumption that the combination would be a false indication.  At that point, a relatively dramatic nose down pitch would have been required to recover.  Transport aircraft rarely see nose-down pitches over a few degrees in normal operations, but after reaching this point, the aircraft would have required something in the neighborhood of 15 degrees nose down to start a serious recovery.  Coupled with this, though, was that as the aircraft finally fully stalled it started to descend.  Fast.

The descent rate resulted in the measured g-force falling to around 0.6 g, vacillating between that and 0.75 g.  Pushing forward on the controls under normal circumstances to get to 15 degrees nose down would be outside the experience of most pilots.  How many pilots would still recognize the need when they were subject to g-forces where they felt they were already falling?  Pilots are taught to “unload” to break the stall, but what if they are already “unloaded”?  Under normal circumstances a stall at this altitude could require more than 5,000 feet to recover.  In this case, it would have taken much more.

In addition to the confusion the crew was experiencing regarding what was happening, there was also a good amount of oscillation in roll.  This is likely due to the aerodynamics of very high angles of attack, where the flow can have some very difficult to predict effects.  The pilot flying had all he could do to try to keep the wings level.  Sadly, allowing the aircraft to “roll off” might have pushed the aircraft out of the stall, but they did not know that.

In conclusion, it should be clear that the aspects surrounding high altitude stalls are complex.  As outlined in previous articles, expectation bias and confirmation bias play their parts as well.  In truth, there was really not a lot of time to sort it all out, and simulators are not able to replicate the situation adequately.  It is hoped that this article will provide some insight and “food for thought” for pilots confronted with such a situation.

Posted in Safety