Optimizing avionics reliability with dissimilar redundant architectures
Story, December 06, 2018
The potential consequences and acceptable probability of failure of an avionics system dictate the Design Assurance Level (DAL) that must be met in order for it to be certified for flight. The key computing elements of a system - such as the single-board computers (SBCs), graphics cards, and operating systems built into a flight-control computer or flight display - must all be designed with safety in mind and endure stringent testing to prove they can meet the required DAL. ARP4754 (Guidelines for Development of Civil Aircraft and Systems - Figure 1) is used by avionics designers as they allocate functions to systems and assign DALs to hardware and software for their safety-certifiable systems.
Figure 1: ARP4754 assigns Design Assurance Levels to hardware and software used in safety-certifiable systems.
For instance, an SBC designed for use in a flight-control computer must be certifiable to DO-254/DO-178C DAL A, which requires a probability of failure of less than 1 × 10⁻⁹ per flight hour. Meeting the DAL A level of reliability is a formidable challenge. Moreover, no matter how reliable the electronics are, unpredictable external factors can still result in system failure. For example, a single-channel flight-control system is vulnerable to a single point of failure that can cause the entire system to malfunction.
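To put the per-flight-hour figure in perspective, the short Python sketch below converts a failure rate into the probability of at least one failure over a given exposure. It assumes a constant failure rate (an exponential model), and the exposure figures are hypothetical values chosen only for illustration; neither assumption comes from DO-254, DO-178C, or ARP4754.

```python
import math


def prob_at_least_one_failure(rate_per_hour: float, hours: float) -> float:
    """P(at least one failure) over `hours`, assuming a constant failure rate."""
    # -expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-rate_per_hour * hours)


DAL_A_RATE = 1e-9  # failures per flight hour, the DAL A figure quoted above

# Hypothetical exposure figures, for illustration only.
single_flight = prob_at_least_one_failure(DAL_A_RATE, 10)           # one 10-hour flight
airframe_life = prob_at_least_one_failure(DAL_A_RATE, 100_000)      # one airframe's service life
fleet_life = prob_at_least_one_failure(DAL_A_RATE, 100_000_000)     # an entire fleet's service life

print(f"single flight: {single_flight:.3e}")
print(f"airframe life: {airframe_life:.3e}")
print(f"fleet life:    {fleet_life:.3e}")
```

Under these assumptions, the probability is negligible for any one flight but grows to roughly 9.5 percent across the assumed 100 million fleet flight hours, which suggests why the per-hour target has to be so small.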
Consider the case of an unmanned aerial vehicle (UAV) that suffers a bird strike in flight: If the strike blocks one of the UAV’s probes, the probe could become completely inoperative or could transmit Hazardous Misleading Information (HMI) to the flight-control computer. Either condition could prevent the flight-control computer from correctly calculating the data needed by the components under its control, which could ultimately lead to a disaster.
For safety-certification purposes, an avionics system designer is responsible for demonstrating that the aircraft can withstand the complete loss of the main active system. Because of the severe consequences resulting from a single point of failure, hardware redundancy is critical in DAL A systems. But if the aircraft uses a redundant architecture built with similar channels, that system will still be susceptible to common mode failures that can cause all channels to fail in the same way. The causes of common mode failures can be unpredictable and unpreventable, such as a lightning strike, electromagnetic interference, a fire, or an explosion. Software bugs are another source of common mode failure that is hard to protect against; because complex aviation applications are built from tens of thousands of lines of code, it’s practically impossible to test for and prevent every possible software bug or combination of events.
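The limit that common mode failures place on similar redundancy can be illustrated with a beta-factor model, a common reliability-engineering approximation that is not part of the original article. In the sketch below, `beta` is the assumed fraction of each channel's failures that stem from a cause shared by all channels. The channel failure rate and both `beta` values are hypothetical, and per-hour rates are treated as per-hour probabilities, which is only reasonable for very small rates.

```python
def all_channels_fail_rate(channel_rate: float, channels: int, beta: float) -> float:
    """Approximate per-hour probability that every redundant channel fails.

    channel_rate: failure probability of one channel per flight hour
    channels:     number of redundant channels
    beta:         assumed fraction of failures due to a common cause (0..1)
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    # Independent failures must coincide in every channel to defeat redundancy.
    independent = ((1.0 - beta) * channel_rate) ** channels
    # A common mode failure takes out all channels at once.
    common = beta * channel_rate
    return independent + common


CHANNEL_RATE = 1e-5  # hypothetical per-hour failure probability of one channel

ideal = all_channels_fail_rate(CHANNEL_RATE, 3, beta=0.0)        # no shared causes
similar = all_channels_fail_rate(CHANNEL_RATE, 3, beta=0.01)     # identical channels (assumed beta)
dissimilar = all_channels_fail_rate(CHANNEL_RATE, 3, beta=1e-5)  # dissimilar channels (assumed beta)

for name, value in [("ideal", ideal), ("similar", similar), ("dissimilar", dissimilar)]:
    print(f"{name:>10}: {value:.3e}  meets 1e-9: {value < 1e-9}")
```

With these illustrative numbers, three identical channels miss the 10⁻⁹ target because the common-cause term dominates, however many channels are added. Only reducing the shared fraction brings the result under the target.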
Dissimilar redundancy is a more complex scheme that mitigates common mode failures through the use of two or more different processor types running dissimilar software, and/or a backup system that uses different sensors and controls from the main active system. By running different operating systems and applications on dissimilar hardware, system designers add an extra layer of protection against software bugs that would otherwise affect every channel of a similar architecture in the same way.
An example of a highly redundant system can be found in NASA’s space shuttle fleet. The computers in the space shuttle control flight and mission functions, and have been designed to handle several levels of component failure without compromising mission success. This high level of fault tolerance is achieved through five computers, four of which run identical software. The fifth computer is an independent backup running different software to protect against generic software problems that may affect the quad-redundant set. In other examples, the Airbus A320 aircraft uses five dissimilar computers running four dissimilar software packages, and the Boeing 777 is designed with a high level of redundancy, featuring three primary flight computers with dissimilar processors that each transmit data through an independent channel, resulting in three unique control paths.
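A redundant set with an independent backup can be sketched as a simple voter. The Python example below is a hypothetical illustration loosely modeled on a four-plus-one arrangement; it is not the actual logic of the space shuttle, the A320, or the 777. The function name, the quorum of three, and the use of exact integer comparison are all assumptions made for the example (a real voter would typically compare values within a tolerance).

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def select_command(
    primary_outputs: Sequence[Optional[int]],
    backup_output: int,
    quorum: int = 3,
) -> Tuple[int, str]:
    """Return (command, source) from redundant primaries plus a dissimilar backup.

    A primary output of None marks a channel that produced no valid result.
    """
    valid = [value for value in primary_outputs if value is not None]
    if valid:
        value, count = Counter(valid).most_common(1)[0]
        if count >= quorum:
            # Enough primaries agree: mask the disagreeing or silent channels.
            return value, "primary"
    # No quorum among the primaries: fall back to the dissimilar backup.
    return backup_output, "backup"


print(select_command([10, 10, 10, 10], backup_output=11))      # all primaries agree
print(select_command([10, 10, 10, 99], backup_output=11))      # one faulty channel is outvoted
print(select_command([10, 10, 99, None], backup_output=11))    # no quorum, backup is used
```

Voting of this kind masks only independent faults. A generic software bug that makes all four primaries produce the same wrong value would still pass the vote, which is why the backup runs different software and why engaging it needs a trigger other than disagreement among the primaries.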
Building a fault-tolerant redundant architecture
In recent years, embedded hardware vendors have brought the benefits of commercially designed solutions to avionics design by providing DO-254 safety-certifiable OpenVPX SBCs and other modules, each supported with the required set of data artifacts. This approach has helped to reduce the time, effort, and cost involved in designing a DO-254 system compared to the expensive custom-built modules that were previously required. Curtiss-Wright recently introduced DO-254 certifiable SBCs powered by all three of the leading architectures: Intel, Power Architecture, and Arm. With the introduction of the VPX3-1703, the industry’s first safety-certifiable 3U OpenVPX Arm SBC, which is based on the quad-core NXP Layerscape LS1043A Arm processor, avionics system designers now have a viable path forward for developing dissimilar redundant solutions.
Rick Hearn is Product Manager, Safety-Certifiable Solutions, at Curtiss-Wright Defense Solutions.
Curtiss-Wright Defense Solutions www.curtisswrightds.com