Designing a rad-hard CubeSat onboard computer
June 06, 2018
CubeSats, miniature satellites no larger than a briefcase, are becoming increasingly popular all over the world. Economies of scale in components, subsystems, launch equipment, and logistics have already enabled many cost-effective new satellite launch ventures and projects, and will continue to do so.
There now exists a vibrant ecosystem of suppliers that provide plug-and-play CubeSat components that fit together inside the standard CubeSat form factor. Thus far, much of the technology has been based on commercial off-the-shelf (COTS) electronics, although there is a growing trend to judiciously use radiation-hardened integrated circuits that are designed to mitigate the effects of space radiation. The goal is to improve system reliability by ensuring that the electronics keep operating in a radiation-filled environment while maintaining a modest budget; CubeSats are intended to be an inexpensive alternative to traditional, higher-cost satellites.
Because of the interest in selective-component hardening as a means of improving mission success rate, a reference design was created for a CubeSat onboard computer (OBC) that uses radiation-hardened components. This reference design can be downloaded and modified by CubeSat designers to meet different mission requirements. A block diagram of the OBC is shown in Figure 1.
Figure 1: CubeSat OBC reference design block diagram.
The reference design is Pumpkin CubeSat Kit Bus-compatible, as the signals on the PC/104 connector conform to the published Pumpkin CubeSat interface specification. There are many plug-and-play boards that use this standard. In space-constrained designs, the PC/104 connector is sometimes omitted because of its size.
The OBC uses the VORAGO Technologies VA10820 ARM Cortex-M0 microcontroller, a radiation-hardened low-power device that is supported by the ARM development ecosystem.
The MCU – already immunized against latch-up – provides a 50 MHz ARM Cortex-M0 core, program and data memory, general-purpose I/O (GPIO), and on-chip peripherals such as timers and serial communications (SPI [serial peripheral interface bus], UART [universal asynchronous receiver-transmitter], and inter-integrated circuit protocol [I2C]). When the system boots up, the SRAM program memory on the microcontroller is loaded from the Cypress CYPT15B102 rad-hard FRAM. Program code executes from SRAM and is protected by an error detection and correction (EDAC) subsystem and a scrub engine.
The EDAC corrects bit errors that can occur due to single-event upsets (SEUs) as the CPU fetches words from the SRAM memory. The scrub engine is a complementary subsystem that autonomously sweeps through memory sequentially to detect and correct bit errors before the EDAC would be exposed to them. There are five syndrome bits for every byte in the 32-bit data words, making it possible to detect two bit errors per byte and correct one bit error in each byte of the 32-bit memory word. This arrangement enables the correction of up to four bit errors (one per byte) per 32-bit data word.
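As a rough illustration of the per-byte scheme described above, the following Python sketch models one common single-error-correcting, double-error-detecting construction: an extended Hamming code with five check bits per data byte. The article does not describe the code the VA10820 actually implements, so the bit layout and the names `encode_byte`, `decode_byte`, and `scrub` are assumptions made for illustration only.

```python
# Illustrative SEC-DED model: extended Hamming code, 8 data bits + 5 check bits.
# The bit layout is an assumption; it is not taken from VA10820 documentation.

DATA_POSITIONS = (3, 5, 6, 7, 9, 10, 11, 12)   # codeword bits that carry data
PARITY_POSITIONS = (1, 2, 4, 8)                # Hamming parity bits
# Bit 0 carries an overall parity bit covering the whole 13-bit codeword.


def _syndrome(word):
    """XOR of the indices (1..12) of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 13):
        if (word >> pos) & 1:
            s ^= pos
    return s


def encode_byte(data):
    """Return the 13-bit codeword for one data byte."""
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (data >> i) & 1:
            word |= 1 << pos
    s = _syndrome(word)
    for p in PARITY_POSITIONS:          # set parity bits so the syndrome is 0
        if s & p:
            word |= 1 << p
    if bin(word).count("1") & 1:        # make the total number of ones even
        word |= 1
    return word


def decode_byte(word):
    """Return (data, status); data is None when the error is not correctable."""
    s = _syndrome(word)
    odd = bin(word).count("1") & 1
    if odd:                             # odd parity: assume a single-bit error
        if s > 12:
            return None, "uncorrectable"
        word ^= 1 << s                  # s == 0 means the overall parity bit flipped
        status = "corrected"
    elif s:                             # even parity, nonzero syndrome: two errors
        return None, "double-error"
    else:
        status = "ok"
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> pos) & 1:
            data |= 1 << i
    return data, status


def scrub(memory):
    """Sweep a list of codewords and rewrite any that hold a single-bit error."""
    fixed = 0
    for addr, word in enumerate(memory):
        data, status = decode_byte(word)
        if status == "corrected":
            memory[addr] = encode_byte(data)
            fixed += 1
    return fixed
```

A 32-bit word would hold four such codewords side by side, which is why one error per byte (four per word) can be corrected. Running `scrub` periodically clears single-bit errors before a second upset in the same byte can turn them into an uncorrectable double error.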
A single-event upset (SEU) is a change of state caused by a single ionizing particle striking a device; SEUs can affect both memory cells and logic circuits. Another radiation-mitigating feature of the MCU architecture is the implementation of dual interlocked cell (DICE) latches and triple modular redundancy (TMR) on internal registers. While the EDAC and scrub subsystems address SEUs in memory, the DICE latches and TMR implementation address SEUs in logic circuits.
The OBC uses Cypress CYPT15B102 ferroelectric random-access memory (FRAM) because it has good radiation performance and interfaces easily to the MCU via an SPI port. A second FRAM, the Cypress FM25V20A, is also implemented as a backup. The FM25V20A is a COTS automotive-grade memory with the same SPI interface. This memory can be used to provide temporary nonvolatile storage when performing in-orbit reprogramming: If the CubeSat receives a wireless program code update in orbit, both the original and new code images will be required so that the system can recover to a known good state in the event of a problem during reprogramming. The recovery function is the main reason a second FRAM device is included. The second FRAM device could of course be another rad-hard CYPT15B102, but the automotive-grade COTS device was selected to reduce cost. (If in-orbit reprogramming is not a system requirement, the second FRAM device may not be required.)
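The recovery idea can be sketched as follows. This is a minimal Python model, not the reference design's firmware: the two FRAM devices are represented by `bytearray` objects, the function name `apply_update` is hypothetical, and the use of a CRC-32 to validate the uploaded image is an assumption (the article does not say how an image is verified).

```python
import zlib


def apply_update(boot_fram, backup_fram, new_image, expected_crc):
    """Install new_image in boot_fram, keeping the old image for rollback.

    boot_fram, backup_fram: bytearray stand-ins for the two FRAM devices.
    expected_crc: CRC-32 of the intended image (assumed to arrive with the upload).
    Returns True if the new image is installed, False if the old one was restored.
    """
    # 1. Preserve the known-good image in the second FRAM before touching the first.
    backup_fram[:] = boot_fram
    # 2. Write the new image to the boot FRAM.
    boot_fram[:] = new_image
    # 3. Read back and verify what was actually stored.
    if zlib.crc32(bytes(boot_fram)) == expected_crc:
        return True
    # 4. Verification failed: roll back to the preserved image.
    boot_fram[:] = backup_fram
    return False
```

At every step at least one device holds a complete, known-good image, which is the property the second FRAM is there to provide.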
A Cypress CYRS16B256 rad-hard flash device is also connected to the microcontroller on an SPI communications port. The purpose of this device is to act as a data-storage bank. Any data that is collected during the mission (for example, from sensors in the payload) can be stored in its 32 Mbytes of memory. Depending on the radiation profile expected during the mission, a designer might consider replacing this device with the COTS-equivalent integrated circuit, the Cypress S25FL256L.
For short-duration missions in low Earth orbit (LEO), designers often use COTS devices. While there is a risk of unrecoverable upsets, this is sometimes considered an acceptable trade-off against cost. The biggest risk to system operation is latch-up: All CMOS [complementary metal oxide semiconductor] devices are susceptible to latch-up due to ionizing radiation particle strikes. When a device latches up, a parasitic structure on the CMOS die becomes forward-biased and creates a short circuit from VDD to VSS (positive to negative). This causes a large current to flow through the device and pulls down VDD. It is therefore good practice to include a latch-up-immune chip or circuit in the system that can detect this condition and reset the system to clear it. Normally, the VA10820 microcontroller performs this function in “selectively hardened” CubeSat systems. Note that latch-up can destroy a CMOS device despite a reset attempt, so the only safe way to really protect a system is to use fully latch-up-immune components throughout. Such a design is more expensive than one built from COTS parts, and this is the crux of the CubeSat design challenge: How much risk is one willing to bear, given that mitigating radiation effects with rad-hard devices is more expensive than using COTS?
If the microcontroller is latch-up-immune, there is at least one device that can be relied upon as the rad-hard mainstay of the radiation mitigation strategy. Another useful device that would be considered as a rad-hard mainstay would be the supervisor chip.
An Intersil ISL706A supervisor device is used in the system. This supervisor performs three important functions. The first function is to hold the MCU in reset until the power supply reaches an appropriate level to power up the MCU. The second function is to observe the system power supply as a latch-up warning monitor. If any device in the system latches up, the supply voltage will be pulled down. The ratio of the potential divider implemented with resistors R1 and R2 controls the threshold at which the power-fail input (PFI) on the supervisor chip triggers. For this reference design, the threshold has been set to 2.75 V. (The circuit configuration is shown in Figure 2.) In the event that the 3.3 V rail drops to this level, a reset will be asserted to the MCU that will in turn reset the system. In most cases, the latched-up device will recover when the system is rebooted.
Figure 2: Supervisor circuit configuration.
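The relationship between the divider ratio and the trip point can be worked through numerically. The sketch below assumes that R1 connects the monitored rail to PFI and R2 connects PFI to ground, and that the comparator on PFI trips at 1.25 V; both are assumptions to check against Figure 2 and the supervisor datasheet, and the 10 kΩ value for R2 is purely illustrative.

```python
def pfi_trip_voltage(r1_ohms, r2_ohms, v_pfi_threshold=1.25):
    """Rail voltage at which the divided-down PFI pin reaches its threshold.

    Assumes R1 from rail to PFI and R2 from PFI to ground, so that
    V_PFI = V_rail * R2 / (R1 + R2). The 1.25 V default is an assumed value.
    """
    return v_pfi_threshold * (r1_ohms + r2_ohms) / r2_ohms


def r1_for_trip(v_trip, r2_ohms, v_pfi_threshold=1.25):
    """Solve the same divider equation for R1, given a desired trip voltage."""
    return r2_ohms * (v_trip / v_pfi_threshold - 1.0)


# Illustrative values only: choose R2 = 10 kOhm and ask for a 2.75 V trip point.
r2 = 10_000
r1 = r1_for_trip(2.75, r2)
print(f"R1 = {r1:.0f} ohm, trip = {pfi_trip_voltage(r1, r2):.2f} V")
```

Under these assumptions the divider ratio R1/R2 works out to 1.2 for a 2.75 V trip point; resistor tolerance will move the real threshold slightly.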
The third function that the supervisor device performs is as an additional independent watchdog. There is already a watchdog in the MCU, a timer that is periodically reset by the firmware to ensure that the code is executing properly. If the code hangs up and the on-chip watchdog is not reset by the firmware, an interrupt is generated that will cause a chip reset. The main failure mode that would be concerning for the MCU watchdog is a loss of clock. This condition is addressed by the supervisor device, as it acts as an external watchdog that operates similarly to the MCU watchdog, using a firmware-controlled periodic toggle signal from a GPIO line on the MCU. If this signal is not toggled at least once every 1.6 seconds, the supervisor will assert a hard reset to the MCU.
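The external watchdog behaviour can be modelled in a few lines. The following Python sketch is a simulation only, not firmware for the board: the class `ExternalWatchdog`, the helper `run`, and the 100 ms sampling step are hypothetical, while the 1.6-second timeout comes from the text above. Times are kept in integer milliseconds to avoid floating-point drift.

```python
class ExternalWatchdog:
    """Model of an external watchdog fed by a toggling GPIO line."""

    def __init__(self, timeout_ms=1600):
        self.timeout_ms = timeout_ms
        self.last_level = 0
        self.last_edge_ms = 0
        self.resets = 0

    def sample(self, now_ms, wdi_level):
        """Observe the watchdog input; return True if a reset is asserted."""
        if wdi_level != self.last_level:
            # Any edge on the input restarts the timeout.
            self.last_level = wdi_level
            self.last_edge_ms = now_ms
            return False
        if now_ms - self.last_edge_ms > self.timeout_ms:
            self.resets += 1
            self.last_edge_ms = now_ms   # the timer restarts after a reset
            return True
        return False


def run(toggle_period_ms, duration_ms, hang_at_ms=None, step_ms=100):
    """Simulate firmware that toggles the line periodically until it hangs."""
    wd = ExternalWatchdog()
    level = 0
    for now in range(0, duration_ms + 1, step_ms):
        hung = hang_at_ms is not None and now >= hang_at_ms
        if not hung and now % toggle_period_ms == 0:
            level ^= 1                   # healthy firmware toggles the GPIO
        wd.sample(now, level)
    return wd.resets
```

For example, `run(1000, 10000)` models firmware that toggles once per second and never trips the watchdog, while `run(1000, 10000, hang_at_ms=3000)` models a hang at three seconds, after which the supervisor starts asserting resets.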
The reference design is powered by a 5 V supply that is sourced either from an external preflight power connector (when used on the bench) or from the Pumpkin CubeSat Kit Bus across the PC/104 connector. There are three voltage supply rails used in the system: 5 V, 3.3 V, and 1.5 V. The 5 V rail supplies two Intersil ISL7502SEH rad-hard LDOs [low-dropout regulators]. All of the I/O on the board uses 3.3 V signaling, whereas 5 V is required for the analog-to-digital converter and 1.5 V is required for the MCU core voltage. Each of the LDOs has an enable input that is routed to the specified pins on the Pumpkin CubeSat Bus PC/104 connector. This setup enables the power supplies on the OBC reference design to be controlled by the CubeSat Electrical Power System (EPS) controller board that supplies power to the entire CubeSat system.
The MCU is clocked by a rad-hard 50 MHz clock device from Frequency Management. The MCU internal clock speed can be adjusted dynamically in software, so the device can be run at a lower speed to reduce power consumption. Many CubeSat applications are characterized by long periods of relatively low activity, with bursts of high activity during communications or data sampling periods; during the low-activity periods, the MCU clock speed can be reduced to conserve power.
The MCU is connected to a Cobham Aeroflex RHD5950 analog-to-digital converter (ADC). This is a successive approximation type that has 16 channels, 14-bit resolution, and a 20 µs conversion period. The ADC channels are connected to the analog input signal lines as detailed in the Pumpkin CubeSat Kit Bus specification. One of the ADC channels monitors the system voltage supply rail and another is connected to a resistance temperature detector (RTD). The RHD5950 has single conversion and continuous conversion modes; continuous conversion is useful for oversampling, which enables improvements in resolution and noise. The ADC output pins are connected to GPIO lines on the microcontroller, with the microcontroller also controlling the ADC on-chip multiplexer to determine which analog inputs are sampled.
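One widely used way to turn oversampling into extra resolution is "oversample and decimate": sum 4^n consecutive samples and shift the total right by n bits, which yields a result with n more bits than the raw conversions. The sketch below shows the arithmetic in Python; the function name is hypothetical, and the technique only helps if the input carries roughly one LSB or more of uncorrelated noise, which is an assumption about the signal rather than a property of the RHD5950.

```python
def oversample_decimate(samples, extra_bits):
    """Combine 4**extra_bits raw ADC codes into one higher-resolution code.

    With 14-bit raw samples, the result is a (14 + extra_bits)-bit value.
    """
    needed = 4 ** extra_bits
    if len(samples) != needed:
        raise ValueError(f"expected {needed} samples, got {len(samples)}")
    # Summing keeps the fractional information; the shift rescales the total.
    return sum(samples) >> extra_bits


# Four 14-bit readings that dither between two adjacent codes...
raw = [100, 101, 100, 101]
# ...combine into one 15-bit reading halfway between them.
print(oversample_decimate(raw, 1))  # prints 201, i.e. 100.5 on the 14-bit scale
```

Each extra bit costs a factor of four in samples, so with a 20 µs conversion period two extra bits take 16 conversions, or about 320 µs per result.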
There are several non-radiation-hardened COTS parts implemented on the board because a rad-hard option was not available. The first such device is a UART-to-USB interface (supplied by FTDI Ltd.). This device is included in the reference design to provide a USB interface that can be used on the bench for development work; this interface is not intended for use in orbit. The device translates USB protocol from an external host to a UART interface on the microcontroller. The USB port can be used as a simple terminal interface to the MCU. The UART-to-USB device is powered only when a USB cable is plugged into the system, so it will not create problems in the circuit if it is affected by radiation-induced faults.
The second non-radiation-hardened COTS device that is used in the system is an HI-3110 integrated controller area network (CAN) controller and physical layer (PHY). CAN is a popular serial communications protocol used widely in automotive systems that has also found favor with CubeSat designers because of its robust differential signaling characteristics. Whereas TTL-level communications interfaces such as UART, SPI, and I2C are ideal for short-hop intraboard communications, the CAN interface offers a more rugged option for interboard communications within the CubeSat system. If, for example, a sensor is located elsewhere in the satellite, the CAN interface is a good option for communicating with it because of the high noise immunity of the differential signals provided by the PHY. Because this device is not inherently radiation-hardened, special measures are taken to monitor and control it. If it latches up, the supervisor will be triggered by the voltage drop on the supply rail. The HI-3110 includes internal status registers that are monitored via the SPI communications interface by the MCU. The 3.3 V and 5 V power supplies to the CAN device are gated so that the MCU can disable power to the CAN device and reset it if the status register data is ambiguous or indicates that an error has occurred. The power-supply gating circuit is shown in Figure 3.
Figure 3: Power-supply gating circuit protects system against COTS failure in radiation-filled environments.
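The monitor-and-power-cycle policy described above might look something like the following Python sketch. Everything here is hypothetical: `read_status` and `set_power` are injected callables standing in for the SPI status read and the gating circuit, and `error_mask` is a placeholder rather than the HI-3110's real register layout.

```python
def supervise_can(read_status, set_power, error_mask=0x01, max_attempts=3):
    """Power-cycle the CAN device until its status reads back clean.

    read_status: callable returning an int status byte, or None if the read
                 is unusable (hypothetical stand-in for the SPI status read).
    set_power:   callable taking True/False to drive the gating circuit.
    error_mask:  placeholder mask for error flags in the status byte.
    Returns the number of power cycles performed, or -1 if the device never
    recovered and was left powered down.
    """
    for cycles in range(max_attempts + 1):
        status = read_status()
        if status is not None and not (status & error_mask):
            return cycles                # status is unambiguous and error-free
        set_power(False)                 # remove both supply rails
        if cycles == max_attempts:
            return -1                    # give up and leave the device off
        # Real firmware would wait here for the rails to discharge.
        set_power(True)
```

Leaving the device powered down after repeated failures keeps a damaged COTS part from loading the supply rails and repeatedly tripping the supervisor.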
The reference design includes a JTAG connector on the board to interface with the MCU for programming and debug. A debug pod (such as a Segger J-Link) connects to the JTAG header on the board and through USB to a host computer that is running an integrated development environment (IDE) such as ARM Keil µVision or IAR Embedded Workbench. One of the benefits of using an ARM-based microcontroller is that there is a broad selection of development tools available to support it. To reprogram the FRAM, the code would first be downloaded to the MCU and then be loaded to the FRAM through the SPI connection.
All Pumpkin nanosatellites use a remove-before-flight (RBF) high-current roller-tipped lever switch. It is typically used in conjunction with an RBF pin that presses on the roller, or in an assembly that presses against a wall of a nanosatellite deployment container. This switch, included on the board, provides Common (C), Normally Open (NO), and Normally Closed (NC) terminals. These are routed to the specified CubeSat Kit Bus pins on the PC/104 connector.
Suppliers typically offer two versions of radiation-hardened devices: a prototype grade and a flight grade. Flight-grade devices are screened to a higher level than prototypes, although the two are identical in form, fit, and function and use the same die. Prototype-grade parts usually cost around half as much as flight-grade parts and for that reason were selected for use on this reference design.
Several specifications are used to quantify how an IC will perform in a radiation environment, including single-event latch-up (SEL) and single-event upset (SEU). These are important for understanding how often a device can be expected to latch up or to exhibit bit errors in memory and logic errors due to ionizing particle strikes. The radiation specification most widely discussed for CubeSats is total ionizing dose (TID): This is a measure of the energy absorbed per unit mass of matter (in this case, silicon) and is denoted in krad(Si), or kilo units of radiation absorbed dose (in silicon). TID accumulates over time and results in increased source-drain leakage in the MOS transistors in the IC as the device oxide builds up an accumulated charge. There is also an expansion of the depletion region between PMOS and NMOS-type devices. As the dose accumulates, leakage current rises; eventually, the CMOS device will cease functioning as the threshold voltage is pulled down.
CubeSat designers use the IC TID specifications to estimate how long a CubeSat is likely to function before the ICs within the structure will succumb to the effects of TID. This length of time depends on orbit altitude, orientation, and time. In LEO where CubeSats typically fly, the source of TID will be mainly electrons and protons. Details of the TID performance of the ICs on the reference design are given in Table 1.
Table 1: TID performance of OBC reference design ICs.
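A first-order lifetime estimate simply divides a part's TID rating by the expected dose rate, usually with a safety margin. The Python sketch below shows that arithmetic; the function names, the default margin of 2, and every number in the example are hypothetical placeholders, not values from Table 1 or from any specific orbit model.

```python
def tid_lifetime_years(tid_rating_krad, dose_rate_krad_per_year, margin=2.0):
    """Years until the accumulated dose, scaled by a margin, reaches the rating."""
    if dose_rate_krad_per_year <= 0 or margin <= 0:
        raise ValueError("dose rate and margin must be positive")
    return tid_rating_krad / (dose_rate_krad_per_year * margin)


def limiting_part(tid_ratings_krad, dose_rate_krad_per_year, margin=2.0):
    """Return (name, years) for the part with the lowest TID rating."""
    name = min(tid_ratings_krad, key=tid_ratings_krad.get)
    years = tid_lifetime_years(tid_ratings_krad[name], dose_rate_krad_per_year, margin)
    return name, years


# Placeholder ratings and dose rate, for illustration only.
parts = {"rad_hard_mcu": 300.0, "cots_fram": 30.0}
print(limiting_part(parts, dose_rate_krad_per_year=5.0))  # ('cots_fram', 3.0)
```

The part with the lowest rating sets the estimate for the whole board, which is one way to see the effect of swapping a rad-hard device for its COTS equivalent. The dose rate itself depends on orbit and shielding and would come from a mission-specific radiation analysis.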
Ross Bannatyne is director of marketing for VORAGO Technologies, based in Austin, Texas. He was educated at the University of Edinburgh and the University of Texas at Austin. Ross has published a college text called “Using Microprocessors and Microcomputers” and a book on automotive electronics called “Electronic Control Systems” (published by the Society of Automotive Engineers); he also holds patents in failsafe electronic systems and microcontroller development tools. Readers may reach Ross at [email protected].
Vorago Technologies www.voragotech.com