FPGA-powered display controllers enhance ISR video in real time

May 25, 2012

The military relies on video imagery for situational awareness, but image quality is often so poor that operators can miss important details. Outfitting display controllers with FPGAs that run image enhancement algorithms in real time gives viewers a much better picture.

The backbone of modern defense capabilities, Intelligence, Surveillance, and Reconnaissance (ISR) relies on a robust and diverse network of integrated sensors, aircraft, and manpower. The value of this network ultimately depends on the human ability to see sensor imagery clearly, discern important details, and take decisive action. In the field, we have little or no control over the lighting and environmental conditions under which images are acquired from a sensor. It is possible, however, to give more control to the person viewing live sensor imagery by allowing them to fine-tune the video on the fly to pull out more information.

Enhancing video in real time requires tremendous computational throughput. It requires applying sophisticated image processing algorithms to incoming video streams without introducing delays. High-performance Field Programmable Gate Arrays (FPGAs) provide an ideal platform that allows software algorithms to be implemented using parallel computing techniques. When embedded into an intelligent display controller, these algorithms give operators maximum control over image quality, and result in dramatically better image clarity.

Clarity is in the eye of the beholder

Full-Motion Video (FMV) is the tool of choice for military situational awareness. Automated video recording is featured on virtually all military vehicles, including manned vehicles such as fighter jets, trucks, and tanks, as well as Unmanned Aircraft Systems (UASs). Producing high-quality imagery on a mobile platform poses a number of challenges. In addition to issues related to camera motion and the resulting image perspectives, the quality of the video imagery can also be compromised by poor environmental conditions, data link degradations, and bandwidth limitations. Atmospheric factors such as poor lighting at dawn, dusk, or nighttime, and adverse weather, including sandstorms and variable clouds, can obscure important details.

Sensor image quality, however, is not the only problem. The conditions under which the video is viewed vary widely and thus present another set of challenges. For example, video streams may be viewed in bright sunlight, under water, or in a dark cave with a headlamp shining on the screen. Because of this, there is a distinct advantage in providing image enhancement capability within the display itself, rather than at the sensor or elsewhere on the network.

The only way to ensure good image quality is to give viewers the ability to adjust the picture for their needs. The best way to accomplish this is to bring real-time video enhancement to the tactical edge. FPGAs offer the performance, design flexibility, and resilience needed to build this capability directly into the display controller.

A video controller with real-time image enhancement

All display controllers perform basic image processing, which means they take in video at a certain resolution and display it in the display’s native resolution. For example, a 1,920 x 1,080 display needs to receive 1,920 x 1,080 pixels for every frame. However, there is no guarantee that the incoming raster matches that resolution; in fact, odds are it won’t. Instead, the incoming video stream might be formatted at 1,024 x 768 or some other resolution. The core of the controller’s job is scaling: converting an incoming video signal from one size or resolution to another so that it fits the display panel. This is what is usually referred to as “video processing,” and it is a minor feat compared to video enhancement.
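
As a rough illustration of what scaling involves, the sketch below resizes a frame by nearest-neighbor sampling in Python. The function name `scale_nearest` and the list-of-rows frame layout are assumptions made for this example; production scalers typically use higher-quality interpolation filters and run in dedicated hardware.

```python
def scale_nearest(frame, out_w, out_h):
    """Resize a frame (a list of rows of pixel values) to out_w x out_h.

    Each output pixel copies the source pixel whose position maps closest
    to it. This is the simplest possible scaler, shown for illustration only.
    """
    in_h = len(frame)
    in_w = len(frame[0])
    out = []
    for y in range(out_h):
        src_y = y * in_h // out_h          # source row for this output row
        row = [frame[src_y][x * in_w // out_w] for x in range(out_w)]
        out.append(row)
    return out


# A tiny 2 x 2 "frame" scaled up to 4 x 4.
small = [[10, 20],
         [30, 40]]
big = scale_nearest(small, 4, 4)
for row in big:
    print(row)
# [10, 10, 20, 20]
# [10, 10, 20, 20]
# [30, 30, 40, 40]
# [30, 30, 40, 40]
```

The same function would map a 1,024 x 768 input onto a 1,920 x 1,080 panel; only the two target dimensions change.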

Video enhancement begins where video processing ends. A video controller designed for real-time video enhancement might start with an off-the-shelf video processing chip or a purpose-built ASIC that does the scaling and basic image processing up front. Once that operation is complete, the video stream would then be handed off to a special purpose processor such as an FPGA for enhancement.

There is, of course, the option of combining both the video processing and video enhancing functions in a single ASIC. In fact, that is what manufacturers of consumer televisions often do. However, this implementation is best suited for rudimentary video enhancement, such as edge sharpening, and leaves little room for sophisticated image enhancement algorithms. With an FPGA that is dedicated to real-time video enhancement built into the display controller, it is possible to reach beyond conventional display functionality and deliver advanced enhancement capabilities (Figure 1).

Figure 1: Display controller with built-in, real-time video enhancement: A video processing chip formats the incoming video stream to match display requirements. The FPGA runs image enhancement algorithms to achieve dramatically better image clarity.

Amazing algorithms are computationally intensive

Anyone familiar with photo editing programs, such as Adobe Photoshop, can appreciate the power of software algorithms for enhancing still images. Using sophisticated software algorithms to apply mathematical functions to the image matrix, it is possible to reveal hidden layers of visual information without losing detail. This is a purely mathematical approach that utilizes all of the available image information, including portions that are not normally visible to the human eye.

Over the past several decades, a large body of image processing algorithms has been developed using techniques including histogram manipulation, convolution, morphology, over- and undersampling, quantization, and spectral processing, including Fourier transforms and Discrete Cosine Transforms (DCTs). These algorithms tend to be computationally intensive. Conventional processor technology does not offer the performance necessary to keep up with the demands of FMV at up to 60 frames per second (fps), or 1 frame every 16.67 milliseconds. Processing a Standard-Definition (SD) video stream requires about 150 to 200 gigaflops, while a 1080p stream requires about 1.2 teraflops. This is where FPGAs come into play.
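
The timing figures above can be reproduced with a few lines of arithmetic. The Python sketch below assumes a 1,920 x 1,080 frame at 60 fps (the article does not state the frame rate behind its teraflops figure, so that pairing is an assumption), and the helper names are illustrative.

```python
def frame_budget(width, height, fps):
    """Return (ms per frame, pixels per second, ns available per pixel)."""
    frame_time_ms = 1000.0 / fps
    pixels_per_second = width * height * fps
    ns_per_pixel = 1e9 / pixels_per_second
    return frame_time_ms, pixels_per_second, ns_per_pixel


def ops_per_pixel(flops, pixels_per_second):
    """Floating-point operations available for each pixel at a given rate."""
    return flops / pixels_per_second


frame_ms, pixel_rate, pixel_ns = frame_budget(1920, 1080, 60)
budget = ops_per_pixel(1.2e12, pixel_rate)   # 1.2 teraflops, from the text

print(f"{frame_ms:.2f} ms per frame")        # 16.67 ms per frame
print(f"{pixel_rate:,} pixels per second")   # 124,416,000 pixels per second
print(f"{pixel_ns:.2f} ns per pixel")        # 8.04 ns per pixel
print(f"{budget:.0f} operations per pixel")  # 9645 operations per pixel
```

Under these assumptions, every pixel must be finished in roughly 8 nanoseconds, and the quoted 1.2 teraflops works out to several thousand operations per pixel.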

Convolution kernel filtering at work

When image enhancement algorithms are rewritten using parallel processing techniques and ported to an FPGA, it is possible to dramatically enhance ISR video in real time. Of the many types of image enhancement algorithms, spatial convolution kernel filtering produces the most dramatic results.

While the underlying mathematics of convolution filtering are complex, performing an image convolution operation is straightforward. A convolution kernel generates a new pixel value based on the relationship between the value of the pixel of interest and the values of the pixels that surround it. In convolution, two functions are overlaid, multiplied element by element, and the products summed. One of the functions is the video frame image and the other is a convolution kernel. The frame image is represented by a large array of numbers that are pixel values along the x and y axes. The convolution kernel is a smaller array, or mask, whose values are assigned based on the desired filtering function, such as blurring, sharpening, or edge detection. The size of this array, called the kernel size, determines how many neighboring pixels will be used to generate a new pixel. The kernel creates one new pixel each time the mask is applied, and therefore the operation must be repeated for every pixel in the image (Figure 2).

Figure 2: Convolution kernel mask operation: The source pixel is replaced by a weighted average of itself and its neighboring pixels.
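
The mask operation can be sketched in a few lines of Python. This is a plain software illustration, not an FPGA implementation; the function name `convolve`, the `SHARPEN` kernel values, and the choice to repeat edge pixels at the image border are assumptions made for this example.

```python
def convolve(image, kernel):
    """Apply a 2D convolution kernel to every pixel of a grayscale image.

    image and kernel are lists of rows; the kernel has odd width and height.
    Pixels beyond the border are taken from the nearest edge pixel.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ry, rx = kh // 2, kw // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for j in range(kh):
                for i in range(kw):
                    sy = min(max(y + j - ry, 0), h - 1)
                    sx = min(max(x + i - rx, 0), w - 1)
                    # True convolution flips the kernel; for symmetric
                    # kernels such as SHARPEN the flip changes nothing.
                    acc += kernel[kh - 1 - j][kw - 1 - i] * image[sy][sx]
            out[y][x] = acc
    return out


# A common 3 x 3 sharpening mask (weights sum to 1).
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]

# A flat background with one slightly brighter pixel in the middle.
frame = [[10, 10, 10],
         [10, 20, 10],
         [10, 10, 10]]

sharpened = convolve(frame, SHARPEN)
for row in sharpened:
    print(row)
# [10, 0, 10]
# [0, 60, 0]
# [10, 0, 10]
```

The bright pixel rises from 20 to 60 while its direct neighbors drop to 0, which is the exaggerated local contrast a sharpening mask is meant to produce. A real pipeline would also clip results to the valid pixel range.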

Large kernel yields better results

Convolutions are computationally intensive and therefore most implementations use only small kernels (3 x 3, 9 x 9, 16 x 16). However, using unique, nontraditional programming techniques, it is possible to implement very large convolution kernels that produce dramatically better results. The reason a very large kernel produces better results has to do with the range and variations in brightness over a given area, which is referred to as spatial frequency.

By considering the data in a large neighborhood centered on each pixel as it is being processed, a large kernel includes a much greater range of spatial frequencies. Traditional small kernel processing can only enhance details in the very highest spatial frequencies, which typically contain little of the spectral content (full range of color) of the image and are where noise is prevalent. Hence, small kernel processors must employ high gain to have a noticeable effect on the image. High gain tends to produce sharp outlining artifacts and increases visible noise. Large kernel processing (operating on much more of the “meat” of the image) can produce dramatic results with much lower gain, with the additional benefit of large area shading, yielding much more natural-appearing images with increased local contrast, added dimensionality, and improved visibility of subtle details and features.
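
To make the cost and the low-gain idea concrete, the Python sketch below boosts each pixel by its difference from the mean of a large surrounding window. It is not the technique the article's authors use, which is not described here; it swaps in a uniform (box) kernel computed from a summed-area table, a well-known trick that needs four table lookups per pixel regardless of window size, whereas a direct 400 x 400 convolution needs 160,000 multiply-accumulates per pixel. All names and the `gain` parameter are assumptions for this example.

```python
def integral_image(img):
    """Summed-area table: sat[y][x] is the sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_mean(sat, x, y, radius, w, h):
    """Mean of the window centered on (x, y), truncated at the image edges."""
    x0, x1 = max(x - radius, 0), min(x + radius, w - 1)
    y0, y1 = max(y - radius, 0), min(y + radius, h - 1)
    total = (sat[y1 + 1][x1 + 1] - sat[y0][x1 + 1]
             - sat[y1 + 1][x0] + sat[y0][x0])
    return total / ((x1 - x0 + 1) * (y1 - y0 + 1))


def local_contrast(img, radius, gain):
    """Push each pixel away from its large-neighborhood mean by `gain`."""
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    return [[img[y][x] + gain * (img[y][x] - box_mean(sat, x, y, radius, w, h))
             for x in range(w)]
            for y in range(h)]


tiny = [[1, 2],
        [3, 4]]
print(local_contrast(tiny, radius=1, gain=1.0))
# [[-0.5, 1.5], [3.5, 5.5]]
```

With `radius=200` the window spans roughly 400 x 400 pixels, yet the per-pixel work stays the same as for a small window. A box kernel is a crude stand-in for a shaped kernel, so this shows the cost argument rather than the image quality a tuned large kernel can reach.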

One large kernel convolution algorithm, designed to clarify the image by removing haze and enhancing image detail, uses a 400 x 400 kernel. This clarifier algorithm works by solving a mathematical equation that relates a model of a “perfect image” to the measured imperfect image captured by the sensor camera. The technology works backwards, stripping corrupting noise and image blur while simultaneously adjusting the intensity of each pixel until the simplest image that fits the real-time data emerges. The concept is that environmental factors are known to distort the image, so if the way the distortion is created is understood, it can be undone. Other technologies use methods that strip out distortions and get close to the true image, but stop there. In contrast, this method goes a step further by continuing to apply the algorithm to the image until it is as close to the perfect image as possible. Thus, it is able to strip out all unnecessary data that is not part of the true image. Remarkable clarity is achieved once the environmental distortions are removed and more of the real image reveals itself (Figure 3).

Figure 3: Algorithms reveal unexpected detail: Picture-in-Picture shows the remarkable clarity achieved once the environmental distortions are removed.
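
The idea of working backwards from a distorted image by repeated refinement can be illustrated with a toy example. The Python sketch below is not the clarifier algorithm described in this article; it uses a Van Cittert-style iteration on a one-dimensional, noise-free signal with a known blur, and every name and parameter in it is an assumption chosen for illustration.

```python
def blur(signal, kernel):
    """Blur a 1D signal with a small symmetric kernel, repeating edge samples."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)
            acc += weight * signal[j]
        out.append(acc)
    return out


def restore(observed, kernel, iterations=200, step=1.0):
    """Refine an estimate until re-blurring it reproduces the observation."""
    estimate = list(observed)
    for _ in range(iterations):
        reblurred = blur(estimate, kernel)
        residual = [o - b for o, b in zip(observed, reblurred)]
        estimate = [e + step * r for e, r in zip(estimate, residual)]
    return estimate


KERNEL = [0.25, 0.5, 0.25]                 # assumed, known distortion model
truth = [0, 0, 0, 0, 10, 0, 0, 0, 0]       # the "perfect" signal
observed = blur(truth, KERNEL)             # what the sensor delivers
restored = restore(observed, KERNEL)

blurred_error = max(abs(o - t) for o, t in zip(observed, truth))
restored_error = max(abs(r - t) for r, t in zip(restored, truth))
print(blurred_error)                        # 5.0
print(restored_error < 0.1)                 # True
```

Each pass compares the re-blurred estimate with the observation and corrects the estimate by the difference, so the spike that was smeared across three samples is gradually pulled back together. With real, noisy video, an unconstrained iteration like this would amplify noise, which is why practical methods add constraints or stop early.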

Striving for a perfect image

FPGAs unlock the door to a vast array of sophisticated algorithms that can be used to enhance ISR video in real time. FPGAs are computational workhorses and well suited to military video display controller applications. They can withstand harsh environments and meet exacting military requirements for ruggedness, temperature tolerances, reliability, and a guaranteed long product lifespan. Because they are reprogrammable, FPGAs enable design flexibility so that a display controller can be readily adapted to changing video standards or special mission requirements. Furthermore, once deployed, FPGA-based display controllers can be upgraded in the field to add features and new image enhancement algorithms.

Jason Wade is the Vice President of Product Marketing and Sales at Z Microsystems. Previously, he was the company’s Director of Engineering. Jason regularly works with the USAF and UAS suppliers to help solve technical problems that enhance UAS performance. Jason earned Bachelor of Science degrees in Applied Mathematics and Physics from UCLA and an MBA in Technology Management from UC Davis. Jason can be contacted at jason.wade@zmicro.com.

Randall Millar is Vice President of Engineering at Z Microsystems and oversees all facets of product development from conceptual design through delivery to the customer. “Randy” has extensive technical experience in the design of video processing systems and has been granted a number of patents for his work on real-time enhancement algorithms for the medical, military, and consumer markets. He earned his degree in Electrical Engineering at UCSD. He can be contacted at randall.millar@zmicro.com.

Z Microsystems 858-831-7000 www.zmicro.com