Streaming real-time video with CPUs/GPUs

October 12, 2011

Dr. David G. Johnson

Cambridge Pixel

The distribution of sensor data to multiple console displays can be achieved using standard Ethernet networking and off-the-shelf 3D gaming CPU/GPU hardware to support software-based decompression and display composition. This reduces the dependency on proprietary hardware technologies for video processing and display systems.

Modern computing platforms have evolved substantially to meet the needs of 3D gaming markets. As a result, they now offer a high-performance computing solution that is ideally suited to the demands of streaming real-time video and radar data from sensors to displays.

Adopting industry-standard processing and graphics architectures reduces initial system costs, and future technology enhancements are simplified by a reduced dependence on specialist proprietary hardware. Flexible software running across the heterogeneous CPU and GPU hardware is also key. This software is readily moved between different vendors’ processing hardware and can accommodate upgrades in processing and display capabilities from the road maps of industry-standard processing and display technologies (see Sidebar 1). To understand this shifting paradigm, video decompression and display, along with software and middleware interfacing, are discussed.

Sidebar 1: Choices of compressed data rates versus resource allocation need to be considered to ensure timely data delivery onto display consoles.

Video decompression and display

The continued evolution of graphics processing units (GPUs), such as those used in 3D gaming markets, has enabled a modern display client to handle sensor decompression and display in software. Using standard client display hardware, configured through software to fulfill different operational needs, means more commonality, fewer variants, and reduced system costs. Full H.264 decoding of multiple channels – along with radar decompression, scan conversion, and multi-window display – can all be achieved with industry-standard hardware using standard CPU-plus-GPU technologies. The flexibility to use the GPU for multiple applications significantly simplifies system architectures because a common display architecture can be used for radar, video, and combined display positions.

With the compressed sensor data distributed using multicast protocols, the network loading is unaffected by the addition of extra display clients. Low-cost display positions, based on PCs or SBCs, can implement complex, multi-console, real-time displays of video and radar, and software can readily be reconfigured between different operational roles.
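As an illustration of why extra display clients add no network load, the following minimal sketch shows a client joining a multicast group with standard UDP sockets. The group address, port, and function names are hypothetical and are not taken from the SPx software; the sketch only shows the general mechanism, in which the server sends each stream once and every interested client subscribes to it.

```python
import socket
import struct

GROUP = "239.192.0.1"  # hypothetical multicast group carrying one sensor stream
PORT = 5000            # hypothetical UDP port


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq structure: group address, then local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_receiver(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Open a UDP socket and join the multicast group that carries the stream."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several display processes on one machine to listen on the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group asks the network to deliver the stream to this host;
    # the sender's traffic does not change when another client joins.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock

# Typical use on a display client (blocks until a datagram arrives):
#   receiver = open_receiver()
#   packet, sender = receiver.recvfrom(65535)
```

Each console runs the same join logic for whichever streams it wants to display, so the choice of sensors is made at the client rather than at the server.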

Modern graphics processors, such as those provided by NVIDIA and AMD (incorporating ATI), provide sophisticated processing and display capabilities, which have now evolved to blur the distinction between the CPU and the GPU. Additionally, software is now evolving to permit the programmer to write code that can be executed on either the CPU or the GPU, with the choice left to the operating environment. This allows intensive operations on data sets to exploit the multiple processors on the GPU, with the CPU handling the complex sequential code, input/output, and system administration.

Even though GPUs have the potential for huge throughput when calculations can be parallelized, many problems, even compute-intensive ones, are hard to express in a way that the GPU can exploit. A multicore CPU running at 3 GHz is no slouch, so it is often more efficient just to have the CPU process the data than to figure out how to employ the GPU and then transfer the data in and out for processing. Moving the data in and out of the GPU and synchronizing that transfer with processing on the CPU might negate any processing gain that the parallel processing can provide. In many cases, the overhead of transferring the data and synchronizing the handover of results back to the CPU is prohibitive, and modeling and quantifying this prove very difficult.
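The trade-off can be framed as a first-order estimate, sketched below. This is a deliberately simplified model with hypothetical numbers: it ignores pipelining, overlap of transfers with computation, and driver overheads, which is part of why accurate modeling is hard in practice.

```python
def gpu_offload_time(bytes_in, bytes_out, bus_bytes_per_s, gpu_compute_s, sync_s):
    """Estimate the total time to process one block of data on the GPU.

    Transfer in + GPU computation + transfer out + CPU/GPU synchronization.
    All parameters are illustrative assumptions, not measured values.
    """
    transfer_in = bytes_in / bus_bytes_per_s
    transfer_out = bytes_out / bus_bytes_per_s
    return transfer_in + gpu_compute_s + transfer_out + sync_s


def worth_offloading(cpu_compute_s, bytes_in, bytes_out,
                     bus_bytes_per_s, gpu_compute_s, sync_s):
    """Return True only if the GPU path beats simply computing on the CPU."""
    gpu_total = gpu_offload_time(bytes_in, bytes_out, bus_bytes_per_s,
                                 gpu_compute_s, sync_s)
    return gpu_total < cpu_compute_s


# Hypothetical case: 8 MB in and 8 MB out over a 4 GB/s bus,
# 1 ms of GPU computation and 0.5 ms of synchronization.
total = gpu_offload_time(8_000_000, 8_000_000, 4e9, 0.001, 0.0005)
print(f"GPU path: {total * 1000:.1f} ms")
```

With these assumed figures the GPU path costs about 5.5 ms, most of it spent moving data, so a CPU that finishes the same job in 4 ms wins even though the GPU computation itself takes only 1 ms.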

In the case of decompressing H.264 video, the GPU provides an ideal processing platform. The compressed input data, which has a relatively low data rate (for example, 20 Mbps for an HD video signal), is transferred from the CPU to the GPU. After decompression, the data can remain in GPU memory, ready for transfer to the display window. In this way, the otherwise expensive operations (in terms of both memory transfer and the need to synchronize back to the CPU) can be avoided. The CPU is responsible for scheduling the transfer of video data from off-screen memory into a display window, optionally combining the video data with overlays to add symbology into the video window. This process allows the client display application to create graphical layers that appear as overlays to the video (crosshairs, target information, geographical features, and so on) and have the final display be composed of multiple, independently updating layers, all in real time. Until recently, this sort of multilayer, real-time video system required highly specialized hardware products; however, 3D gaming technology now enables this CPU-plus-GPU decompression and display. The capability to implement this in industry-standard hardware is a significant development. The data paths are shown in Figure 1.
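A short calculation shows why it matters that the decoded frames stay in GPU memory. The 20 Mbps figure comes from the text; the frame size (1,920 x 1,080), frame rate (30 frames per second), and pixel format (24-bit RGB) are assumptions chosen for illustration.

```python
COMPRESSED_BPS = 20_000_000  # 20 Mbps HD stream, the example rate from the text


def decoded_video_bps(width, height, fps, bytes_per_pixel=3):
    """Bit rate of uncompressed video for the given frame geometry."""
    return width * height * fps * bytes_per_pixel * 8


# Assumed format: 1920 x 1080 pixels, 30 frames/s, 3 bytes (24-bit RGB) per pixel.
decoded = decoded_video_bps(1920, 1080, 30)
ratio = decoded / COMPRESSED_BPS

print(f"decoded rate: {decoded / 1e9:.2f} Gbps")
print(f"decoded/compressed ratio: {ratio:.0f}x")
```

Under these assumptions the decoded stream is roughly 1.5 Gbps, about 75 times the compressed rate. Only the small compressed stream crosses from CPU to GPU, and the large decoded stream never has to travel back.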

Figure 1: Camera video captured and compressed with server-side hardware can be decompressed and displayed with standard GPU-plus-CPU architectures, with optional annotation provided by the client display compositing.
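The optional annotation step in Figure 1 amounts to blending overlay layers on top of the decoded video. The sketch below shows the standard "over" blend on the CPU in plain Python purely to make the arithmetic visible; in the architecture described here the equivalent blending would run on the GPU. The function names and pixel layout are illustrative assumptions.

```python
def over(top, bottom):
    """Blend one RGBA overlay pixel over one opaque RGB video pixel."""
    r, g, b, a = top
    alpha = a / 255.0
    return tuple(round(alpha * t + (1.0 - alpha) * u)
                 for t, u in zip((r, g, b), bottom))


def compose(video, layers):
    """Apply overlay layers in order (first is lowest) to a video frame.

    video  : rows of (R, G, B) tuples
    layers : list of frames of the same size holding (R, G, B, A) tuples
    """
    out = [list(row) for row in video]  # copy so the video frame is untouched
    for layer in layers:
        for y, row in enumerate(layer):
            for x, pixel in enumerate(row):
                out[y][x] = over(pixel, out[y][x])
    return out


# A 2 x 2 gray video frame with one overlay layer: an opaque red symbol
# pixel, a half-transparent label pixel, and two fully transparent pixels.
CLEAR = (0, 0, 0, 0)
video = [[(100, 100, 100)] * 2 for _ in range(2)]
symbols = [[(255, 0, 0, 255), CLEAR],
           [(200, 200, 200, 128), CLEAR]]
frame = compose(video, [symbols])
```

Because each layer is a separate buffer, the video and each overlay can update at their own rates, and only the final blend has to run once per displayed frame.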

In a practical implementation of a combined video and radar distribution system, a server captures data with cameras and radar sensors. This data is compressed by the acquisition servers and distributed using multicast network packets to any number of consoles. Since the compressed data from every camera and radar is present on the network, each console can select any combination of the available data. Additional consoles do not affect the network bandwidth, which is a function only of the number of distributed sensors. A network switch is responsible for interfacing the clients to the servers. A client console can be dedicated to the display of radar or video, or it can show both on two heads of a single display position. In the example shown, a client display shows three windows of radar video on the primary head and two real-time video windows on the second head. For the camera display, the H.264 data is decompressed inside the GPU and then scaled to fit the output window. For the radar display, the compressed radar video is decompressed using the CPU and then scan converted and displayed with graphics in each of three PPI windows at up to 1,920 x 1,200 resolution. All of this client processing runs on a mid-range hardware configuration with less than a 10 percent CPU load. Additional clients on the network maintain their own independent display presentation of radar and video.
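Scan conversion, mentioned above for the PPI windows, maps radar returns held in polar form (azimuth, range) onto a Cartesian display. The following is a minimal nearest-neighbor sketch in plain Python; it is not the SPx algorithm, which the article does not describe, and the data layout and function name are assumptions made for illustration.

```python
import math


def scan_convert(polar, size):
    """Convert polar radar data to a square, north-up PPI image.

    polar : polar[azimuth_index][range_bin], azimuth 0 = north, clockwise
    size  : width and height of the output image in pixels
    """
    n_az = len(polar)
    n_rng = len(polar[0])
    centre = (size - 1) / 2.0
    bins_per_pixel = n_rng / (size / 2.0)
    image = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            dx = x - centre
            dy = centre - y  # screen y grows downward; north is up
            range_bin = int(math.hypot(dx, dy) * bins_per_pixel)
            if range_bin >= n_rng:
                continue  # pixel lies beyond the radar's maximum range
            azimuth = math.atan2(dx, dy) % (2.0 * math.pi)  # clockwise from north
            az_index = int(azimuth / (2.0 * math.pi) * n_az) % n_az
            image[y][x] = polar[az_index][range_bin]
    return image


# Four 90-degree sectors, each filled with a distinct value, on an 8 x 8 display.
polar = [[sector + 1] * 4 for sector in range(4)]
ppi = scan_convert(polar, 8)
```

Working backward from each display pixel to the nearest polar sample guarantees that every pixel inside the coverage circle is filled, which avoids the gaps that appear at long range when polar samples are pushed forward onto the display grid.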

The software designed for radar and video ensures that extremely cost-effective and interchangeable hardware can be used across a range of display positions for security, command and control, and fire control applications. The emphasis on software and the elimination of proprietary hardware ensure that future upgrades of the equipment can employ mainstream computing and graphics components. Evolution of these components will enhance the performance, resolution, and data rates that can be handled with the same software architecture.

Interfacing through software/middleware

With an industry-standard hardware-processing platform to provide the CPU and GPU resources, the software that implements the scenario is a combination of application and middleware. The middleware provides the components that connect the application layer to the drivers of the graphics and capture hardware, handling network distribution, quality of service, buffering, priorities, and display compositing. Cambridge Pixel has developed a set of server and API modules in its SPx integrated radar processing and display software family that provides the programming API for sensor-to-display capture, compression, distribution, processing, and display of radar and video sensor data. The middleware permits capture and compression from a wide range of sensor types, with hardware cards from third-party manufacturers such as Tech Source’s Condor VC 100x XMC card, standard network cameras, and RGB devices using frame grabbers provided by Matrox. This ability to interface to a wide range of third-party sensors and hardware provides significant flexibility and cost benefits. Distribution of video can be handled with the standard SPx AV Server application, or a custom server can be built using the integrated radar processing and display software library. On the client side, the software provides the interface between the application and the hardware, permitting the GPU to be exploited for video decompression and display processing. Where radar is displayed, the integrated radar processing and display software handles the radar scan conversion and display mixing to support high-resolution (up to 1,920 x 1,200) console displays.

A future-proof system architecture

3D gaming GPUs provide a general-purpose processor that is closely coupled to the display processing, so that once video has been decompressed, it can be transferred to the display window within the confines of the GPU. A modest CPU and GPU combination can handle simultaneous multiradar and video display, along with application graphics, to provide a versatile multiscreen, multiwindow, and multilayer display capability. The replacement of proprietary hardware by high-performance, low-cost commercial processing and graphics devices, coupled with software that can exploit the capabilities of these devices, promises significant savings during initial deployment and lifetime maintenance. The market will develop further as graphics move towards general-purpose processing (NVIDIA’s road map) and processors integrate graphics (Intel’s and AMD’s road maps).

Dr. David G. Johnson is Technical Director at Cambridge Pixel. He holds a BSc Electronic Engineering degree and a PhD in Sensor Technology from the University of Hull in the UK. He has worked extensively in image processing, radar display systems, and graphics applications at GEC, Primagraphics, and Curtiss-Wright Controls Embedded Computing. He can be reached at dave@cambridgepixel.com.

Cambridge Pixel +44 (0) 1763 852749 www.cambridgepixel.com