Facilitating deep learning techniques with HPEC

Blog

October 31, 2016

Dr. Mohamed Bergach

Kontron

The long-sought era of machine learning is finally at hand. The potential benefit to the warfighter of deep learning techniques is both enormous and profound. With defense systems trending toward greater application autonomy, deep learning techniques that were too complex to implement with more traditional processing technologies can now drive significant advances in on-platform processing of streaming signal or image data. These techniques are proving useful for pattern recognition tasks such as natural language processing and image feature detection, producing highly reliable autonomous decisions based on huge data sets.

Readily available technologies are accelerating deep learning’s application in defense systems: the number-crunching capabilities of the latest very large FPGAs, power-efficient graphics processing units (GPUs), and advanced SIMD [single instruction, multiple data] processing units tied to flexible multicore processors. These technologies surpass the processing limitations that once made deep learning architectures impractical for any sort of real-time application, so new high-performance embedded computing (HPEC) platforms can now meet the demands of deep learning algorithms even in size, weight, and power (SWaP)-constrained systems. Defining how deep learning algorithms can be applied to solve an application’s particular problem remains an ongoing challenge. Technology suppliers must therefore be able to tailor and refine HPEC-based platforms so that they can be easily adapted to the needs of deep learning applications.

Understanding the basics of how deep learning works helps illustrate its positive force for the warfighter. Applications can “learn” by taking any signal (an observation) gathered by a variety of sensors (image, sound, GPS position, radar, etc.) and representing it in an abstract way, as features such as shapes, corners, patterns, and more. These abstractions are built by deep neural nets (DNNs), which stack up to dozens of processing layers. Each layer processes data based on a particular type of feature and provides the result to the next layer. Results can be impressive and sometimes better than human-handcrafted solutions, optimizing applications such as face recognition, image registration, natural language processing, and fraud detection.
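The layer-by-layer flow described above can be sketched in a few lines of Python. This is a minimal illustration and not a production network: the helper names (`relu`, `dense`, `forward`), the two-layer shape, and the weight values are all assumptions chosen for readability, and a real DNN would use far more layers and learned weights.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]  # (weights, bias); a hypothetical layer format


def relu(values: Vector) -> Vector:
    """Keep positive responses and zero out the rest."""
    return [max(0.0, v) for v in values]


def dense(weights: Matrix, bias: Vector, inputs: Vector) -> Vector:
    """One fully connected layer: a weighted sum of the inputs per output."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


def forward(layers: List[Layer], observation: Vector) -> Vector:
    """Pass an observation through each layer in turn.

    Every layer consumes the previous layer's output, so later layers
    work on progressively more abstract features.
    """
    activation = observation
    for weights, bias in layers:
        activation = relu(dense(weights, bias, activation))
    return activation


# Illustrative (made-up) weights: two inputs -> two features -> one output.
layers: List[Layer] = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-0.5]),
]

print(forward(layers, [2.0, 1.0]))
```

Each call to `dense` is a set of multiply-accumulate operations, which is why hardware that executes many of them in parallel matters so much for this workload.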

Because the network must be “trained,” significant computation must be applied: information is weighted and optimized numerous times in order to reduce the potential for errors. As a result, the learning phase is typically performed in datacenters operating non-stop. Each training result is a snapshot. In a military-aerospace setting, these snapshots would then be deployed on the actual embedded HPEC system for testing. The process repeats continuously, with the expectation that each snapshot will respond better than the previous one.
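As a rough sketch of that train-and-snapshot cycle, the Python below fits a single weight by gradient descent and records one snapshot per pass over the data. Everything here is an assumption made for illustration: the `train` function, the one-weight model, the learning rate, and the sample data are hypothetical, and a real training run would involve millions of weights and a far larger data set.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (input, expected output)


def train(samples: List[Sample], epochs: int = 5,
          learning_rate: float = 0.1) -> List[Dict[str, float]]:
    """Repeatedly adjust one weight to reduce error, saving a snapshot per epoch."""
    weight = 0.0
    snapshots: List[Dict[str, float]] = []
    for epoch in range(epochs):
        for x, target in samples:
            error = weight * x - target          # how wrong the current weight is
            weight -= learning_rate * error * x  # nudge the weight to shrink the error
        # Mean squared error over all samples for this snapshot.
        loss = sum((weight * x - t) ** 2 for x, t in samples) / len(samples)
        snapshots.append({"epoch": float(epoch), "weight": weight, "loss": loss})
    return snapshots


# Made-up data following output = 2 * input.
samples: List[Sample] = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
snapshots = train(samples)

for snap in snapshots:
    print(snap)
```

In this toy case each snapshot has a lower loss than the one before it. Real training offers no such guarantee, which is why each snapshot is tested on the target system before it replaces its predecessor.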

It is possible to build modular HPEC systems optimized for deep learning applications with readily available processing-intensive platforms based on the Intel Xeon Processor D-1540 (Broadwell DE). These systems can fully utilize their eight cores, each of which has two AVX2 units that each process eight single-precision floating-point FMA (fused multiply-add) operations per clock cycle. In other words, the eight cores together can perform 128 FMA operations with each clock cycle (8 cores x 2 units x 8 operations). Intel Xeon Phi co-processors increase this further, offering 72 cores, each with two AVX-512 units that process 16 FMA operations per clock cycle, for a total of 2,304 FMA operations per clock cycle. Another plus is that Intel architecture ensures binary compatibility with each subsequent generation of 64-bit processors, helping to protect software investments from future incompatibilities.
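The per-clock figures above follow from a simple product of cores, vector units per core, and operations per unit. The short Python sketch below reproduces that arithmetic; the function name `peak_fma_per_cycle` is hypothetical, and the results are theoretical peaks that real workloads rarely sustain.

```python
def peak_fma_per_cycle(cores: int, units_per_core: int, fma_per_unit: int) -> int:
    """Theoretical peak number of FMA operations issued per clock cycle."""
    return cores * units_per_core * fma_per_unit


# Figures quoted in the text above.
xeon_d_1540 = peak_fma_per_cycle(cores=8, units_per_core=2, fma_per_unit=8)
xeon_phi = peak_fma_per_cycle(cores=72, units_per_core=2, fma_per_unit=16)

print("Xeon D-1540 FMA per cycle:", xeon_d_1540)
print("Xeon Phi FMA per cycle:", xeon_phi)

# An FMA combines a multiply and an add, so counting those separately
# doubles the number of floating-point operations per cycle.
print("Xeon D-1540 FLOPs per cycle:", 2 * xeon_d_1540)
print("Xeon Phi FLOPs per cycle:", 2 * xeon_phi)
```

Multiplying either figure by the clock frequency gives a peak operations-per-second estimate, which can help when sizing a platform against a given network.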

In addition, OpenCL is coming on strong, quickly becoming the go-to standard for heterogeneous computing. Its rich and expressive API manages data flow and computational objects, and helps ensure portability of source code over different platforms like GPUs, CPUs, and FPGAs. VPX-based boards and platforms add value as well, helping accommodate the widest range of applications by delivering high-speed/low-latency communication via the backplane with PCIe Gen3 or 10 Gigabit Ethernet links.

Fueled by today’s powerful and feature-rich HPEC platforms, deep learning applications can easily sift through huge data streams from the military’s large signal and image processing systems. Consider the impact of this technology for applications that must continually search for either signals or targets of interest. Deep learning can be the answer for proactively hunting threats and autonomously deploying active protection systems. Enabled by HPEC platforms and driven by the need for smart autonomy in defense systems, deep learning techniques are likely to play an important new role in the military’s future operations strategy.