Augmented reality (AR) is no longer the preserve of fighter pilots with heads-up displays and soldiers with wearable AR goggles; it is now being used in military surveillance systems as well. Superimposing context-dependent graphical information on camera video can aid the interpretation of a complex surveillance situation, enabling a faster response to threats, clearer decision-making, and improved situational awareness.

The technological challenge in using AR for surveillance lies in fusing information effectively, so that observations from one sensor benefit from information provided by another. The goal is to deliver an enhanced perception of reality while reducing the cognitive load on the operator.

Information from sensors such as radar can be used to build a model of target classification and behavior. That information can then be overlaid on the live video images, positioned and filtered to aid the interpretation of the camera data. Military security solutions commonly integrate daylight and thermal cameras with specialist sensors to meet mission needs. For example, a security application overlooking a coastal area may include a marine radar to detect incoming surface or air targets beyond the range of the cameras. In this situation, initial detection by the radar may drive a long-range camera to the appropriate position to observe the target.

Additionally, radio transmissions such as AIS (Automatic Identification System) and ADS-B (Automatic Dependent Surveillance-Broadcast) provide useful information concerning the identity and route of cooperating commercial ships and aircraft.

The simplest combination of sensors occurs when one sensor cues another. For example, a radar may initially detect the target, and the control process then drives the camera to the target’s location. The desired position (pan and tilt) of the camera is calculated using simple geometry from the observed range and azimuth of the target and the camera’s location relative to the radar. After the camera is positioned in this way, the radar need contribute nothing more. There are several options for the subsequent adjustment of the camera’s position:

1. No adjustment: The camera simply points at the position of the target, as reported from the radar, and that is sufficient.

2. User-controlled: Any subsequent movement of the camera, for example to follow the target, is handled by actions of the user from an on-screen or physical joystick.

3. Radar-directed adjustment: The camera’s position may be subsequently adjusted using updated information from the radar processing subsystem. This technique is commonly called slew-to-cue.

4. Video tracking: The position of the camera may be adjusted using a video tracker, which calculates the position of the target in the camera’s field of view and computes an error vector that moves the camera to recenter the target.
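The cueing geometry described above can be sketched in a few lines. This is a minimal illustration rather than any particular product's method: it assumes a flat local east/north/up frame in meters, azimuth and pan measured clockwise from north, radar range treated as ground range, and a surface target at zero altitude unless stated. The function name `cue_camera` and its parameters are hypothetical.

```python
import math

def cue_camera(range_m, azimuth_deg, radar_pos, camera_pos, target_alt_m=0.0):
    """Return (pan_deg, tilt_deg) that points the camera at a radar detection.

    Positions are (east, north, up) tuples in meters in a shared local frame.
    Azimuth and pan are in degrees clockwise from north (assumed convention).
    """
    az = math.radians(azimuth_deg)
    # Target position in the local frame, treating range as ground range.
    t_east = radar_pos[0] + range_m * math.sin(az)
    t_north = radar_pos[1] + range_m * math.cos(az)
    # Vector from the camera to the target.
    d_east = t_east - camera_pos[0]
    d_north = t_north - camera_pos[1]
    d_up = target_alt_m - camera_pos[2]
    ground = math.hypot(d_east, d_north)
    pan = math.degrees(math.atan2(d_east, d_north)) % 360.0
    tilt = math.degrees(math.atan2(d_up, ground))
    return pan, tilt
```

Calling the same function on every radar update, rather than only on first detection, would give a simple form of the radar-directed adjustment in option 3.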

Method 3 (radar-directed adjustment) continues to use information from the radar after the initial cue. Each radar update is used to recenter the camera on the target position reported by the radar.
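Method 4 (video tracking), by contrast, closes the loop on the image itself. The sketch below shows one way the error vector might be turned into a pan and tilt correction. It assumes a pinhole camera model, a known horizontal and vertical field of view, and a simple proportional gain; the name `tracking_correction` and the default gain value are illustrative assumptions, not taken from any specific tracker.

```python
import math

def tracking_correction(target_px, image_size, hfov_deg, vfov_deg, gain=0.5):
    """Return (d_pan_deg, d_tilt_deg) that moves the target toward the image center."""
    w, h = image_size
    # Pixel error from the image center: +x is right, +y is down.
    ex = target_px[0] - w / 2.0
    ey = target_px[1] - h / 2.0
    # Focal lengths in pixels under the pinhole-camera assumption.
    fx = (w / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = (h / 2.0) / math.tan(math.radians(vfov_deg) / 2.0)
    # Angular error; image y grows downward, so tilt takes the opposite sign.
    err_pan = math.degrees(math.atan2(ex, fx))
    err_tilt = -math.degrees(math.atan2(ey, fy))
    # Proportional step: apply only a fraction of the error per update.
    return gain * err_pan, gain * err_tilt
```

A gain below 1 applies only part of the measured error on each frame, which trades response speed for smoother camera motion.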

For all the above camera-adjusting methods, it is still possible to enhance the presentation of the camera video by incorporating radar-derived information as a graphical overlay. Combining camera video with such overlays is called augmented vision.

Augmented vision improves decision-making

A video picture from a camera may be overlaid with static text to report useful information from the camera or environment. This might include the details about the camera’s state, the time and date, and status of camera controls. A whole new dimension of possibilities is opened by also overlaying contextual information relating to targets of interest in the scene, where the additional information may have come from unrelated sensors such as radar or from target-derived transmissions such as AIS or ADS-B. (Figure 1.)

Figure 1: Augmented video: Radar data is processed to create tracks, which are fused with AIS reports and then overlaid on the camera video. Information from the fused tracks aids the interpretation of the video image.

By presenting the graphical overlays at a screen location that aligns with the observed targets in the video, the operator is offered an enhanced interpretation – augmented reality – thereby improving decision-making without increasing the cognitive load. The real-time updates of the target derived from the radar sensor, for example, are used to update the real-time presentation of the overlay. As the camera moves, the screen location of the graphic is adjusted to ensure that the augmented graphics are appropriately positioned to align with the target in the camera’s view.

As an example, consider a target that is being observed by radar, where processing of the radar data by a target tracker also allows the target’s motion to be characterized. The target can then be represented by size, position, speed, and direction. It may also be possible to make an initial classification of the target type based on these parameters. The size (radar cross-section), behavior, and speed of a target, for example, can be used to suggest a classification, such as swimmer, buoy, unmanned surface vessel (USV), RIB, small boat, larger boat, helicopter, light aircraft, etc.
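A rule-based first guess of this kind might look like the sketch below. Every threshold here is a placeholder chosen only for illustration, not a calibrated or published value, and the `Track` fields and `suggest_class` function are hypothetical names. Only a few of the surface classes listed above are covered.

```python
from dataclasses import dataclass

@dataclass
class Track:
    rcs_m2: float    # radar cross-section estimate, square meters
    speed_kn: float  # speed over ground, knots

def suggest_class(track):
    """Return a tentative class label; all thresholds are illustrative placeholders."""
    if track.speed_kn < 0.5:
        return "buoy"            # effectively stationary
    if track.rcs_m2 < 0.5 and track.speed_kn < 3.0:
        return "swimmer"         # very small and slow
    if track.rcs_m2 < 10.0:
        # Small return: separate fast RIBs from slower small boats.
        return "RIB" if track.speed_kn > 20.0 else "small boat"
    return "larger boat"
```

In practice such a label would be treated as a suggestion to the operator, to be confirmed against the camera video.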

The target information can then be used to present the AR overlay for the camera display. Knowing the orientation of the camera, the field of view represented by the window, and the absolute angle of the track, the appropriate position for the target symbol can be calculated. With a graphical symbol drawn at the correct location, the video from the camera can be overlaid with the related state information derived from the radar. Significantly, the position of this information must be recalculated in real time to ensure that the current position of the camera is used. The camera may be moved by an operator, by an automatic slew-to-cue process, or by a closed-loop stabilization process if the camera is mounted on a moving platform.
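The screen-position calculation can be sketched as follows. This is a simplified illustration that assumes a pinhole camera, treats the pan and tilt offsets independently (reasonable for a narrow field of view and small tilt angles), and measures bearing and pan in degrees clockwise from north. The names `wrap_deg` and `overlay_position` are hypothetical.

```python
import math

def wrap_deg(angle):
    """Wrap an angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def overlay_position(bearing_deg, elevation_deg, pan_deg, tilt_deg,
                     hfov_deg, vfov_deg, width_px, height_px):
    """Return the (x, y) pixel position for the target symbol, or None if off-screen."""
    # Angular offset of the target from the camera boresight.
    d_az = wrap_deg(bearing_deg - pan_deg)
    d_el = elevation_deg - tilt_deg
    if abs(d_az) > hfov_deg / 2.0 or abs(d_el) > vfov_deg / 2.0:
        return None  # target lies outside the current field of view
    # Focal lengths in pixels under the pinhole-camera assumption.
    fx = (width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = (height_px / 2.0) / math.tan(math.radians(vfov_deg) / 2.0)
    x = width_px / 2.0 + fx * math.tan(math.radians(d_az))
    y = height_px / 2.0 - fy * math.tan(math.radians(d_el))  # screen y grows downward
    return x, y
```

Because the result depends on the current pan, tilt, and field of view, the function would need to be re-evaluated on every video frame or camera status update so that the symbol stays aligned with the target.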

The principle of the processing is as follows. There are two data processing streams: the camera video is displayed, and it is then overlaid with graphics derived from the radar processing. The target information is drawn at a window location computed from the target’s radar-measured angle and range and the current angle of the camera, and is continuously adjusted as the camera angle changes (by whatever means). As a result, the user observes (Figure 2) the real-time video imagery overlaid by relevant contextual information relating to the target.

Figure 2: The video from the camera is overlaid with relevant target-specific data derived from the radar and other sensors.

Reducing the cognitive load with AR

Augmented vision is implemented within Cambridge Pixel’s RadarWatch display software to aid the interpretation of camera video by showing relevant target data as an overlay. The camera may be moved by an operator, by a video tracker, or by slew-to-cue adjustment from the radar updates. The radar-derived information is constantly updated to reflect the most recent fused information from the radar and radio transmissions. This permits speed and course to be displayed, as well as relevant data that comes from the associated AIS record, such as ship ID, destination port, cargo, etc.
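The article does not describe how the fusion itself is performed. As a generic illustration only, and not a description of RadarWatch internals, the sketch below associates each radar track with the nearest AIS report inside a distance gate, so that AIS fields such as ship ID can be attached to the track. The function name, the dictionary keys, and the 200 m gate are all assumptions made for the example.

```python
import math

def fuse_tracks_with_ais(radar_tracks, ais_reports, gate_m=200.0):
    """Attach the nearest unused AIS report within gate_m to each radar track.

    radar_tracks: list of dicts with "id", "east", "north" (meters, local frame).
    ais_reports:  list of dicts with "mmsi", "name", "east", "north".
    Returns a list of fused dicts; "ais" is None for unmatched tracks.
    """
    # Collect every track/report pair that falls inside the gate.
    candidates = []
    for ti, trk in enumerate(radar_tracks):
        for ai, rep in enumerate(ais_reports):
            dist = math.hypot(trk["east"] - rep["east"], trk["north"] - rep["north"])
            if dist <= gate_m:
                candidates.append((dist, ti, ai))
    # Greedy assignment: closest pairs first, each track and report used once.
    candidates.sort()
    matched, used_reports = {}, set()
    for dist, ti, ai in candidates:
        if ti not in matched and ai not in used_reports:
            matched[ti] = ai
            used_reports.add(ai)
    fused = []
    for ti, trk in enumerate(radar_tracks):
        ai = matched.get(ti)
        fused.append({**trk, "ais": ais_reports[ai] if ai is not None else None})
    return fused
```

A radar track left without an AIS match is itself informative, since it indicates a target that is not broadcasting its identity.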

RadarWatch provides an integrated display of maps, radar, and camera video, with primary radar tracks fused with AIS and then displayed as an overlay to camera video. The software supports the specification of complex alarm criteria based on the position of targets relative to any combination of areas of interest, the coastline, designated locations, or other targets. When an alarm is triggered, the actions may include camera cueing, initiation of recording, and audible and visual signaling. The display and overlays deliver an enhanced perception of reality and offer a reduced cognitive load for the user. (Figure 3.)

Figure 3: RadarWatch integrates radar and camera video, with augmented reality adding radar-derived information to aid the interpretation of the camera video.
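One of the simplest alarm criteria, a target entering an area of interest, can be illustrated with a standard point-in-polygon test. The sketch below is a generic example and does not reflect how RadarWatch specifies or evaluates alarms. It assumes target positions and area vertices in a flat local east/north frame, and the function names and dictionary keys are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside the polygon vertex list."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right toggles the state
    return inside

def triggered_alarms(tracks, areas):
    """Return (track_id, area_name) pairs for every track inside an area of interest."""
    return [(t["id"], name)
            for t in tracks
            for name, poly in areas.items()
            if point_in_polygon((t["east"], t["north"]), poly)]
```

Each returned pair could then drive actions of the kind listed above, such as cueing the camera to the track or starting a recording.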

Augmented vision clearly offers improvements in the interpretation of complex sensor data in military security applications, which enables more efficient classification of threats and faster detection of targets needing assistance. The key element with such systems is to process and display this additional sensor data intelligently to help the operator to make faster, clearer, and better-informed decisions.

David G. Johnson is Cambridge Pixel’s technical director and has more than 25 years of experience working in radar processing and display systems. He holds a B.Sc. and Ph.D. in Electronic Engineering from the University of Hull in the U.K. David can be reached at dave@cambridgepixel.com.

Cambridge Pixel • www.cambridgepixel.com