Thermal cameras, such as forward-looking infrared (FLIR) sensors, are actively deployed on aerial and ground vehicles, in watchtowers, and at checkpoints for surveillance purposes. More recently, thermal cameras have also become available as body-worn cameras. The ability to perform automatic face recognition at night using such thermal cameras is beneficial for informing a soldier that an individual is someone of interest, such as a person who may be on a watch list.

A conceptual illustration of thermal-to-visible synthesis for interoperability with existing visible-based facial recognition systems. (Eric Proctor, William Parks, and Benjamin S. Riggan)

An artificial intelligence and machine learning technique was developed that produces a visible face image from a thermal image of a person’s face captured in low-light or nighttime conditions. This development could lead to enhanced real-time biometrics and post-mission forensic analysis for covert nighttime operations.

This technology enables matching between thermal face images and existing biometric face databases and watch lists that contain only visible face imagery. Through thermal-to-visible face synthesis, it also gives humans a way to visually compare visible and thermal facial imagery.

Under nighttime and low-light conditions, there is insufficient light for a conventional camera to capture facial imagery for recognition without active illumination, such as a flash or spotlight, which would give away the position of the surveillance camera. Thermal cameras, which capture the heat signature naturally emanating from living skin tissue, are ideal for such conditions. When using thermal cameras to capture facial imagery, the main challenge is that the captured thermal image must be matched against a watch list or gallery that contains only conventional visible imagery of known persons of interest. The problem therefore becomes what is referred to as cross-spectrum, or heterogeneous, face recognition: facial probe imagery acquired in one modality is matched against a gallery database acquired using a different imaging modality.
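To make the matching step concrete, the sketch below scores a probe feature vector against a visible-only gallery using cosine similarity. This is a minimal illustration of heterogeneous matching in general, not the authors' pipeline; the function names, feature representation, and decision threshold are all assumptions.

```python
# Minimal sketch of cross-spectrum (heterogeneous) matching: a probe feature
# from one modality is scored against a gallery enrolled from another.
# The functions and threshold below are illustrative assumptions, not the
# authors' actual pipeline.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_probe(probe_feat: np.ndarray,
                gallery: dict[str, np.ndarray],
                threshold: float = 0.5) -> tuple[str | None, float]:
    """Return the best-matching gallery identity, or None below threshold."""
    best_id, best_score = None, -1.0
    for identity, gallery_feat in gallery.items():
        score = cosine_similarity(probe_feat, gallery_feat)
        if score > best_score:
            best_id, best_score = identity, score
    return (best_id if best_score >= threshold else None), best_score
```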

This approach leverages advanced domain adaptation techniques based on deep neural networks. The fundamental approach is composed of two key parts: a non-linear regression model that maps a given thermal image into a corresponding visible latent representation, and an optimization problem that maps that latent representation back into the image space.
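The two-part structure can be sketched in code. Below is a minimal, hypothetical PyTorch version: a small convolutional network stands in for the non-linear regression from a thermal image to a visible latent representation, and an iterative optimization recovers an image whose visible-domain features match that latent target. The layer sizes, image resolution, and the `visible_encoder` argument (a visible-domain feature extractor) are assumptions, not the published architecture.

```python
# A hypothetical sketch of the two-part approach: a regression network maps
# a thermal image to a visible latent representation, then optimization
# recovers an image whose visible encoding matches that latent target.
# All shapes and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class ThermalToVisibleRegressor(nn.Module):
    """Non-linear regression: thermal image -> visible latent representation."""
    def __init__(self, latent_dim: int = 512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, thermal: torch.Tensor) -> torch.Tensor:
        return self.encoder(thermal)

def synthesize(regressor: nn.Module, visible_encoder, thermal: torch.Tensor,
               steps: int = 200, lr: float = 0.05) -> torch.Tensor:
    """Optimize a visible image so its encoding matches the regressed latent."""
    target = regressor(thermal).detach()                     # latent target
    image = torch.rand(1, 3, 112, 112, requires_grad=True)   # initial guess
    opt = torch.optim.Adam([image], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(visible_encoder(image), target)
        loss.backward()
        opt.step()
    return image.detach().clamp(0.0, 1.0)
```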

Combining global information (such as features from across the entire face) and local information (such as features from discriminative fiducial regions, for example the eyes, nose, and mouth) enhanced the discriminability of the synthesized imagery. The thermal-to-visible mapped representations from both the global and local regions of the thermal face signature could be used in conjunction to synthesize a refined visible face image, as sketched below.
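As an illustration of combining global and local information, the hypothetical sketch below extracts one feature from the whole face and additional features from fiducial crops, then concatenates them. The crop coordinates and the `extract` callable are placeholders; the article does not specify how the regions are located or fused.

```python
# Illustrative sketch of fusing global and local (fiducial-region) features.
# The crop coordinates and the `extract` callable are placeholder assumptions;
# `extract` is assumed to return a fixed-length feature of shape (1, D) for
# any input size (e.g., via global pooling).
import torch

def fuse_features(extract, face: torch.Tensor) -> torch.Tensor:
    """Concatenate a whole-face feature with features from fiducial crops.

    `face` is an aligned face tensor of shape (1, C, 112, 112); the crop
    boxes below are placeholder coordinates for the eyes, nose, and mouth.
    """
    regions = {
        "eyes":  face[:, :, 30:55, 20:92],
        "nose":  face[:, :, 45:75, 40:72],
        "mouth": face[:, :, 70:95, 35:77],
    }
    global_feat = extract(face)                          # whole-face feature
    local_feats = [extract(c) for c in regions.values()]
    return torch.cat([global_feat] + local_feats, dim=1)
```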

The optimization problem for synthesizing an image attempts to jointly preserve the shape of the entire face and the appearance of the local fiducial details. Using the synthesized thermal-to-visible imagery and existing visible gallery imagery, face verification experiments were performed with a common, open-source deep neural network architecture for face recognition, one explicitly designed for visible-based face recognition. The approach achieved better verification performance than a generative adversarial network (GAN)-based approach that had previously demonstrated photorealistic results. By preserving identity information, the approach enhances discriminability, yielding increased recognition accuracy for both automatic face recognition algorithms and human adjudication.
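The joint preservation of global shape and local fiducial appearance can be expressed as a weighted objective; the sketch below is hedged and illustrative only, with the encoders and weights as assumptions rather than the published formulation.

```python
# A hedged sketch of a joint synthesis objective: one term preserves the
# global shape of the face, additional terms preserve local fiducial
# appearance. Encoders, weights, and shapes are illustrative assumptions.
import torch
import torch.nn.functional as F

def joint_loss(image: torch.Tensor,
               global_target: torch.Tensor,
               local_targets: list[torch.Tensor],
               encode_global,   # image -> global feature (assumed)
               encode_locals,   # image -> list of regional features (assumed)
               alpha: float = 1.0,
               beta: float = 1.0) -> torch.Tensor:
    """Weighted sum of a global-shape term and local fiducial terms."""
    loss = alpha * F.mse_loss(encode_global(image), global_target)
    for region_feat, target in zip(encode_locals(image), local_targets):
        loss = loss + beta * F.mse_loss(region_feat, target)
    return loss
```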

For more information, contact the Public Affairs Office at 301-394-3590.