Limited by the low dynamic range (LDR) of general camera imaging sensors, captured images often suffer from severe noise or missing details in high dynamic range (HDR) scenes. Learning-based multi-exposure fusion (MEF) methods have improved significantly in recent years, but they mainly focus on static scenes and can produce ghosting artifacts in the more common dynamic scenarios.

An article in IEEE Transactions on Image Processing fills this gap by creating an MEF dataset of dynamic scenes that pairs multi-exposure image sequences containing motion with high-quality reference images, constructed through a ‘static-for-dynamic’ strategy. Correspondingly, the authors offer a deep dynamic MEF (DDMEF) framework that reconstructs a ghost-free, high-quality image from only two differently exposed images of a dynamic scene.

Static-for-Dynamic Strategy

Over recent years, a substantial body of research has addressed the MEF task, as outlined at the start of the article. The work in this article, however, focuses on multi-exposure image fusion of dynamic scenes. Unlike static-scene MEF, the main challenge in dynamic MEF is handling the motion between the input images, which is far more common in real applications.

The article presents two primary objectives: enabling the fusion of just two extreme-exposure images to reach the quality of fusing a longer multi-exposure sequence, and effectively handling the motion between the input images so that the result is free of ghosting artifacts. To enable training of the deep neural network, the researchers also built a dynamic MEF dataset with corresponding reference images.
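
To make the task interface concrete, here is a minimal encoder-fuse-decoder sketch in PyTorch that maps an under/over-exposed pair to a single fused image. This is purely illustrative: the layer widths, depths, and structure are assumptions, not the DDMEF architecture described in the paper.

```python
import torch
import torch.nn as nn

class TwoExposureFusionNet(nn.Module):
    """Toy encoder-fuse-decoder for an under/over-exposed pair.

    Purely illustrative: the real DDMEF architecture differs;
    channel widths and depths here are assumptions.
    """

    def __init__(self, ch: int = 32):
        super().__init__()
        # Shared encoder applied to each exposure independently.
        self.enc = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Merge the two feature maps, then decode to an RGB image.
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.dec = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, under: torch.Tensor, over: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([self.enc(under), self.enc(over)], dim=1)
        return torch.sigmoid(self.dec(self.fuse(feats)))


# Smoke test on random tensors shaped (batch, channels, H, W).
net = TwoExposureFusionNet()
fused = net(torch.rand(1, 3, 128, 128), torch.rand(1, 3, 128, 128))
print(fused.shape)  # torch.Size([1, 3, 128, 128])
```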

Deep learning-based approaches usually require large amounts of data to optimize their model parameters. Since no real MEF dataset of dynamic scenes with ground truth existed, the researchers captured a multi-exposure image dataset to facilitate dynamic MEF research and generated high-quality reference images for end-to-end training and reference-based evaluation.
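
With such reference images available, end-to-end training reduces to regressing the network output toward the label. The article does not spell out the training objective; the plain L1 reconstruction loss below is only an illustrative stand-in.

```python
import torch
import torch.nn.functional as F

def training_step(model, under, over, reference, optimizer):
    """One end-to-end step against the fused reference image.

    The loss is an assumption: a pixel-wise L1 term used here
    purely for illustration, not the paper's actual objective.
    """
    optimizer.zero_grad()
    pred = model(under, over)           # e.g. the sketch network above
    loss = F.l1_loss(pred, reference)   # match the output to the label
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage: opt = torch.optim.Adam(net.parameters(), lr=1e-4)
```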

Goals and Experiments

According to the authors, the goal is to capture multi-exposure images with motion; the main problem is obtaining their corresponding high-quality, ghost-free ground-truth images. To address this, the researchers propose a ‘static-for-dynamic’ approach to constructing an MEF dataset of dynamic scenes: the fusion result of a static image sequence is taken as the ground truth for training a model to handle the motion in dynamic image sequences.
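
A minimal sketch of how such a pairing could be assembled, assuming a hypothetical folder layout (a motion-free ‘static’ sequence per scene for the label, plus a ‘dynamic’ under/over-exposed input pair) and using OpenCV’s classical Mertens exposure fusion as a stand-in for the static MEF method the authors ultimately selected:

```python
import cv2
import numpy as np
from pathlib import Path

def build_training_pair(scene_dir: Path):
    """Assemble one 'static-for-dynamic' training pair.

    Assumed (hypothetical) layout per scene: 'static/' holds a
    motion-free multi-exposure sequence used only to create the
    label; 'dynamic/' holds the under/over-exposed input pair
    that contains motion.
    """
    static_seq = [cv2.imread(str(p))
                  for p in sorted((scene_dir / "static").glob("*.png"))]
    under, over = [cv2.imread(str(p))
                   for p in sorted((scene_dir / "dynamic").glob("*.png"))[:2]]

    # Classical Mertens exposure fusion stands in for the static MEF
    # step; the authors instead picked the best result among several
    # static MEF methods via a subjective study.
    fused = cv2.createMergeMertens().process(static_seq)
    reference = np.clip(fused * 255.0, 0, 255).astype(np.uint8)

    return (under, over), reference
```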

The construction strategy of the proposed dataset.

Example scenes with different exposure levels in the dataset.

Following this reference-generation approach, the researchers applied static MEF methods to produce high-quality candidate images, then conducted a subjective experiment to select the best fusion result as the label. To verify the quality of the reference images more objectively and quantitatively, they also computed average quality scores over the entire dataset.
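
The article summarized here does not name the quantitative measures used. As one hedged example, information entropy is a common no-reference score in MEF evaluation, and averaging it over all reference images would look like this:

```python
import numpy as np

def entropy(gray: np.ndarray) -> float:
    """Shannon entropy of an 8-bit grayscale image, in bits."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256), density=True)
    p = hist[hist > 0]
    return float(-(p * np.log2(p)).sum())

def mean_reference_quality(references) -> float:
    # Average a per-image score over all reference images, mirroring
    # the dataset-level quantitative check described in the article.
    return float(np.mean([entropy(img) for img in references]))
```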

Compared with static MEF, the main issue in dynamic MEF is ghosting artifacts caused by motion between the input images. The motion in captured photos can generally be divided into global background motion arising from camera shake and local foreground motion of objects moving in the scene. To address both, the researchers propose a framework for dynamic MEF.
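
The figure below refers to alignment on ‘pre-enhanced’ images: matching features across exposures is unreliable when one frame is very dark, so brightening it first helps. As an illustrative global-alignment step only (not the paper’s module), here is a gamma pre-enhancement followed by ORB feature matching and a RANSAC homography in OpenCV:

```python
import cv2
import numpy as np

def align_background(under: np.ndarray, over: np.ndarray) -> np.ndarray:
    """Warp the under-exposed frame onto the over-exposed view.

    Illustrative only: gamma pre-enhancement makes the dark frame's
    features matchable, then an ORB + RANSAC homography removes
    camera-shake motion. Local object motion is left to a learned
    deghosting stage; the paper's actual alignment differs.
    """
    # Pre-enhance: lift shadows so both exposures share texture.
    enhanced = np.clip(255.0 * (under / 255.0) ** 0.4, 0, 255).astype(np.uint8)

    gray_u = cv2.cvtColor(enhanced, cv2.COLOR_BGR2GRAY)
    gray_o = cv2.cvtColor(over, cv2.COLOR_BGR2GRAY)

    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(gray_u, None)
    k2, d2 = orb.detectAndCompute(gray_o, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]

    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

    # Warp the original (not the brightened) frame for fusion.
    h, w = over.shape[:2]
    return cv2.warpPerspective(under, H, (w, h))
```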

Alignment results based on original and pre-enhanced images.

Images captured by different cameras differ in many respects, such as sharpness and noise. According to the authors, however, MEF focuses on fusing different exposures and does not alter other characteristics of the source images. To verify the practicality and robustness of the proposed method, the researchers captured multi-exposure image sequences with different smartphones for evaluation. The proposed method reconstructs visually pleasing images with clear details and appropriate illumination, whereas other methods produce noticeable ghosting artifacts, noise, and distortions, demonstrating the superiority and practicality of the proposed approach.

Visual results of the proposed method and state-of-the-art MEF methods on dynamic scenes captured by different smartphones.

An example with large-scale background motion.

Extensive experiments, outlined in the article, demonstrate the advantages of the constructed dataset over existing MEF datasets and the superiority of the proposed framework over state-of-the-art MEF methods on dynamic scenes. As the authors note, a limitation of this work is that DDMEF cannot automatically select the under-exposed or over-exposed image as the reference according to the specific scene; in the future, the researchers plan to design a more flexible architecture to address this.
