Significant strides have been made in computer vision over the past few years, driven by recent developments in deep learning, especially deep convolutional neural networks (CNNs). Thanks to advances in GPU computing, innovative model architectures and large-scale datasets, CNNs have become the workhorse behind state-of-the-art performance on most computer vision tasks. For instance, the most advanced deep CNNs can match and even surpass human-level performance on image classification tasks. Deep CNNs have demonstrated the ability to learn very powerful image features, or representations, in a supervised manner. However, despite this impressive performance, the learned deep features remain much harder to interpret and understand than traditional hand-crafted ones. It is not clear what the deep features have captured, or how to apply them to other tasks such as traditional image processing problems.
In this thesis, we focus on exploring deep features extracted from pretrained deep convolutional neural networks, based on which we develop new techniques to tackle different traditional image processing problems.
First we consider the task of quickly filtering out irrelevant information in an image. In particular, we develop a method for exploiting object specific channels (OSCs) of pretrained deep CNNs, in which neurons are activated by the presence of specific objects in the input image. Building on the basic OSC features, and using face detection as a specific example, we introduce a multi-scale approach to constructing robust face heatmaps for rapidly filtering out non-face regions, thus significantly improving search efficiency for potential face candidates. Finally we develop a simple and compact face detector that achieves state-of-the-art performance in unconstrained settings.
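The multi-scale heatmap idea can be illustrated with a minimal numpy mock-up. Here a single fixed 3x3 filter merely stands in for a pretrained object specific channel, and the names `osc_response`, `resize_nn` and `multiscale_heatmap` are hypothetical, not the thesis implementation:

```python
import numpy as np

def osc_response(img, kernel):
    # single-channel "OSC" response: valid 3x3 convolution + ReLU
    # (a stand-in for reading one channel of a pretrained CNN layer)
    H, W = img.shape
    out = np.empty((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = max(0.0, np.sum(img[i:i+3, j:j+3] * kernel))
    return out

def resize_nn(a, shape):
    # nearest-neighbour resize so maps from different scales share one grid
    H, W = shape
    rows = np.arange(H) * a.shape[0] // H
    cols = np.arange(W) * a.shape[1] // W
    return a[np.ix_(rows, cols)]

def multiscale_heatmap(img, kernel, scales=(1, 2)):
    # run the channel at several scales, resize the maps back, and average
    H, W = img.shape
    maps = [resize_nn(osc_response(img[::s, ::s], kernel), (H - 2, W - 2))
            for s in scales]
    return np.mean(maps, axis=0)
```

In this sketch, heatmap regions below some threshold could be discarded before running a full (and more expensive) face detector on the remaining candidates.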
Second we turn to the task of producing visually pleasing images. We investigate two generative models, the variational autoencoder (VAE) and the generative adversarial network (GAN), and propose objective functions for training generative models that incorporate pretrained deep CNNs. As a result, high quality face images can be generated with realistic facial parts such as a clear nose and mouth as well as the fine texture of hair. Moreover, the learned latent vectors capture conceptual and semantic information of facial images, which can be used to achieve state-of-the-art performance in facial attribute prediction.
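As a sketch of such an objective, a VAE loss could replace the usual pixel-wise reconstruction term with a distance computed on CNN feature maps (passed in here as pre-extracted numpy arrays). The function names and the weights `alpha`/`beta` are illustrative assumptions, not the exact formulation of the thesis:

```python
import numpy as np

def kl_divergence(mu, logvar):
    # KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior
    return float(-0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar)))

def vae_feature_loss(feats_x, feats_xhat, mu, logvar, alpha=1.0, beta=1.0):
    # perceptual reconstruction term measured on CNN feature maps,
    # plus the standard KL regulariser on the latent posterior
    rec = float(np.sum((feats_x - feats_xhat) ** 2))
    return alpha * rec + beta * kl_divergence(mu, logvar)
```

Matching reconstructions in feature space rather than pixel space is what allows fine detail such as hair texture to survive: blurry outputs that are close in pixel distance are far apart in feature distance.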
Third we consider image information augmentation and reduction tasks. We propose a deep feature consistent principle that measures the similarity between two images in feature space. Based on this principle, we investigate several traditional image processing problems involving both image information augmentation (companding and inverse halftoning) and reduction (downscaling, decolorization and HDR tone mapping). The experiments demonstrate the effectiveness of deep learning based solutions to these traditional low-level image processing problems. These approaches enjoy many advantages of neural network models, such as ease of use and deployment, and end-to-end training as a single learning problem without hand-crafted features.
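The principle itself, comparing images by the distance between their deep feature maps rather than their pixels, can be illustrated with a toy feature extractor. The fixed random kernels below merely stand in for pretrained CNN filters; `feature_distance` is a hypothetical name:

```python
import numpy as np

def conv_features(img, kernels):
    # valid 3x3 convolution with K kernels + ReLU: a toy stand-in for
    # the feature maps of one layer of a pretrained CNN
    H, W = img.shape
    K = kernels.shape[0]
    out = np.empty((K, H - 2, W - 2))
    for k in range(K):
        for i in range(H - 2):
            for j in range(W - 2):
                out[k, i, j] = np.sum(img[i:i+3, j:j+3] * kernels[k])
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
KERNELS = rng.standard_normal((4, 3, 3))  # stand-in for pretrained filters

def feature_distance(x, y):
    # deep feature consistent distance: squared error in feature space
    fx = conv_features(x, KERNELS)
    fy = conv_features(y, KERNELS)
    return float(np.sum((fx - fy) ** 2))
```

A network for, say, decolorization could then be trained to minimise this distance between its output and a target, rather than a plain pixel-wise error.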
Last we investigate objective methods for measuring perceptual image quality and propose a new deep feature based image quality assessment (DFB-IQA) index that measures the inconsistency between the distorted image and the reference image in feature space. The proposed DFB-IQA index performs very well and behaves consistently with subjective mean opinion scores when applied to images corrupted by a variety of distortion types.
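One way such an index might be shaped, purely as a sketch with `dfb_iqa` a hypothetical name and pre-extracted feature maps as inputs, is to normalise the per-layer inconsistency and map the total to a bounded score:

```python
import numpy as np

def dfb_iqa(ref_feats, dist_feats, eps=1e-8):
    # ref_feats / dist_feats: lists of feature maps, one per CNN layer.
    # Accumulate per-layer inconsistency, normalised by the energy of the
    # reference features, then map to a score in (0, 1]: identical images
    # score 1, and larger feature-space inconsistency scores lower.
    total = 0.0
    for fr, fd in zip(ref_feats, dist_feats):
        total += np.mean((fr - fd) ** 2) / (np.mean(fr ** 2) + eps)
    return 1.0 / (1.0 + total)
```

The normalisation per layer keeps shallow, high-resolution layers from dominating deeper, lower-resolution ones in the aggregate.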
Our work contributes to a growing literature demonstrating the power of deep learning in solving traditional signal processing problems, and advances the state of the art on a range of tasks.
Date of Award: 1 Jul 2018
University of Nottingham
Guoping Qiu (Supervisor)
Keywords: deep learning; image processing