生成对抗式网络及其医学影像应用研究综述

Translated title of the contribution: A review of generative adversarial networks and the application in medical image

Yinglin Zhang, Yan Hu, Risa Higashita, Jiang Liu

Research output: Journal Publication, Review article, peer-review

1 Citation (Scopus)

Abstract

A generative adversarial network (GAN) consists of a generator that learns the data distribution and a discriminator that judges whether a sample is real or synthesized. The two improve gradually by competing against each other. This adversarial setup lets a deep learning method learn its loss function automatically and reduces its dependence on expert knowledge. GANs have been widely used in natural image processing and are a promising solution for related problems in medical image processing. This paper aims to bridge the gap between GANs and specific problems in the medical field and to point out directions for future improvement. First, the basic principle of the GAN is introduced. Second, we review the latest research on medical images in data augmentation, modality migration, image segmentation, and denoising, and analyze the advantages, disadvantages, and scope of application of each method. Next, current quality assessment methods are summarized. Finally, the research progress, open issues, and future improvement directions of GANs for medical images are summarized. Theoretical work on GANs focuses on three aspects: task splitting, the introduction of conditional constraints, and image-to-image translation. These have effectively improved the quality of synthesized images, increased their resolution, and allowed more control over the image synthesis process. However, several challenges remain: 1) generating high-quality, high-resolution, and diverse images from large-scale, complex data sets; 2) manipulating the attributes of synthesized images at different levels and granularities; 3) coping with the lack of paired training data while guaranteeing the quality and diversity of image translation. Applications of GANs to data augmentation, modality migration, image segmentation, and denoising of medical images have been widely studied.
1) Network models based on the Pix2pix framework can synthesize additional high-quality, high-resolution samples and, through data augmentation, effectively improve segmentation and classification performance. However, problems remain, such as insufficient diversity of the synthesized samples, difficulty in preserving basic biological structures, and limited capability for 3D image synthesis. 2) Network models based on the CycleGAN framework do not require paired training images. They have been studied extensively for modality migration but may lose basic structural information. Current research on structure preservation in modality migration is limited to fusing auxiliary information such as edges and segmentation. 3) Both the generator and the discriminator can be combined with existing segmentation models to improve segmentation performance. The generator can synthesize additional data, and the discriminator can guide model training at a high semantic level and make full use of unlabeled data. However, current research mainly focuses on single-modality image segmentation. 4) In image denoising, GANs can reconstruct normal-dose images from low-dose images, reducing the radiation exposure of patients. The critical issues for GANs in medical image processing are as follows: 1) Most medical image data, such as MRI (magnetic resonance imaging) and CT (computed tomography), are three-dimensional, so improving the synthesis quality and resolution of three-dimensional data is a critical issue. 2) It is difficult to ensure the diversity of synthesized data while keeping its basic geometric structure plausible. 3) It remains unclear how to make full use of unlabeled and unpaired data to generate high-quality, high-resolution, and diverse images. 4) The cross-modality generalization of algorithms needs to improve so that data of different modalities can be migrated effectively.
Future research should focus on the following: 1) optimizing network architectures, objective functions, and training methods for 3D data synthesis, to improve training stability and the quality, resolution, and diversity of synthesized 3D images; 2) further integrating prior geometric knowledge into GANs; 3) taking full advantage of the weakly supervised nature of GANs; 4) extracting invariant features through attribute decoupling to achieve good generalization, and achieving attribute control at different levels and granularities, as needed, during modality migration. In conclusion, GAN theory has been improved continuously since GANs were first proposed. This review surveys the considerable progress in medical image applications such as data augmentation, modality migration, image segmentation, and denoising. Some challenging issues remain to be resolved, including three-dimensional data synthesis, preservation of plausible geometric structure, the use of unlabeled and unpaired data, and multi-modality data applications.
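
The abstract describes the generator and discriminator as competing against each other but does not state an objective. As a minimal sketch, assuming the standard binary cross-entropy formulation from the original GAN literature (the review itself may present it differently), the two losses can be computed from discriminator outputs as shown here. The function names `discriminator_loss` and `generator_loss` and the `eps` parameter are hypothetical, and the generator loss uses the common non-saturating variant rather than the raw minimax term.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Discriminator loss: -(mean log D(x) + mean log(1 - D(G(z)))).

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on synthesized samples, each in (0, 1).
    eps is an illustrative guard against log(0), not part of the objective.
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: -mean log D(G(z)).

    This is an assumed variant; the raw minimax form would instead
    minimize mean log(1 - D(G(z))).
    """
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    # A discriminator that cannot tell real from synthesized outputs 0.5 everywhere.
    print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))
    # The generator loss falls as the discriminator is fooled more often.
    print(generator_loss([0.1]), generator_loss([0.9]))
```

In this formulation the discriminator loss is lowest when real samples score near 1 and synthesized samples near 0, while the generator loss is lowest when synthesized samples score near 1, which is the competition the abstract refers to.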

Original language: Chinese (Traditional)
Pages (from-to): 687-703
Number of pages: 17
Journal: Journal of Image and Graphics
Volume: 27
Issue number: 3
Publication status: Published - 16 Mar 2022
Externally published: Yes

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence
