生成对抗式网络及其医学影像应用研究综述

Yinglin Zhang; Yan Hu; Risa Higashita; Jiang Liu

doi:10.11834/jig.210247

Translated title of the contribution: A review of generative adversarial networks and the application in medical image

Research output: Journal Publication › Review article › peer-review

5 Citations (Scopus)

Abstract

A generative adversarial network (GAN) consists of a generator, which learns the data distribution, and a discriminator, which judges whether a sample is real or synthesized. The two improve together through adversarial training. This setup lets a deep learning method learn its loss function automatically and reduces its dependence on expert knowledge. GANs have been widely used in natural image processing and are a promising solution for related problems in medical image processing. This paper aims to bridge the gap between GANs and specific problems in the medical field and to point out directions for future improvement. First, the basic principle of GANs is introduced. Second, we review recent medical image research on data augmentation, modality migration, image segmentation, and denoising, and analyze the advantages, disadvantages, and scope of application of each method. Next, current quality assessment methods are summarized. Finally, the research progress, open issues, and future improvement directions of GANs for medical images are summarized.

Theoretical work on GANs has focused on three aspects: task splitting, the introduction of conditional constraints, and image-to-image translation. These have improved the quality of synthesized images, increased their resolution, and allowed more control over the synthesis process. However, challenges remain: 1) generating high-quality, high-resolution, and diverse images from large-scale, complex data sets; 2) manipulating the attributes of synthesized images at different levels and granularities; 3) coping with the lack of paired training data while guaranteeing the quality and diversity of image translation.

Applications of GANs to data augmentation, modality migration, image segmentation, and denoising of medical images have been widely studied. 1) Network models based on the Pix2pix framework can synthesize additional high-quality, high-resolution samples, and this data augmentation effectively improves segmentation and classification performance. However, problems remain, such as insufficient diversity of synthetic samples, difficulty in preserving basic biological structures, and limited 3D image synthesis capability. 2) Network models based on the CycleGAN framework do not require paired training images. They have been extensively studied for modality migration but may lose basic structural information. Current research on structure preservation in modality migration is limited to fusing auxiliary information such as edges and segmentation. 3) Both the generator and the discriminator can be combined with existing segmentation models to improve their performance. The generator can synthesize additional data, and the discriminator can guide model training at a high semantic level and make full use of unlabeled data. However, current research mainly focuses on single-modality image segmentation. 4) In image denoising, GANs can reconstruct normal-dose images from low-dose images, reducing the radiation exposure of patients.

The critical issues for GANs in medical image processing are as follows: 1) Most medical image data, such as MRI (magnetic resonance imaging) and CT (computed tomography), are three-dimensional, so improving the synthesis quality and resolution of three-dimensional data is critical. 2) It is difficult to ensure the diversity of synthesized data while keeping its basic geometric structure plausible. 3) It remains unclear how to make full use of unlabeled and unpaired data to generate high-quality, high-resolution, and diverse images. 4) The cross-modality generalization of algorithms needs to improve so that data can be migrated effectively between modalities.

Future research should focus on the following: 1) optimizing network architectures, objective functions, and training methods for 3D data synthesis, so as to improve training stability and the quality, resolution, and diversity of synthesized 3D images; 2) further integrating prior geometric knowledge into GANs; 3) taking full advantage of the weakly supervised nature of GANs; 4) extracting invariant features through attribute decoupling to achieve good generalization, and controlling attributes at different levels and granularities, according to need, during modality migration. In conclusion, GAN theory has been continuously improved since GANs were first proposed, and their medical image applications in data augmentation, modality migration, image segmentation, and denoising have developed considerably. Challenging issues remain to be resolved, including three-dimensional data synthesis, preservation of plausible geometric structure, use of unlabeled and unpaired data, and multi-modality data application.
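
As an illustration of the adversarial principle and the unpaired (CycleGAN-style) setting summarized in the abstract, the following minimal sketch computes the usual loss terms on plain Python lists. It is not taken from the paper: the function names are hypothetical, the generator term is the common non-saturating variant, and the cycle term is a mean absolute error between an image and its round-trip reconstruction.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Discriminator loss: real samples should score 1, synthesized samples 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants its samples scored as 1."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)


def cycle_consistency_loss(x, x_reconstructed):
    """Mean absolute error between an image and its round-trip reconstruction."""
    return sum(abs(a - b) for a, b in zip(x, x_reconstructed)) / len(x)


if __name__ == "__main__":
    # Hypothetical discriminator outputs (probabilities that a sample is real).
    d_real = [0.9, 0.8]
    d_fake = [0.2, 0.1]
    print("discriminator loss:", discriminator_loss(d_real, d_fake))
    print("generator loss:", generator_loss(d_fake))
    # Hypothetical flattened pixel values before and after an A -> B -> A cycle.
    print("cycle loss:", cycle_consistency_loss([0.0, 1.0], [0.5, 1.0]))
```

When the discriminator cannot tell real from synthesized samples (all outputs 0.5), its loss in this sketch equals 2 ln 2, the equilibrium value of the standard objective up to sign.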

Original language	Chinese (Simplified)
Pages (from-to)	687-703
Number of pages	17
Journal	Journal of Image and Graphics
Volume	27
Issue number	3
DOIs	https://doi.org/10.11834/jig.210247
Publication status	Published - 16 Mar 2022
Externally published	Yes

ASJC Scopus subject areas

Human-Computer Interaction
Computer Vision and Pattern Recognition
Computer Graphics and Computer-Aided Design
Artificial Intelligence

Cite this

@article{98186a3870764ee2b02eb1e92f842cd2,

title = "生成对抗式网络及其医学影像应用研究综述",

keywords = "Data augmentation, Deep learning, Generative adversarial network (GAN), Image denoising, Image segmentation, Medical image, Modality migration",

author = "Yinglin Zhang and Yan Hu and Risa Higashita and Jiang Liu",

year = "2022",

month = mar,

day = "16",

doi = "10.11834/jig.210247",

language = "Chinese (Simplified)",

volume = "27",

pages = "687--703",

journal = "Journal of Image and Graphics",

issn = "1006-8961",

publisher = "Editorial and Publishing Board of JIG",

number = "3",

}

TY - JOUR

T1 - 生成对抗式网络及其医学影像应用研究综述

AU - Zhang, Yinglin

AU - Hu, Yan

AU - Higashita, Risa

AU - Liu, Jiang

PY - 2022/3/16

Y1 - 2022/3/16

KW - Data augmentation

KW - Deep learning

KW - Generative adversarial network (GAN)

KW - Image denoising

KW - Image segmentation

KW - Medical image

KW - Modality migration

UR - http://www.scopus.com/inward/record.url?scp=85127512230&partnerID=8YFLogxK

U2 - 10.11834/jig.210247

DO - 10.11834/jig.210247

M3 - Review article

AN - SCOPUS:85127512230

SN - 1006-8961

VL - 27

SP - 687

EP - 703

JO - Journal of Image and Graphics

JF - Journal of Image and Graphics

IS - 3

ER -
