DeepText: Detecting text from the wild with multi-ASPP-assembled deeplab

Qingqing Wang; Wenjing Jia; Xiangjian He; Yue Lu; Michael Blumenstein; Ye Huang; Shujing Lyu

doi:10.1109/ICDAR.2019.00042

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we address scene text detection via direct regression and successfully adapt an effective semantic segmentation model, DeepLab v3+ [1], to this task. To handle text of arbitrary orientation and size and to improve the recall of small text, we propose extracting multi-scale features by inserting multiple Atrous Spatial Pyramid Pooling (ASPP) layers into DeepLab after feature maps of different resolutions. We then set multiple auxiliary IoU losses at the decoding stage and add auxiliary connections from the intermediate encoding layers to the decoder, which assists network training and enhances the discriminative ability of the lower encoding layers. Experiments on the benchmark scene text dataset ICDAR2015 demonstrate the superior performance of our proposed network, named DeepText, over state-of-the-art approaches.
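
The abstract does not give the exact form of the auxiliary IoU losses, so the following is only a minimal sketch of how a main loss and several auxiliary IoU losses might be combined during training. It assumes a mask-style "soft IoU" over flattened per-pixel scores, which is a common formulation but not necessarily the one used in the paper. The function names, the `aux_weight` value of 0.4, and the assumption that auxiliary outputs have already been resized to the target resolution are all hypothetical.

```python
from typing import Sequence


def soft_iou_loss(pred: Sequence[float], target: Sequence[float],
                  eps: float = 1e-6) -> float:
    """Return 1 - soft IoU between predicted scores in [0, 1] and a binary mask.

    Both inputs are flattened per-pixel sequences of equal length.
    This mask-based formulation is an assumption, not the paper's definition.
    """
    # Soft intersection: sum of element-wise products.
    inter = sum(p * t for p, t in zip(pred, target))
    # Soft union: |A| + |B| - |A ∩ B|, computed element-wise.
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - (inter + eps) / (union + eps)


def total_loss(main_pred: Sequence[float],
               aux_preds: Sequence[Sequence[float]],
               target: Sequence[float],
               aux_weight: float = 0.4) -> float:
    """Main IoU loss plus a weighted sum of auxiliary IoU losses.

    aux_preds holds one prediction per auxiliary decoder output
    (hypothetically already resized to the target resolution).
    aux_weight is an illustrative value, not taken from the paper.
    """
    loss = soft_iou_loss(main_pred, target)
    for aux in aux_preds:
        loss += aux_weight * soft_iou_loss(aux, target)
    return loss


if __name__ == "__main__":
    target = [1.0, 1.0, 0.0, 0.0]
    main = [1.0, 1.0, 0.0, 0.0]          # perfect main prediction
    aux = [[1.0, 0.0, 0.0, 0.0]] * 2     # two half-correct auxiliary outputs
    print(total_loss(main, aux, target))
```

Down-weighting the auxiliary terms keeps the final decoder output as the dominant training signal while still passing gradients to the intermediate layers, which is the usual motivation for auxiliary supervision.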

Original language	English
Title of host publication	Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Publisher	IEEE Computer Society
Pages	208-213
Number of pages	6
ISBN (Electronic)	9781728128610
DOIs	https://doi.org/10.1109/ICDAR.2019.00042
Publication status	Published - Sept 2019
Externally published	Yes
Event	15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 - Sydney, Australia. Duration: 20 Sept 2019 → 25 Sept 2019

Publication series

Name	Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
ISSN (Print)	1520-5363

Conference

Conference	15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Country/Territory	Australia
City	Sydney
Period	20/09/19 → 25/09/19

Keywords

Auxiliary IoU losses
Auxiliary connections
DeepLab
Multiple ASPP
Scene text detection

ASJC Scopus subject areas

Computer Vision and Pattern Recognition

Cite this

Wang, Q., Jia, W., He, X., Lu, Y., Blumenstein, M., Huang, Y., & Lyu, S. (2019). DeepText: Detecting text from the wild with multi-ASPP-assembled deeplab. In Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 (pp. 208-213). Article 8978048 (Proceedings of the International Conference on Document Analysis and Recognition, ICDAR). IEEE Computer Society. https://doi.org/10.1109/ICDAR.2019.00042

@inproceedings{f6dc0be851b0405199e32063d334d39d,

title = "DeepText: Detecting text from the wild with multi-ASPP-assembled deeplab",

abstract = "In this paper, we address the issue of scene text detection in the way of direct regression and successfully adapt an effective semantic segmentation model, DeepLab v3+ [1], for this application. In order to handle texts with arbitrary orientations and sizes and improve the recall of small texts, we propose to extract features of multiple scales by inserting multiple Atrous Spatial Pyramid Pooling (ASPP) layers to the DeepLab after the feature maps with different resolutions. Then, we set multiple auxiliary IoU losses at the decoding stage and make auxiliary connections from the intermediate encoding layers to the decoder to assist network training and enhance the discrimination ability of lower encoding layers. Experiments conducted on the benchmark scene text dataset ICDAR2015 demonstrate the superior performance of our proposed network, named as DeepText, over the state-of-the-art approaches.",

keywords = "Auxiliary IoU losses, Auxiliary connections, DeepLab, Multiple ASPP, Scene text detection",

author = "Qingqing Wang and Wenjing Jia and Xiangjian He and Yue Lu and Michael Blumenstein and Ye Huang and Shujing Lyu",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 ; Conference date: 20-09-2019 Through 25-09-2019",

year = "2019",

month = sep,

doi = "10.1109/ICDAR.2019.00042",

language = "English",

series = "Proceedings of the International Conference on Document Analysis and Recognition, ICDAR",

publisher = "IEEE Computer Society",

pages = "208--213",

booktitle = "Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019",

address = "United States",

}

Wang, Q, Jia, W, He, X, Lu, Y, Blumenstein, M, Huang, Y & Lyu, S 2019, DeepText: Detecting text from the wild with multi-ASPP-assembled deeplab. in Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019., 8978048, Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, IEEE Computer Society, pp. 208-213, 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019, Sydney, Australia, 20/09/19. https://doi.org/10.1109/ICDAR.2019.00042

DeepText: Detecting text from the wild with multi-ASPP-assembled deeplab. / Wang, Qingqing; Jia, Wenjing; He, Xiangjian et al.
Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019. IEEE Computer Society, 2019. p. 208-213 8978048 (Proceedings of the International Conference on Document Analysis and Recognition, ICDAR).

TY - GEN

T1 - DeepText

T2 - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019

AU - Wang, Qingqing

AU - Jia, Wenjing

AU - He, Xiangjian

AU - Lu, Yue

AU - Blumenstein, Michael

AU - Huang, Ye

AU - Lyu, Shujing

PY - 2019/9

Y1 - 2019/9

N2 - In this paper, we address the issue of scene text detection in the way of direct regression and successfully adapt an effective semantic segmentation model, DeepLab v3+ [1], for this application. In order to handle texts with arbitrary orientations and sizes and improve the recall of small texts, we propose to extract features of multiple scales by inserting multiple Atrous Spatial Pyramid Pooling (ASPP) layers to the DeepLab after the feature maps with different resolutions. Then, we set multiple auxiliary IoU losses at the decoding stage and make auxiliary connections from the intermediate encoding layers to the decoder to assist network training and enhance the discrimination ability of lower encoding layers. Experiments conducted on the benchmark scene text dataset ICDAR2015 demonstrate the superior performance of our proposed network, named as DeepText, over the state-of-the-art approaches.

KW - Auxiliary IoU losses

KW - Auxiliary connections

KW - DeepLab

KW - Multiple ASPP

KW - Scene text detection

UR - http://www.scopus.com/inward/record.url?scp=85079890524&partnerID=8YFLogxK

U2 - 10.1109/ICDAR.2019.00042

DO - 10.1109/ICDAR.2019.00042

M3 - Conference contribution

AN - SCOPUS:85079890524

T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR

SP - 208

EP - 213

BT - Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019

PB - IEEE Computer Society

Y2 - 20 September 2019 through 25 September 2019

ER -

Wang Q, Jia W, He X, Lu Y, Blumenstein M, Huang Y et al. DeepText: Detecting text from the wild with multi-ASPP-assembled deeplab. In Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019. IEEE Computer Society. 2019. p. 208-213. 8978048. (Proceedings of the International Conference on Document Analysis and Recognition, ICDAR). doi: 10.1109/ICDAR.2019.00042
