Arbitrary-Shape Scene Text Detection via Visual-Relational Rectification and Contour Approximation

Chengpei Xu, Wenjing Jia, Tingcheng Cui, Ruomei Wang, Yuan Fang Zhang, Xiangjian He

Research output: Journal Publication > Article > peer-review

1 Citation (Scopus)

Abstract

One trend in the latest bottom-up approaches for arbitrary-shape scene text detection is to determine the links between text segments using Graph Convolutional Networks (GCNs). However, the performance of these bottom-up methods is still inferior to that of state-of-the-art top-down methods, even with the help of GCNs. We argue that one cause is that bottom-up methods fail to make proper use of visual-relational features, which leads to accumulated false detections, and that they rely on an error-prone route-finding process to group text segments. In this paper, we improve classic bottom-up text detection frameworks by fusing the visual-relational features of text with two effective false positive/negative suppression (FPNS) mechanisms and by developing a new shape-approximation strategy. First, dense overlapping text segments depicting the 'characterness' and 'streamline' properties of text are constructed and used in weakly supervised node classification to filter out falsely detected text segments. Then, relational features and visual features of text segments are fused with a novel Location-Aware Transfer (LAT) module and Fuse Decoding (FD) module to jointly rectify the detected text segments. Finally, a novel multiple-text-map-aware contour-approximation strategy, based on the rectified text segments, replaces the error-prone route-finding process to generate the final contour of the detected text. Experiments conducted on five benchmark datasets demonstrate that our method outperforms state-of-the-art approaches when embedded in a classic text detection framework, revitalizing the strengths of bottom-up methods.
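To make the node-classification step in the abstract more concrete, the following is a minimal sketch of how a generic two-layer GCN could score text segments and filter out likely false detections. It is not the authors' architecture: the function names, the feature dimensions, the toy chain graph, the random untrained weights, and the 0.5 threshold are all assumptions made purely for illustration. The adjacency normalization follows the standard GCN formulation, D^{-1/2} (A + I) D^{-1/2}.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_tilde = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_tilde.sum(axis=1)                 # node degrees, self-loop included
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, features, weight):
    """One graph convolution followed by ReLU."""
    return np.maximum(a_hat @ features @ weight, 0.0)


def filter_segments(adj, features, w_hidden, w_out, threshold=0.5):
    """Score each text segment (graph node) and flag the ones to keep.

    Hypothetical helper: a two-layer GCN with a sigmoid output per node.
    """
    a_hat = normalize_adjacency(adj)
    hidden = gcn_layer(a_hat, features, w_hidden)
    logits = (a_hat @ hidden @ w_out).ravel()
    scores = 1.0 / (1.0 + np.exp(-logits))    # per-node "is text" probability
    keep = scores >= threshold
    return scores, keep


# Toy example (assumed data): four segments linked in a chain 0-1-2-3.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))            # 8-dim features per segment (assumed)
w_hidden = rng.normal(size=(8, 16)) * 0.1     # untrained weights, illustration only
w_out = rng.normal(size=(16, 1)) * 0.1

scores, keep = filter_segments(adj, features, w_hidden, w_out)
print(scores.shape, keep)
```

In a trained system the weights would be learned from labels (weak labels, according to the abstract), and the node features would combine visual and geometric descriptors of each segment; here they are random so that the sketch stays self-contained.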

Original language: English
Pages (from-to): 4052-4066
Number of pages: 15
Journal: IEEE Transactions on Multimedia
Volume: 25
Publication status: Published - 2022

Keywords

  • Arbitrary-shape scene text detection
  • bottom-up method
  • false positive/negative suppression
  • relational reasoning

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Media Technology
  • Computer Science Applications
