Int J Performability Eng, 2026, Vol. 22, Issue (4): 200-208. doi: 10.23940/ijpe.26.04.p3.200208

Attention-Guided Adaptive Feature Pyramid Network with Fuzzy Edge Refinement for Robust Scene Text Segmentation

Rajeswari Reddy Patil (a, b, *) and Aradhana D (c)

  a. Visvesvaraya Technological University, Karnataka, India
  b. Department of Computer Science and Engineering, Rao Bahadur Y Mahabaleswarappa Engineering College, Karnataka, India
  c. Department of Computer Science and Engineering, Ballari Institute of Technology and Management, Ballari, India
  • Contact: * E-mail address: rajeswarirp@rymec.in

Abstract: Detecting and understanding visual semantics in natural scenes is a central research area in pattern recognition and text analysis. Here, visual semantics refers to the pattern and layout of text in scene images. Text detection is challenging because of variations in size, color, and font style, as well as complex backgrounds and low brightness. Although deep learning frameworks significantly outperform conventional techniques, text in arbitrary orientations against intricate backgrounds remains difficult to detect. This paper introduces a hybrid, deep-learning-based text segmentation method that addresses these issues. The method extracts features at multiple resolutions using a VGG19 encoder combined with an Attention-Guided Adaptive Feature Pyramid Network (AG-AFPN), which captures richer multiscale representations. A Type-2 fuzzy logic approach then refines the edge map, improving the accuracy of text boundary segmentation. For post-processing, Differential Binarization (DB) is applied to generate a precise binary mask from the network's output, improving segmentation of arbitrarily oriented text against cluttered backgrounds. The proposed method is evaluated on multiple benchmark datasets, including ICDAR 2013, ICDAR 2015, and Total-Text.

Key words: scene text detection, text segmentation, VGG19, attention-guided adaptive feature pyramid network, type-2 fuzzy logic, differential binarization
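
The abstract does not spell out how the DB post-processing step turns the network output into a binary mask. As an illustration only, the sketch below assumes the formulation commonly described in the scene text literature as differentiable binarization, where a probability map P and a per-pixel threshold map T are combined through a steep sigmoid, B = 1 / (1 + exp(-k (P - T))). The function names, the amplifying factor k = 50, the 0.5 cutoff, and the toy maps are assumptions made for this example, not details taken from the paper.

```python
import math


def soft_binarize(prob, thresh, k=50.0):
    """Approximate step function: 1 / (1 + exp(-k * (prob - thresh))).

    k is an assumed amplifying factor; larger values approach a hard step.
    """
    z = k * (prob - thresh)
    # Numerically stable sigmoid: never exponentiate a large positive value.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binarize_map(prob_map, thresh_map, k=50.0, cutoff=0.5):
    """Build a 0/1 mask from a probability map and a per-pixel threshold map."""
    return [
        [1 if soft_binarize(p, t, k) >= cutoff else 0 for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Toy 2x2 maps (hypothetical values, standing in for network outputs).
prob_map = [[0.9, 0.2], [0.6, 0.4]]
thresh_map = [[0.3, 0.3], [0.7, 0.3]]
mask = binarize_map(prob_map, thresh_map)
```

With a cutoff of 0.5 the mask is the same as testing P >= T directly. The smooth sigmoid matters mainly during training, where it keeps the binarization step differentiable so the threshold map can be learned together with the probability map.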