Deformity Removal from Handwritten Text Documents using Variable CycleGAN
Date
2024-05-07

Abstract
Text recognition systems typically work well for printed documents but struggle with handwritten documents due to diverse writing styles, complex backgrounds, noise introduced during image acquisition, and deformed text such as strike-offs and underlines. These deformities alter the structural information, making it difficult to restore deformed images while maintaining the structure and preserving the semantic dependencies of local pixels. Current adversarial networks are unable to preserve these structural and semantic dependencies because they focus on individual pixel-to-pixel variation and encourage non-meaningful aspects of the images. To address this, we propose a Variable Cycle Generative Adversarial Network (VCGAN) that accounts for the perceptual quality of the images. Using a variable content loss, the Top-k Variable Loss (TVk), VCGAN preserves the interdependence of spatially close pixels while removing strike-off strokes. Image similarity is computed with TVk so as to tolerate intensity variations that do not interfere with the semantic structure of the image. Our results show that VCGAN removes most deformities, achieving an F1 score of 97.40%, and outperforms current state-of-the-art algorithms with a character error rate of 7.64% and a word accuracy of 81.53% when tested on a handwritten text recognition system.
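The abstract does not give the exact formulation of the Top-k Variable Loss, so the following is only a minimal PyTorch sketch of one plausible top-k content loss, assuming it averages the k largest per-pixel differences so that small, structure-preserving intensity variations contribute nothing. The function name `top_k_variable_loss` and the `k_ratio` parameter are hypothetical, not from the paper.

```python
import torch

def top_k_variable_loss(pred: torch.Tensor, target: torch.Tensor,
                        k_ratio: float = 0.5) -> torch.Tensor:
    """Hypothetical sketch of a top-k content loss (TVk).

    Averages only the k largest per-pixel absolute differences per image,
    so minor intensity variations (the smallest differences) are ignored
    rather than pulling the generator away from the semantic structure.
    `k_ratio` is an assumed knob for the fraction of pixels that count.
    """
    diff = (pred - target).abs().flatten(start_dim=1)   # (B, C*H*W)
    k = max(1, int(k_ratio * diff.shape[1]))
    top_k = torch.topk(diff, k, dim=1).values           # k worst errors per image
    return top_k.mean()

# Toy usage: restored strike-off-free output vs. clean ground truth
restored = torch.rand(4, 1, 64, 256)   # batch of grayscale text-line crops
clean = torch.rand(4, 1, 64, 256)
loss = top_k_variable_loss(restored, clean, k_ratio=0.5)
```

In a CycleGAN-style setup such a term would be added to the usual adversarial and cycle-consistency losses; the weighting between them is not specified in this abstract.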