Experimental application of a Japanese historical document image synthesis method to text line segmentation

Naoto Inuzuka, Tetsuya Suzuki

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

We plan to use a machine-learning-based text line segmentation method in our transcription support system for handwritten Japanese historical documents written in Kana. Because manually annotating a large set of document images as training data is time-consuming, we are looking for a method to synthesize annotated document images. In this paper, we report a synthesis method for annotated document images designed for a Japanese historical document. To compare manually annotated Japanese historical document images with annotated document images synthesized by the method as training data for the object detection algorithm YOLOv3, we conducted text line segmentation experiments using that algorithm. The experimental results show that a model trained on the synthetic annotated document images is competitive with one trained on the manually annotated document images in terms of the intersection-over-union metric.
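For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of intersection-over-union for two axis-aligned bounding boxes. The box layout `(x_min, y_min, x_max, y_max)`, the function name `iou`, and the example coordinates are assumptions made for illustration; the abstract does not describe the authors' evaluation code or how boxes were matched.

```python
from typing import Tuple

# Assumed box layout (not taken from the paper): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes
    return inter / union if union > 0 else 0.0


# Hypothetical example: two tall, narrow boxes such as vertical text lines
predicted = (0.0, 0.0, 10.0, 100.0)
annotated = (5.0, 0.0, 15.0, 100.0)
print(iou(predicted, annotated))
```

A value of 1.0 means the predicted and annotated text line boxes coincide exactly, and 0.0 means they do not overlap at all.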

Original language: English
Title of host publication: ICPRAM 2021 - Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods
Editors: Maria De Marsico, Gabriella Sanniti di Baja, Ana Fred
Publisher: SciTePress
Pages: 628-634
Number of pages: 7
ISBN (Electronic): 9789897584862
Publication status: Published - 2021
Event: 10th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2021 - Virtual, Online
Duration: 2021 Feb 4 to 2021 Feb 6

Publication series

Name: ICPRAM 2021 - Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods

Conference

Conference: 10th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2021
City: Virtual, Online
Period: 21/2/4 to 21/2/6

Keywords

  • Data synthesis
  • Deep learning
  • Historical document
  • Text line segmentation

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
