チラシ画像からの商品情報自動抽出: ―内容情報認識―

Misaki Shibayama, Masanobu Takahashi

Research output: Article, peer-reviewed

Abstract

The purpose of this study is to automatically recognize goods information in leaflet images and to build a database for recording and referring to that information. Leaflet information is divided into content information (company name, goods name, and content amount) and the price information of the goods. We aimed to realize a function for recognizing content information, which had not yet been realized. Recognizing content information requires recognizing characters against complex backgrounds, so characters were recognized using the OCR function of the Google Cloud Vision API. To correct misrecognitions automatically and to recognize the content information, we implemented recognition of character color and background color, correction of coordinates using these colors, correction of misspellings using our own goods information database, and separation of the company name, goods name, and content amount. In the experiment, we used 154 pieces of content information, each consisting of a company name, a goods name, and a content amount. Although about half of the content information contained misrecognitions, 92.9% of the content information was recognized correctly. These results show that the method is effective for recognizing content information.
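The abstract does not say how the misspelling correction and field separation were implemented. As a rough illustration only, the sketch below shows one way such a step could look: a trailing content amount is split off with a regular expression, and the remaining text is matched against a goods database by string similarity using Python's standard `difflib`. The database entries, the function names `split_amount` and `correct_content_info`, the unit list, and the similarity cutoff are all hypothetical and are not taken from the paper.

```python
import difflib
import re

# Hypothetical goods information database: (company name, goods name) pairs.
GOODS_DB = [
    ("Acme Foods", "Morning Yogurt"),
    ("Sunrise Brewing", "Soy Sauce"),
    ("Harbor Mills", "Cup Noodles"),
]

# Assumed format of a content amount: a number followed by a unit at the
# end of the text, e.g. "400g", "1.5 L", "500ml".
AMOUNT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(ml|mL|kg|g|l|L)\s*$")


def split_amount(text):
    """Split a trailing content amount off the OCR text, if one is present."""
    m = AMOUNT_RE.search(text)
    if m is None:
        return text.strip(), None
    return text[:m.start()].strip(), m.group(1) + m.group(2)


def correct_content_info(ocr_text, cutoff=0.6):
    """Map noisy OCR text to the closest database entry, or None if no match."""
    body, amount = split_amount(ocr_text)
    # Key each database entry by its "company goods" string for matching.
    candidates = {f"{company} {goods}": (company, goods)
                  for company, goods in GOODS_DB}
    best = difflib.get_close_matches(body, list(candidates), n=1, cutoff=cutoff)
    if not best:
        return None
    company, goods = candidates[best[0]]
    return {"company": company, "goods": goods, "amount": amount}


result = correct_content_info("Acme Fo0ds Morninq Yogurt 400g")
print(result)
```

In this sketch the cutoff trades off recall against false corrections: a lower value repairs heavier OCR damage but raises the risk of mapping unrelated text to a database entry.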
Translated title of the contribution: Automatic recognition of goods information in leaflets: -Content information recognition-
Original language: Japanese
Pages (from-to): 32
Number of pages: 40
Journal: Journal of the Japan Personal Computer Application Technology Society
Volume: 15
Issue number: 1
Publication status: Published - 27 Mar 2021
