Basic Article Information

Title: A Two Stage Method for Bengali Text Extraction from Still Images Containing Text
Authors: Ankita Sikdar; Payal Roy; Somdeep Mukherjee et al.
Journal: Computer Science & Information Technology
Electronic ISSN: 2231-5403
Year of publication: 2012
Volume: 2
Issue: 3
Pages: 47-55
DOI: 10.5121/csit.2012.2306
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Bengali text in multimedia images that combine several content forms, such as pictures and text, carries information that is useful in many applications once it has been extracted. The images vary: objects and text may be completely separated, overlapping, or embedded in one another, and the Bengali text may appear in different shapes and sizes. Extracting text from such images is challenging because the textual portion has to be correctly separated from the background. The input image passes through two stages. The first stage locates the different components in the image using entropy filtering; the second stage distinguishes the components that represent text from the non-textual components, based on several features of Bengali text. The text obtained in this way can then be passed to software such as a Bengali OCR system for character recognition.
Keywords: Bengali character feature identification; Connected components; Entropy filtering; Text extraction
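
Illustrative sketch: the two-stage pipeline described in the abstract could be outlined roughly as below. This is not the authors' implementation. It assumes a small grayscale image given as a list of lists, a square entropy window, a fixed entropy threshold, and 4-connected component labeling; all function names and parameters are hypothetical. The abstract does not list the Bengali-specific features used in the second stage, so that stage is left as a caller-supplied predicate.

```python
import math
from collections import Counter, deque


def local_entropy(img, radius=1):
    """Shannon entropy (in bits) of the gray levels in a square window
    around each pixel; windows are clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            n = len(window)
            out[y][x] = -sum((c / n) * math.log2(c / n)
                             for c in Counter(window).values())
    return out


def connected_components(mask):
    """Label 4-connected regions of a boolean mask and return their
    bounding boxes as (top, left, bottom, right), in scan order."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def two_stage_extract(img, is_text, radius=1, threshold=0.3):
    """Stage 1: threshold the local entropy map and collect components.
    Stage 2: keep only the components accepted by `is_text`, a
    caller-supplied stand-in for the paper's Bengali text features."""
    entropy = local_entropy(img, radius)
    mask = [[e > threshold for e in row] for row in entropy]
    return [box for box in connected_components(mask) if is_text(box)]
```

For example, `two_stage_extract(img, is_text=lambda box: True)` returns every high-entropy component, while a stricter predicate (say, one based on the box's aspect ratio) would discard components that do not look like text. The threshold and window size are placeholders and would need tuning on real images.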