Article information

  • Title: ANALYSIS OF CANONICAL CHARACTER SEGMENTATION TECHNIQUE FOR ANCIENT TELUGU TEXT DOCUMENTS
  • Authors: N. VENKATA RAO ; Dr. A.S.C.S.SASTRY ; A.S.N.CHAKRAVARTHY
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2015
  • Volume: 82
  • Issue: 2
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Character recognition in ancient document images remains a challenging task. The initial scanning process deforms the document image, and the aging of the document introduces unwanted background noise. Segmentation is an essential step in OCR. Complex scripts, such as the derivatives of Brahmi, pose various problems for the segmentation process. A hybrid model that entails segmentation of noisy images followed by binarization is proposed. In the first phase, a technique for segmenting the ancient Telugu document image into meaningful units is proposed, in which the horizontal profile pattern is convolved with a Gaussian kernel. The statistical properties of the meaningful units are explored through an extensive analysis of their geometrical patterns. In the second phase, noisy documents are cleaned with the Modified IGT algorithm and then segmented using the conventional profile mechanism. The performance of the hybrid technique is demonstrated by the higher efficiencies obtained for the cleaned documents. The efficiency analysis of segmentation for the hybrid technique reveals a threshold number of vowels (V), consonants (C), and CV core characters required to exhibit higher efficiencies. It also reflects on the non-canonical features of any other marks in the Telugu document.
  • Keywords: Segmentation; Profile; Gaussian derivative kernel; Modified IGT; Error Rate
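
The abstract's first phase, convolving the horizontal profile with a Gaussian kernel, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy image, the value of sigma, and the fixed threshold rule for picking text-line bands are all assumptions made for illustration, since the abstract does not state how the smoothed profile is turned into segment boundaries.

```python
import math

def horizontal_profile(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]

def gaussian_kernel(sigma, radius=None):
    """Return a normalized 1-D Gaussian kernel (radius defaults to about 3 sigma)."""
    if radius is None:
        radius = max(1, int(3 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(profile, kernel):
    """Convolve the profile with the kernel; rows outside the image count as blank."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(profile)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(profile):
                acc += w * profile[j]
        out.append(acc)
    return out

def line_bands(smoothed, threshold):
    """Return (start, end) row ranges, end exclusive, where the profile exceeds threshold.

    Assumption: a fixed threshold is used here only to keep the sketch short.
    """
    bands, start = [], None
    for i, value in enumerate(smoothed):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            bands.append((start, i))
            start = None
    if start is not None:
        bands.append((start, len(smoothed)))
    return bands

# Hypothetical 12 x 10 binary image: two text lines plus one noise pixel in row 5.
blank, ink = [0] * 10, [1] * 10
noise = [0] * 9 + [1]
image = [blank, ink, ink, ink, blank, noise, blank, ink, ink, ink, blank, blank]

profile = horizontal_profile(image)
bands = line_bands(smooth(profile, gaussian_kernel(1.0)), threshold=5.0)
print(bands)  # → [(1, 4), (7, 10)]
```

Smoothing keeps the isolated noise pixel in row 5 from registering as a separate band, which is the usual motivation for filtering a projection profile before locating line boundaries.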