Journal name: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2018
Volume: 96
Issue: 13
Publisher: Journal of Theoretical and Applied
Abstract: Optical Character Recognition (OCR) is the process of separating the text from a scanned document. Building a digitally empowered society requires that information be available in digital form, and OCR software assists in digitizing documents in different languages. Many researchers are working on document digitization, particularly on developing effective, error-free character recognition models. The precision of Malayalam handwritten character recognition is still limited to around 90% because of the challenges posed by the Malayalam character set. The coexistence of two scripts (old and new), the large character set, and the prevalence of similarly shaped characters make Malayalam handwritten character recognition more difficult. Feature extraction varies from language to language, depending on the characteristics of each language. By observing the shape patterns in each language, different novel methods have been developed to extract features and to recognize characters. In this research, a novel hybrid approach is proposed that uses a combination of statistical and structural features (SSF). Statistical features are derived from the statistical distribution of pixels. Structural features are based on the topological and geometrical properties of the character. This study shows that combining statistical and structural features gives higher accuracy in Malayalam character recognition.
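
The abstract does not specify which statistical and structural features the authors use. As a minimal sketch of the general idea only, the Python below combines one statistical feature family (zoning densities, i.e. the fraction of ink pixels in each grid cell) with two structural features (the number of enclosed holes and the bounding-box aspect ratio). All function names, the grid size, and the choice of features are assumptions made for illustration and are not taken from the paper.

```python
from collections import deque

def zoning_densities(img, zones=2):
    """Statistical: fraction of ink pixels in each cell of a zones x zones grid."""
    h, w = len(img), len(img[0])
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            r0, r1 = zr * h // zones, (zr + 1) * h // zones
            c0, c1 = zc * w // zones, (zc + 1) * w // zones
            area = (r1 - r0) * (c1 - c0)
            ink = sum(img[r][c] for r in range(r0, r1) for c in range(c0, c1))
            feats.append(ink / area if area else 0.0)
    return feats

def count_holes(img):
    """Structural (topological): background regions fully enclosed by ink."""
    h, w = len(img), len(img[0])
    # Pad with background so the outer background forms one connected region.
    padded = [[0] * (w + 2)] + [[0] + list(row) + [0] for row in img] + [[0] * (w + 2)]
    seen = [[False] * (w + 2) for _ in range(h + 2)]
    regions = 0
    for r in range(h + 2):
        for c in range(w + 2):
            if padded[r][c] == 0 and not seen[r][c]:
                regions += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:  # 4-connected flood fill over background pixels
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h + 2 and 0 <= nx < w + 2
                                and padded[ny][nx] == 0 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return regions - 1  # discount the outer background region

def bounding_box_aspect(img):
    """Structural (geometric): width / height of the ink bounding box."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:
        return 0.0  # blank image: no ink, no bounding box
    return (cols[-1] - cols[0] + 1) / (rows[-1] - rows[0] + 1)

def ssf_vector(img, zones=2):
    """Hypothetical combined vector: statistical features, then structural ones."""
    return zoning_densities(img, zones) + [float(count_holes(img)), bounding_box_aspect(img)]

# Toy binary glyph (1 = ink): a closed ring with a single hole.
RING = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
print(ssf_vector(RING))
```

In a recognizer along these lines, a vector like this would be computed for each segmented character image and passed to a classifier. Two similarly shaped characters can have nearly identical zoning densities yet differ in hole count, which illustrates why the two feature families may complement each other.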