Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Year: 2011
Volume: 2
Issue: 1
Pages: 494-503
Publisher: TechScience Publications
Abstract: This paper describes syntactic sentence fusion techniques for Bengali, a language of the Indo-Aryan family. First, a clause identification and classification system marks clause boundaries and classifies each clause as a principal clause or a subordinate clause. A rule-based sentence classification system then categorizes sentences as simple, complex, or compound. Finally, the syntactic sentence fusion system uses the sentence class and the clause types to fuse two textually entailed sentences, drawing on verb paradigm information and noun morphological information. The system outputs are compared with a gold-standard data set using manual evaluation and BLEU. The evaluation results yield good accuracy scores. The syntactic sentence fusion technique developed in this work may also be applicable to other Indian languages.
Keywords: Clause Identification and Classification; Sentence Type; Syntactic Sentence Fusion; Evaluation