Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Publication year: 2010
Volume: 10
Issue: 8
Pages: 41-46
Publisher: International Journal of Computer Science and Network Security
Abstract: Vietnamese shares a characteristic with some other Asian languages such as Chinese, Japanese, and Korean: words are not delimited by spaces. In this article, we present a method that applies fuzzy set theory and a topic model to extract sentences from Vietnamese texts that have been categorized by topic. The method identifies important features, such as sentence length, the weight of terms in a sentence, and sentence position, and then extracts the important sentences according to a given ratio, which determines the share of sentences in the original text that are extracted. We also built a system based on this method, and experiments obtained good results that satisfy the given requirements.
Keywords: Vietnamese text; sentence extraction; topic model; fuzzy set theory
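
The abstract names three sentence features (length, term weight, position) and a ratio that controls how much of the text is extracted. The following Python sketch illustrates how such a pipeline might look; it is not the authors' implementation. Everything specific in it is an assumption made for illustration: the triangular membership function and its breakpoints, the equal-weight average used to combine the features, the `topic_weights` dictionary standing in for the output of a topic model, and the expectation that sentences arrive already segmented into word tokens.

```python
import math


def triangular(x, low, peak, high):
    """Triangular fuzzy membership: 0 outside (low, high), 1 at peak."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


def score_sentences(sentences, topic_weights):
    """Return one score in [0, 1] per sentence (each sentence is a token list)."""
    n = len(sentences)
    scores = []
    for i, tokens in enumerate(sentences):
        # Length feature: assumed ideal length of 8 tokens, fading out by 30.
        length_m = triangular(len(tokens), 0, 8, 30)
        # Term-weight feature: mean topic weight of the tokens, capped at 1.
        total = sum(topic_weights.get(t, 0.0) for t in tokens)
        term_m = min(1.0, total / max(len(tokens), 1))
        # Position feature: earlier sentences get a higher membership value.
        position_m = 1.0 - i / n
        # Assumed combination rule: plain average of the three memberships.
        scores.append((length_m + term_m + position_m) / 3.0)
    return scores


def extract(sentences, topic_weights, ratio):
    """Keep the top `ratio` share of sentences, returned in original order."""
    if not sentences:
        return []
    k = max(1, math.ceil(ratio * len(sentences)))
    scores = score_sentences(sentences, topic_weights)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Returning the selected sentences in their original order keeps the extract readable as running text. A real system would replace the fixed breakpoints and the plain average with membership functions and rules tuned to the corpus, and would need a Vietnamese word segmenter in front of it.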