期刊名称:International Journal of Computer Science and Network Security
印刷版ISSN:1738-7906
出版年度:2006
卷号:6
期号:3A
页码:189-196
出版社:International Journal of Computer Science and Network Security
摘要:This paper provides a novel and efficient method for extracting exact textual answers from the returned documents that are retrieved by traditional IR system in large-scale collection of texts. The main intended contribution of this paper is to propose System Similarity Model (SSM), which can be considered as an extension of vector space model (VSM) to rank passages, and to apply binary logistic regression model (LRM), which seldom be used in IE to extract special information from candidate data sets. The parameters estimated for the data gathered with serious problem of data sparse, therefore we take stratified sampling method, and improve traditional logistic regression model parameters estimated methods. The series of experimental results show that the overall performance of our system is good and our approach is effective. Our system, Insun05QA1, participated in QA track of TREC 2005 and obtained excellent results.
关键词:Answer extraction, System Similarity Model, Stratified sampling, Logistic regression model