Journal: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2012
Volume: 2012
Publisher: ACL Anthology
Abstract: Sentence similarity is the process of computing a similarity score between two sentences. Previous work on sentence similarity finds that latent semantic approaches to the problem do not perform well because single sentences contain insufficient information. In this paper, we show that by carefully handling words that are not in the sentences (missing words), we can train a reliable latent variable model on sentences. In the process, we propose a new evaluation framework for sentence similarity: Concept Definition Retrieval. The new framework allows for large-scale tuning and testing of sentence similarity models. Experiments on the new task and on previous data sets show significant improvement of our model over baselines and other traditional latent variable models. Our results indicate performance comparable to, and in some cases better than, current state-of-the-art systems addressing the problem of sentence similarity.
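
The abstract does not spell out the model, so the following is only a minimal sketch of one way "carefully handling missing words" could look in practice. It assumes a weighted matrix factorization of a word-by-sentence matrix, in which cells for missing words (zeros) get a small weight instead of being ignored or treated as full observations. The function names (weighted_mf, cosine), the parameters (dim, missing_weight, reg, iters) and the toy matrix are all hypothetical and are not taken from the paper.

```python
import numpy as np

def weighted_mf(X, dim=2, missing_weight=0.01, reg=0.1, iters=20, seed=0):
    """Approximate X (words x sentences) by P.T @ Q with per-cell weights.

    Assumption: observed words get weight 1.0 and missing words (zero cells)
    get the small weight `missing_weight`.
    """
    rng = np.random.default_rng(seed)
    n_words, n_sents = X.shape
    W = np.where(X != 0, 1.0, missing_weight)
    P = rng.normal(scale=0.1, size=(dim, n_words))   # latent word vectors
    Q = rng.normal(scale=0.1, size=(dim, n_sents))   # latent sentence vectors
    ridge = reg * np.eye(dim)
    for _ in range(iters):
        # Alternating weighted least squares: fix Q and solve for each word.
        for i in range(n_words):
            QW = Q * W[i]                            # scale columns by weights
            P[:, i] = np.linalg.solve(QW @ Q.T + ridge, QW @ X[i])
        # Then fix P and solve for each sentence.
        for j in range(n_sents):
            PW = P * W[:, j]
            Q[:, j] = np.linalg.solve(PW @ P.T + ridge, PW @ X[:, j])
    return P, Q

def cosine(u, v):
    """Cosine similarity between two latent sentence vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

# Toy word-by-sentence count matrix (5 words, 3 sentences), purely illustrative.
X = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
P, Q = weighted_mf(X)
score = cosine(Q[:, 0], Q[:, 1])   # similarity score for sentences 0 and 1
```

In this sketch the similarity score between two sentences is the cosine of their latent vectors. The small weight on missing words is the design choice that matters: it lets the absent vocabulary contribute a weak signal without overwhelming the few observed words in a short sentence.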