Abstract: A precise and commonly accepted definition of paraphrasing
does not exist. This is one of the reasons computational linguistics has not achieved
real success in handling this phenomenon in its systems and
applications. To help overcome this difficulty, this article provides
new insights into paraphrase characterization. We first review what has
been said about paraphrasing in linguistics and the new light shed on the
phenomenon by computational linguistics. In light of the shortcomings
observed, the paraphrase phenomenon is studied from two different perspectives. On
the one hand, insights into paraphrase boundaries are set out by analyzing paraphrase
borderline cases and the interaction of paraphrasing with related linguistic
phenomena. On the other hand, a new paraphrase typology is presented. It goes
beyond a simple list of types and is embedded in a linguistically based
hierarchical structure. This typology has been empirically validated through
corpus annotation and its application in the plagiarism-detection domain.