Article Information

  • Title: Reflections on the Penn Discourse TreeBank, Comparable Corpora, and Complementary Annotation
  • Authors: Rashmi Prasad ; Bonnie Webber ; Aravind Joshi
  • Journal: Computational Linguistics
  • Print ISSN: 0891-2017
  • Electronic ISSN: 1530-9312
  • Publication year: 2014
  • Volume: 40
  • Issue: 4
  • Pages: 921-950
  • DOI: 10.1162/COLI_a_00204
  • Language: English
  • Publisher: MIT Press
  • Abstract: The Penn Discourse Treebank (PDTB) was released to the public in 2008. It remains the largest manually annotated corpus of discourse relations to date. Its focus on discourse relations that are either lexically-grounded in explicit discourse connectives or associated with sentential adjacency has not only facilitated its use in language technology and psycholinguistics but also has spawned the annotation of comparable corpora in other languages and genres. Given this situation, this paper has four aims: (1) to provide a comprehensive introduction to the PDTB for those who are unfamiliar with it; (2) to correct some wrong (or perhaps inadvertent) assumptions about the PDTB and its annotation that may have weakened previous results or the performance of decision procedures induced from the data; (3) to explain variations seen in the annotation of comparable resources in other languages and genres, which should allow developers of future comparable resources to recognize whether the variations are relevant to them; and (4) to enumerate and explain relationships between PDTB annotation and complementary annotation of other linguistic phenomena. The paper draws on work done by ourselves and others since the corpus was released.