Article information

  • Title: Predicting Liaison: an Example-Based Approach
  • Authors: Antal van den Bosch; Alexander Greefhorst
  • Journal: Traitement Automatique des Langues
  • Print ISSN: 1248-9433
  • Electronic ISSN: 1965-0906
  • Publication year: 2016
  • Volume: 57
  • Issue: 1
  • Pages: 1-20
  • Language: French
  • Publisher: ATALA - Assoc Traitement Automatique Langues
  • Abstract: Predicting liaison in French is a non-trivial problem to model. We compare a memory-based machine-learning algorithm with a rule-based baseline. The memory-based learner is trained to predict whether liaison occurs between two words on the basis of lexical, orthographic, morphosyntactic, and sociolinguistic features. Best performance is obtained using only a selection of lexical and syntactic features (a window of the five last letters of a word and the five first letters of the following word, whether the liaison is obligatory or optional, part-of-speech tags, the number of syllables in a word, and the Levenshtein distance to the 20 nearest phonological neighbors). Counter to our expectations, including sociolinguistic features lowered both the precision and the recall of our predictions. Selecting only lexical and syntactic features yields the best overall performance, with a precision of .80 and a recall of .85. The F-scores (the harmonic mean of precision and recall) of the memory-based algorithm are higher than those of a baseline based on the rules of Grevisse and Goosse (2011), of IGTree (a decision-tree learner), and of the Naive Bayes classifier. Ripper, a more sophisticated rule induction algorithm, produced results similar to those of our memory-based algorithm, but in optional liaison contexts Ripper misses more instances in which real speakers would produce a liaison. It appears that predicting liaison benefits from being able to generalize from specific examples in context.
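
As a rough illustration of two ideas mentioned in the abstract, the Python sketch below builds a letter window around a word boundary and computes an F-score as the harmonic mean of precision and recall. It is not the authors' implementation: the function names `letter_window` and `f_score`, the `_` padding character, and the example word pair are assumptions made here for illustration, and the F-score printed at the end is simply derived from the precision (.80) and recall (.85) quoted in the abstract.

```python
def letter_window(left_word, right_word, size=5, pad="_"):
    """Return the last `size` letters of left_word followed by the
    first `size` letters of right_word, one feature per letter.

    Short words are padded with `pad` (an assumption of this sketch)
    so that every instance has exactly 2 * size features.
    """
    left = left_word[-size:].rjust(size, pad)
    right = right_word[:size].ljust(size, pad)
    return list(left) + list(right)


def f_score(precision, recall):
    """Harmonic mean of precision and recall (balanced F-score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical word pair at a potential liaison site.
print(letter_window("les", "amis"))

# F-score implied by the precision and recall reported in the abstract.
print(round(f_score(0.80, 0.85), 3))
```

Under these assumptions, the precision and recall reported above correspond to an F-score of roughly .82.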