
Article Information

  • Title: BERTić - The Transformer Language Model for Bosnian, Croatian, Montenegrin and Serbian
  • Authors: Nikola Ljubešić; Davor Lauc
  • Venue: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
  • Year: 2021
  • Volume: 2021
  • Pages: 37-42
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: In this paper we describe a transformer model pre-trained on 8 billion tokens of crawled text from the Croatian, Bosnian, Serbian and Montenegrin web domains. We evaluate the transformer model on the tasks of part-of-speech tagging, named entity recognition, geo-location prediction and commonsense causal reasoning, showing improvements on all tasks over state-of-the-art models. For commonsense reasoning evaluation we introduce COPA-HR - a translation of the Choice of Plausible Alternatives (COPA) dataset into Croatian. The BERTić model is made available for free usage and further task-specific fine-tuning through HuggingFace.