文章基本信息

标题：CUSATNLP@DravidianLangTech-EACL2021:Language Agnostic Classification of Offensive Content in Tweets
本地全文：下载
作者：Sara Renjit ; Sumam Mary Idicula
期刊名称：Conference on European Chapter of the Association for Computational Linguistics (EACL)
出版年度：2021
卷号：2021
页码：236-242
语种：English
出版社：ACL Anthology
摘要：Identifying offensive information from tweets is a vital language processing task. This task concentrated more on English and other foreign languages these days. In this shared task on Offensive Language Identification in Dravidian Languages, in the First Workshop of Speech and Language Technologies for Dravidian Languages in EACL 2021, the aim is to identify offensive content from code mixed Dravidian Languages Kannada, Malayalam, and Tamil. Our team used language agnostic BERT (Bidirectional Encoder Representation from Transformers) for sentence embedding and a Softmax classifier. The language-agnostic representation based classification helped obtain good performance for all the three languages, out of which results for the Malayalam language are good enough to obtain a third position among the participating teams.