Basic article information

Title: Delexicalized Word Embeddings for Cross-lingual Dependency Parsing
Authors: Mathieu Dehouck ; Pascal Denis
Journal name: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2017
Volume: 2017
Pages: 241-250
Language: English
Publisher: ACL Anthology
Abstract: This paper presents a new approach to the problem of cross-lingual dependency parsing, aiming at leveraging training data from different source languages to learn a parser in a target language. Specifically, this approach first constructs word vector representations that exploit structural (i.e., dependency-based) contexts but only considering the morpho-syntactic information associated with each word and its contexts. These delexicalized word embeddings, which can be trained on any set of languages and capture features shared across languages, are then used in combination with standard language-specific features to train a lexicalized parser in the target language. We evaluate our approach through experiments on a set of eight different languages that are part of the Universal Dependencies Project. Our main results show that using such delexicalized embeddings, either trained in a monolingual or multilingual fashion, achieves significant improvements over monolingual baselines.