Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: In this paper, we propose a method for aligning comparable bilingual tweets that not only takes into account the specific characteristics of a tweet but also handles proper names, dates, and numbers in two different languages. This makes it possible to retrieve more relevant target tweets. Matching proper names between Arabic and English is difficult because the two languages use different scripts. To address this, we use an approach that projects the sounds of an English proper name into Arabic and aligns it with the most appropriate proper name. We evaluated the method with a classical measure and compared it to the one we developed. The experiments were conducted on two parallel corpora and show that our measure outperforms the baseline by 5.6% in R@1 recall.
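
The abstract reports results as R@1 recall without defining it. A minimal sketch follows, assuming the usual retrieval definition: the fraction of source tweets whose true aligned target tweet appears among the top k ranked candidates (k = 1 for R@1). The function name `recall_at_k`, the integer tweet identifiers, and the toy rankings are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def recall_at_k(
    ranked_candidates: Sequence[Sequence[int]],
    gold: Sequence[int],
    k: int = 1,
) -> float:
    """Fraction of source items whose gold target is among the top-k candidates.

    ranked_candidates[i] lists candidate target ids for source i, best first.
    gold[i] is the id of the true aligned target for source i.
    """
    if len(ranked_candidates) != len(gold):
        raise ValueError("ranked_candidates and gold must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not gold:
        return 0.0
    # Count a hit when the gold target appears in the first k ranked candidates.
    hits = sum(
        1 for ranked, target in zip(ranked_candidates, gold) if target in ranked[:k]
    )
    return hits / len(gold)


# Toy data (hypothetical): three source tweets, each with three ranked candidates.
ranked = [[2, 0, 1], [1, 2, 0], [0, 2, 1]]
gold = [2, 0, 0]

print(f"R@1 = {recall_at_k(ranked, gold, k=1):.3f}")
print(f"R@3 = {recall_at_k(ranked, gold, k=3):.3f}")
```

On a parallel corpus each source tweet has exactly one correct target, so under this definition R@1 equals the share of source tweets whose top-ranked candidate is the correct one.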