
Article information

  • Title: Expanding a Database of Portuguese Tweets
  • Authors: Gaspar Brogueira; Fernando Batista; João Paulo Carvalho
  • Journal: OASIcs: OpenAccess Series in Informatics
  • Electronic ISSN: 2190-6807
  • Publication year: 2014
  • Volume: 38
  • Pages: 275-282
  • DOI: 10.4230/OASIcs.SLATE.2014.275
  • Publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
  • Abstract: This paper describes an existing database of geolocated tweets that were produced in Portuguese regions and proposes an approach to further expand it. The existing database covers eight consecutive days of collected tweets, totaling about 300 thousand tweets produced by about 11 thousand different users. A detailed analysis of the content of the messages suggests a predominance of young authors who use Twitter as a way of reaching their colleagues with their feelings, ideas and comments. In order to further characterize this community of young people, we propose a method for retrieving additional tweets produced by the same set of authors already in the database. Our goal is to further extend the knowledge about each user of this community, making it possible to automatically characterize each user by the content they produce, to cluster users, and to open other possibilities in the scope of social analysis.
  • Keywords: Twitter; corpus of Portuguese tweets; Twitter API; natural language processing; text analysis