
Basic article information

  • Title: Development of a method for the recognition of author’s style in the Ukrainian language texts based on linguometry, stylemetry and glottochronology
  • Authors: Vasyl Lytvyn; Victoria Vysotska; Petro Pukach
  • Journal: Eastern-European Journal of Enterprise Technologies
  • Print ISSN: 1729-3774
  • Electronic ISSN: 1729-4061
  • Publication year: 2017
  • Volume: 4
  • Issue: 2
  • Pages: 10-19
  • DOI: 10.15587/1729-4061.2017.107512
  • Language: English
  • Publisher: PC Technology Center
  • Abstract: We developed algorithmic software for content monitoring that recognizes the style of the author of a Ukrainian text, using Web Mining and NLP technology. The method for recognizing an author's style, which is based on analysis of the stop words found in a text, was decomposed. Specific features of the method include adaptation of the morphological and syntactic analysis of lexical units to the structural peculiarities of Ukrainian words and texts. It is the syntactic words (stop words or anchor words) that are significant for an author's individual style, because they are not related to the theme or content of the publication. Recognition of the author's style is based on analysis of the coefficients of the author's lexical language: coherence of speech, lexical diversity, syntactic complexity, and indices of concentration and exclusivity for the author's fragment. These coefficients are then compared to determine the degree to which the analyzed text belongs to a particular author. We studied the internal "dynamics" of texts by randomly selected authors by analyzing these coefficients for the first k, n and m words (without the title) of the author's fragment and of the analyzed fragment, and compared the results. We also experimentally tested the proposed content-monitoring method for identifying and analyzing stop words in Ukrainian scientific texts in technical fields, based on Web Mining technology. For the selected experimental base of 100 works, the method that analyzes an article without the compulsory initial information and the list of references attains the best results by the density criterion. This is achieved by training the system and by checking specified blocked words and specified thematic vocabulary. Testing the proposed method for determining keywords in other categories of texts (scientific texts in the humanities, belles-lettres, journalism, etc.) requires further experimental research.
  • Keywords: style of the author; statistical linguistic analysis; quantitative linguistics; author's attribution
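
The abstract describes comparing stop-word usage and lexical coefficients over the first k, n and m words of an author's fragment and an analyzed fragment. The Python sketch below illustrates that general idea only. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the tiny stop-word list `STOP_WORDS`, the use of a type-token ratio as a stand-in for lexical diversity, the L1 distance between stop-word profiles, and all function names.

```python
import re
from collections import Counter

# Hypothetical, very small stop-word list for illustration only;
# the paper's actual stop-word dictionary is not given in the abstract.
STOP_WORDS = ("і", "та", "в", "на", "що", "не", "з", "до")


def tokenize(text):
    """Lowercase the text and return its word tokens (letters, inner apostrophes)."""
    return re.findall(r"[^\W\d_]+(?:['’ʼ][^\W\d_]+)*", text.lower())


def stop_word_profile(tokens, stop_words=STOP_WORDS):
    """Relative frequency of each stop word among all tokens, in list order."""
    counts = Counter(t for t in tokens if t in stop_words)
    total = len(tokens)
    return [counts[w] / total if total else 0.0 for w in stop_words]


def lexical_diversity(tokens):
    """Type-token ratio, used here as an assumed stand-in for lexical diversity."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def profile_distance(p, q):
    """L1 distance between two stop-word profiles (an assumed similarity measure)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def compare_prefixes(reference, candidate, sizes=(100, 500, 1000)):
    """Compare the first k words of a reference text and a candidate text."""
    ref_tokens, cand_tokens = tokenize(reference), tokenize(candidate)
    result = {}
    for k in sizes:
        ref_k, cand_k = ref_tokens[:k], cand_tokens[:k]
        result[k] = {
            "distance": profile_distance(
                stop_word_profile(ref_k), stop_word_profile(cand_k)
            ),
            "diversity_ref": lexical_diversity(ref_k),
            "diversity_cand": lexical_diversity(cand_k),
        }
    return result
```

A smaller `distance` for a given prefix size would suggest more similar stop-word usage under these assumptions. It is not the density criterion or the degree-of-belonging measure reported in the paper.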