Article information

  • Title: Prediction of high anti-angiogenic activity peptides in silico using a generalized linear model and feature selection
  • Authors: Jose Liñares Blanco ; Ana B. Porto-Pazos ; Alejandro Pazos
  • Journal: Scientific Reports
  • Electronic ISSN: 2045-2322
  • Publication year: 2018
  • Volume: 8
  • Issue: 1
  • Pages: 15688
  • DOI: 10.1038/s41598-018-33911-z
  • Language: English
  • Publisher: Springer Nature
  • Abstract: Screening and in silico modeling are critical activities for the reduction of experimental costs. They also speed up research notably and strengthen the theoretical framework, thus allowing researchers to numerically quantify the importance of a particular subset of information. For example, in fields such as cancer and other highly prevalent diseases, having a reliable prediction method is crucial. The objective of this paper is to classify peptide sequences according to their anti-angiogenic activity to understand the underlying principles via machine learning. First, the peptide sequences were converted into three types of numerical molecular descriptors based on the amino acid composition. We performed different experiments with the descriptors and merged them to obtain baseline results for the performance of the models, particularly of each molecular descriptor subset. A feature selection process was applied to reduce the dimensionality of the problem and remove noisy features, which are highly prevalent in biological problems. After a robust machine learning experimental design under equal conditions (nested resampling, cross-validation, hyperparameter tuning and different runs), we statistically and significantly outperformed the best previously published anti-angiogenic model with a generalized linear model via coordinate descent (glmnet), achieving a mean AUC value greater than 0.96 and an accuracy of 0.86 with 200 molecular descriptors, mixed from the three groups. A final analysis with the top-40 discriminative anti-angiogenic activity peptides is presented along with a discussion of the feature selection process and the individual importance of each molecular descriptor. According to our findings, anti-angiogenic activity peptides are strongly associated with amino acid sequences SP, LSL, PF, DIT, PC, GH, RQ, QD, TC, SC, AS, CLD, ST, MF, GRE, IQ, CQ and HG.
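
The first step in the abstract, turning a peptide sequence into composition-based numerical descriptors, can be illustrated with a minimal sketch. The abstract does not name the three descriptor types, so the sketch assumes they are relative frequencies of single residues, dipeptides and tripeptides, which is at least consistent with the reported motifs having length 2 or 3. The function names and the example peptide are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k):
    """Relative frequency of every possible k-mer over the standard alphabet.

    Windows containing non-standard residues still count towards the total
    but do not get a feature of their own.
    """
    seq = sequence.upper()
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    features = {}
    for letters in product(AMINO_ACIDS, repeat=k):
        kmer = "".join(letters)
        features[kmer] = counts[kmer] / total if total else 0.0
    return features


def describe(sequence):
    """Merge the assumed descriptor groups (k = 1, 2, 3) into one feature dict."""
    features = {}
    for k in (1, 2, 3):
        features.update(kmer_composition(sequence, k))
    return features


# Hypothetical peptide, chosen only to contain some of the reported motifs.
features = describe("SPLSLPF")
```

Under these assumptions each peptide yields 20 + 400 + 8000 = 8420 descriptors, most of them zero, which gives a sense of why the abstract applies feature selection before fitting the regularized linear model.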