Article information

  • Title: Pan-cancer classification by regularized multi-task learning
  • Authors: Sk Md Mosaddek Hossain; Lutfunnesa Khatun; Sumanta Ray
  • Journal: Scientific Reports
  • Electronic ISSN: 2045-2322
  • Publication year: 2021
  • Volume: 11
  • DOI: 10.1038/s41598-021-03554-8
  • Language: English
  • Publisher: Springer Nature
  • Abstract: Classifying pan-cancer samples from gene expression patterns is a crucial challenge for the accurate diagnosis and treatment of cancer patients. Machine learning algorithms are established tools for downstream analysis and for capturing deviations in gene expression patterns across diverse diseases. In this work, we developed PC-RMTL, a pan-cancer classification model that uses regularized multi-task learning (RMTL) to classify 21 cancer types and adjacent normal samples from RNA-Seq data obtained from TCGA. PC-RMTL outperforms five state-of-the-art classification algorithms: SVM with a linear kernel (SVM-Lin), SVM with a radial basis function kernel (SVM-RBF), random forest (RF), k-nearest neighbours (kNN), and decision trees (DT). PC-RMTL achieves 96.07% accuracy and a 95.80% MCC score on a completely unseen independent test set. The only real competitor is SVM-Lin, which nearly matches the prediction accuracy of PC-RMTL, but only when complete feature sets are provided for training; otherwise, PC-RMTL outperforms all other classification models. To the best of our knowledge, this is a significant improvement over existing work in pan-cancer classification, which has failed to reliably distinguish many cancer types from one another. We also compared the expression patterns of the top discriminating genes across the cancers and performed a functional enrichment analysis of these genes, which uncovers several interesting facts for distinguishing pan-cancer samples.
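
The abstract reports two evaluation metrics on the independent test set: accuracy and MCC (Matthews correlation coefficient). As a minimal sketch of how these can be computed for a multi-class problem, the Python below uses the standard multi-class generalization of MCC. It is not the authors' evaluation code; the function names and the toy labels are hypothetical and chosen only for illustration. MCC normally lies in the range -1 to 1, so the reported 95.80% presumably corresponds to about 0.958 on that scale.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient.

    Uses the count-based form:
        (c*s - sum_k t_k*p_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where s is the number of samples, c the number of correct predictions,
    t_k the number of true samples of class k, and p_k the number of
    samples predicted as class k.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)

    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())

    denom = sqrt(cov_tt * cov_pp)
    # By convention, return 0.0 when the score is undefined
    # (for example, when every prediction is the same class).
    return cov_tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    y_true = ["BRCA", "BRCA", "LUAD", "LUAD", "normal", "normal"]
    y_pred = ["BRCA", "BRCA", "LUAD", "BRCA", "normal", "normal"]
    print("accuracy:", accuracy(y_true, y_pred))
    print("MCC:", multiclass_mcc(y_true, y_pred))
```

Unlike accuracy, MCC accounts for how predictions are distributed across all classes, which makes it a useful companion metric when class sizes are unbalanced.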