Article Information

  • Title: Bayesian Hyper-LASSO Classification for Feature Selection with Application to Endometrial Cancer RNA-seq Data
  • Authors: Lai Jiang ; Celia M. T. Greenwood ; Weixin Yao
  • Journal: Scientific Reports
  • Electronic ISSN: 2045-2322
  • Publication year: 2020
  • Volume: 10
  • Issue: 1
  • Pages: 1-16
  • DOI: 10.1038/s41598-020-66466-z
  • Publisher: Springer Nature
  • Abstract: Feature selection is demanded in many modern scientific research problems that use high-dimensional data. A typical example is to identify gene signatures that are related to a certain disease from high-dimensional gene expression data. The expression of genes may have grouping structures, for example, a group of co-regulated genes that have similar biological functions tend to have similar expressions. Thus it is preferable to take the grouping structure into consideration to select features. In this paper, we propose a Bayesian Robit regression method with Hyper-LASSO priors (shortened by BayesHL) for feature selection in high dimensional genomic data with grouping structure. The main features of BayesHL include that it discards more aggressively unrelated features than LASSO, and it makes feature selection within groups automatically without a pre-specified grouping structure. We apply BayesHL in gene expression analysis to identify subsets of genes that contribute to the 5-year survival outcome of endometrial cancer (EC) patients. Results show that BayesHL outperforms alternative methods (including LASSO, group LASSO, supervised group LASSO, penalized logistic regression, random forest, neural network, XGBoost and knockoff) in terms of predictive power, sparsity and the ability to uncover grouping structure, and provides insight into the mechanisms of multiple genetic pathways leading to differentiated EC survival outcome.