Abstract: In the context of the expected-posterior prior (EPP) approach to Bayesian variable selection in linear models, we combine ideas from power-prior and unit-information-prior methodologies to simultaneously (a) produce a minimally informative prior and (b) diminish the effect of training samples. The result is that, in practice, our power-expected-posterior (PEP) methodology is sufficiently insensitive to the size n* of the training sample, due to PEP's unit-information construction, that one may take n* equal to the full-data sample size n and dispense with training samples altogether. This promotes stability of the resulting Bayes factors, removes the arbitrariness arising from individual training-sample selections, and greatly increases computational speed, allowing many more models to be compared within a fixed CPU budget. We find that, under an independence Jeffreys (reference) baseline prior, the asymptotics of PEP Bayes factors are equivalent to those of Schwarz's Bayesian Information Criterion (BIC), ensuring consistency of the PEP approach to model selection. Our PEP prior, due to its unit-information structure, leads to a variable-selection procedure that, in our empirical studies, (1) is systematically more parsimonious than the basic EPP with minimal training sample, while sacrificing no desirable performance characteristics to achieve this parsimony; (2) is robust to the size of the training sample, thus enjoying the advantages described above arising from the avoidance of training samples altogether; and (3) identifies maximum-a-posteriori models that achieve better out-of-sample predictive performance than that provided by standard EPPs, the g-prior, the hyper-g prior, non-local priors, the Least Absolute Shrinkage and Selection Operator (LASSO) and Smoothly-Clipped Absolute Deviation (SCAD) methods.
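For reference, the Schwarz approximation underlying the BIC equivalence asserted above is standard material (not specific to this paper): for a model M_k with d_k free parameters, maximum-likelihood estimate \hat{\theta}_k, and sample size n,

% Schwarz (BIC) approximation to the log marginal likelihood of model M_k;
% shown only to illustrate the asymptotic equivalence claimed for PEP Bayes factors.
\log m(y \mid M_k) \;=\; \log p(y \mid \hat{\theta}_k, M_k) \;-\; \frac{d_k}{2}\log n \;+\; O(1),
\qquad
\mathrm{BIC}_k \;=\; -2\log p(y \mid \hat{\theta}_k, M_k) + d_k \log n,

so that, to the same order, the log Bayes factor between models M_k and M_l satisfies \log \mathrm{BF}_{kl} \approx -\tfrac{1}{2}(\mathrm{BIC}_k - \mathrm{BIC}_l); consistency of BIC-based selection then carries over to PEP Bayes factors with this asymptotic behaviour.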