Abstract: Many regularization schemes for high-dimensional regression have been put forward. Most require the choice of a tuning parameter, using model selection criteria or cross-validation. We show that sign-constrained least squares estimation is a simple and effective regularization technique for a certain class of high-dimensional regression problems. The sign constraint has to be derived via prior knowledge or an initial estimator. The success depends on conditions that are easy to check in practice. A sufficient condition for our results is that most variables with the same sign constraint are positively correlated. For a sparse optimal predictor, a non-asymptotic bound on the $\ell_1$-error of the regression coefficients is then proven. Without using any further regularization, the regression vector can be estimated consistently as long as $s^{2}\log(p)/n\rightarrow 0$ for $n\rightarrow\infty$, where $s$ is the sparsity of the optimal regression vector, $p$ the number of variables and $n$ the sample size. The bounds are almost as tight as similar bounds for the Lasso under strongly correlated design, even though the method has no tuning parameter and requires no cross-validation. Network tomography is shown to be an application where the necessary conditions for the success of sign-constrained least squares are naturally fulfilled, and empirical results confirm the effectiveness of the sign constraint for sparse recovery when predictor variables are strongly correlated.
Keywords: Variable selection; shrinkage estimators; quadratic programming; high-dimensional linear models.
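As an illustration of the estimator described in the abstract, the following is a minimal sketch (not the authors' code) of sign-constrained least squares. It assumes SciPy is available and reduces the problem to non-negative least squares by flipping the sign of columns whose coefficients are constrained to be non-positive; the toy data below are purely hypothetical.

```python
# Minimal illustrative sketch of sign-constrained least squares (assumption:
# SciPy's nnls is used as the quadratic-programming solver; this is not the
# authors' implementation).
import numpy as np
from scipy.optimize import nnls

def sign_constrained_ls(X, y, signs):
    """Minimize ||y - X beta||_2^2 subject to sign constraints on beta.

    signs[j] = +1 enforces beta_j >= 0, signs[j] = -1 enforces beta_j <= 0.
    """
    signs = np.asarray(signs, dtype=float)
    X_flipped = X * signs            # flip columns with a non-positive constraint
    gamma, _ = nnls(X_flipped, y)    # plain non-negative least squares, no tuning parameter
    return signs * gamma             # map back to the original sign constraints

# Toy usage: sparse non-negative coefficients and a positively correlated design.
rng = np.random.default_rng(0)
n, p, s = 200, 500, 5
X = rng.normal(size=(n, p)) + rng.normal(size=(n, 1))  # shared factor -> positive correlation
beta = np.zeros(p)
beta[:s] = 1.0
y = X @ beta + 0.5 * rng.normal(size=n)
beta_hat = sign_constrained_ls(X, y, np.ones(p))
print(np.abs(beta_hat - beta).sum())  # l1-error of the estimate
```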