Abstract: Variable
selection with a large number of predictors is a very challenging and important
problem in educational and social domains. However, relatively little attention
has been paid to issues of variable selection in longitudinal data with
application to education. Using longitudinal educational data from the Test of
English for International Communication (TOEIC), this study compares multiple regression,
backward elimination, the group least absolute shrinkage and selection operator
(LASSO), and linear mixed models in terms of their performance in variable selection.
The results show that the four statistical methods retain
different sets of predictors in their models. The linear mixed model (LMM)
yields the smallest number of predictors (4 of the 19 candidate
predictors). In addition, LMM is the only one of the four methods appropriate for repeated
measurements and is the best method with respect to the principle of parsimony.
In the conclusion, this study also interprets the model selected by LMM
using the marginal R².