Abstract: The NAEP EDM Competition required participants to predict efficient test-taking behavior from log data. This paper describes our top-down approach to engineering features by means of psychometric modeling, with the resulting features serving as input to machine learning for the predictive classification task. For feature engineering, we employed, among others, the Log-Normal Response Time Model to estimate latent person speed and the Generalized Partial Credit Model to estimate latent person ability. Additionally, we adopted an n-gram feature approach for event sequences. Furthermore, instead of using the provided binary target label, we distinguished inefficient test takers who were going too fast from those who were going too slow when training a multi-label classifier. Our best-performing ensemble classifier comprised three sets of low-dimensional classifiers, dominated by test-taker speed. Although our classifier reached moderate performance relative to the competition leaderboard, our approach makes two important contributions. First, we show how classifiers that contain features engineered through literature-derived domain knowledge can provide meaningful predictions if results can be contextualized for test administrators who wish to intervene or take action. Second, our re-engineering of test scores enabled us to incorporate person ability into the models. However, ability was hardly predictive of efficient behavior, which leads us to conclude that the validity of the target label needs to be questioned. Beyond competition-related findings, we also report a state sequence analysis to demonstrate the viability of the employed tools. This analysis yielded four test-taking types that capture distinct differences between test takers, with relevant implications for assessment practice.