Abstract: Several techniques have been developed in recent years to generate optimal large-scale assessments (LSAs) of student achievement. These techniques often blend procedures from fields as diverse as experimental design, combinatorial optimization, particle physics, and neural networks. Despite these theoretical advances, well-documented test designs in which all factors that guided design decisions are explicitly and clearly communicated remain surprisingly scarce. This paper therefore has two goals. First, it briefly summarizes relevant key terms, along with experimental designs and automated test assembly routines used in LSA. Second, it describes in detail the conceptual and methodological steps in designing the assessment of the Austrian educational standards in mathematics. The test design was generated using a two-step procedure that starts at the item block level and continues at the item level. In the first step, a partially balanced incomplete item block design was generated using simulated annealing; in the second step, items were assigned to the item blocks using mixed-integer linear optimization in combination with a shadow-test approach.

Keywords: educational assessment, large-scale assessment, optimal design, automated test assembly, simulated annealing
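
To make the first step of the two-step procedure more concrete, the following is a minimal Python sketch of how simulated annealing can search for a roughly balanced incomplete block design. It is an illustration under stated assumptions, not the procedure used for the Austrian assessment. The function names (`cost`, `anneal_block_design`), the cost function (squared deviations of block frequencies and of pairwise block co-occurrences from their means), the geometric cooling schedule, and all parameter values are hypothetical. Block position effects, which a real booklet design would typically also balance, are ignored here.

```python
import itertools
import math
import random


def cost(design, n_blocks):
    """Imbalance of a design: squared deviations of block frequencies and
    of pairwise block co-occurrences from their respective means.
    (Assumed objective for illustration only.)"""
    freq = [0] * n_blocks
    pair = {p: 0 for p in itertools.combinations(range(n_blocks), 2)}
    for booklet in design:
        for blk in booklet:
            freq[blk] += 1
        for p in itertools.combinations(sorted(booklet), 2):
            pair[p] += 1

    def sum_sq_dev(values):
        values = list(values)
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    return sum_sq_dev(freq) + sum_sq_dev(pair.values())


def anneal_block_design(n_blocks, n_booklets, blocks_per_booklet,
                        steps=5000, t_start=1.0, t_end=0.001, seed=0):
    """Search for a balanced assignment of item blocks to booklets.
    Requires n_blocks > blocks_per_booklet so a swap is always possible."""
    rng = random.Random(seed)
    # Random start: each booklet holds distinct blocks.
    design = [rng.sample(range(n_blocks), blocks_per_booklet)
              for _ in range(n_booklets)]
    current = cost(design, n_blocks)
    best, best_cost = [b[:] for b in design], current

    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)

        # Proposal: replace one block in one booklet by a block not yet in it.
        i = rng.randrange(n_booklets)
        pos = rng.randrange(blocks_per_booklet)
        old_block = design[i][pos]
        new_block = rng.choice([b for b in range(n_blocks)
                                if b not in design[i]])
        design[i][pos] = new_block

        proposal = cost(design, n_blocks)
        delta = proposal - current
        # Metropolis rule: always accept improvements, sometimes accept
        # deteriorations, with a probability that shrinks as temp falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = proposal
            if current < best_cost:
                best, best_cost = [b[:] for b in design], current
        else:
            design[i][pos] = old_block  # reject: undo the swap

    return best, best_cost


if __name__ == "__main__":
    design, imbalance = anneal_block_design(n_blocks=7, n_booklets=7,
                                            blocks_per_booklet=3)
    for booklet in design:
        print(sorted(booklet))
    print("imbalance:", imbalance)
```

A cost of zero would correspond to a fully balanced design in this simplified sense; a partially balanced design tolerates a small, controlled amount of imbalance in the pairwise co-occurrences. The second step, assigning items to blocks by mixed-integer linear optimization, would require a dedicated solver and is not sketched here.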