摘要:Drug-induced liver injury (DILI) presents a significant challenge to drug development and regulatory science. The FDA's Liver Toxicity Knowledge Base (LTKB) evaluated >1000 drugs for their likelihood of causing DILI in humans, of which >700 drugs were classified into three categories (most-DILI, less-DILI, and no-DILI). Based on this dataset, we developed and compared 2-class and 3-class DILI prediction models using the machine learning algorithm of Decision Forest (DF) with Mold2 structural descriptors. The models were evaluated through 1000 iterations of 5-fold cross-validations, 1000 bootstrapping validations and 1000 permutation tests (that assessed the chance correlation). Furthermore, prediction confidence analysis was conducted, which provides an additional parameter for proper interpretation of prediction results. We revealed that the 3-class model not only had a higher resolution to estimate DILI risk but also showed an improved capability to differentiate most-DILI drugs from no-DILI drugs in comparison with the 2-class DILI model. We demonstrated the utility of the models for drug ingredients with warnings very recently issued by the FDA. Moreover, we identified informative molecular features important for assessing DILI risk. Our results suggested that the 3-class model presents a better option than the binary model (which most publications are focused on) for drug safety evaluation.