Abstract: Background: Hypothyroidism is one of the endocrine diseases found in humans. It is not immediately fatal, but it progresses chronically and can lead to other diseases. Several machine learning techniques were applied to the prediction of hypothyroid disease. Methods: The dataset, taken from the UCI repository, contains 3163 observations, of which 151 belong to the hypothyroid class. Two missing-data methods, four imbalanced-data methods, and two kinds of classification models were compared using several evaluation metrics to identify the best model. Results: After comparing these models, the variables query_on_thyroxine and lithium were dropped and TBG was brought back into the model. A new random forest (RF) imputation method was used. The model using all variables reached 100% accuracy, but collecting every variable costs time and money. A model using only TSH, FTI, TBG, TT4, and age still achieved an AUC of 0.9988, which makes it more useful in practice. Conclusion: Both the all-variables model and the five-variable model achieved higher accuracy than previous studies, and a nonparametric ensemble model is suggested for this problem. Limitations remain, including the age of the data and the fact that hypothyroidism subtypes were not considered.