
Article Information

  • Title: Visualization of Data Mining Techniques for the Prediction of Breast Cancer with High Accuracy Rates
  • Authors: Sharma, Vasudev ; Sharma, Vasudev ; Rajasekaran, Raj Kumar
  • Journal: Journal of Computer Science
  • Print ISSN: 1549-3636
  • Year of publication: 2019
  • Volume: 15
  • Issue: 1
  • Pages: 118-130
  • DOI: 10.3844/jcssp.2019.118.130
  • Publisher: Science Publications
  • Abstract: Breast cancer is one of the leading causes of death in women worldwide. Around one in 30 women is affected by breast cancer. Mammography has helped detect breast cancer in its early stages, which has reduced mortality. The diagnosis of breast cancer depends on a variety of parameters. In this paper, we aim to create the best model for predicting breast cancer through preprocessing, feature extraction, data visualization and prediction using breast cancer data. Visualization techniques such as the violin plot, grid plot, swarm plot and heat plot were used for proper feature extraction, which improved the accuracy of our results. For prediction, we used algorithms such as the random forest and the decision tree with single and multiple predictors, along with a commonly used statistical model, logistic regression. We also relied on 5-fold cross-validation to check that the performance estimates of the prediction models are unbiased. The models were analyzed and the best model was selected based on its accuracy. The results showed that the random forest model provided an accuracy rate of 94.724% with a decent 5-fold cross-validation score, followed by the decision tree model, which had an accuracy rate of 100% with a poor 5-fold cross-validation score. This was followed by the logistic regression model, which had an accuracy rate of 88.442% with a low 5-fold cross-validation score.
  • Keywords: Mammography; Data Visualization; Violin Plot; Swarm Plot; Random Forest; Logistic Regression; Decision Tree; 5-Fold Cross Validation
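
The abstract contrasts a model's accuracy rate with its 5-fold cross-validation score, noting that the decision tree reached 100% accuracy yet cross-validated poorly. The sketch below illustrates how such a gap can arise. It is not the authors' code or data: the dataset is synthetic, the noise rate is arbitrary, and a 1-nearest-neighbour "memorizer" stands in for an overfitted model (it is not a decision tree). All function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Mean accuracy on the held-out fold, averaged over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in held_out])
        scores.append(accuracy([y[i] for i in held_out], preds))
    return sum(scores) / k


def fit_memorizer(X, y):
    """Assumed stand-in for an overfitted model: store every training pair."""
    return list(zip(X, y))


def predict_memorizer(model, X):
    """Predict the label of the nearest stored point (1-nearest-neighbour)."""
    preds = []
    for x in X:
        nearest = min(
            model,
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
        )
        preds.append(nearest[1])
    return preds


# Synthetic two-feature data with 20% label noise (an assumption for
# illustration only; unrelated to the breast cancer data in the paper).
rng = random.Random(42)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
y = []
for x0, x1 in X:
    label = 1 if x0 + x1 > 0 else 0
    if rng.random() < 0.2:
        label = 1 - label  # flip the label to simulate noise
    y.append(label)

# Accuracy measured on the same data the model was fitted on.
train_acc = accuracy(y, predict_memorizer(fit_memorizer(X, y), X))
# Accuracy measured on folds the model never saw during fitting.
cv_acc = cross_val_accuracy(fit_memorizer, predict_memorizer, X, y, k=5)

print(f"training accuracy:        {train_acc:.3f}")
print(f"5-fold cross-validation:  {cv_acc:.3f}")
```

Because the memorizer stores every training point, its accuracy on the training data is perfect by construction, while its held-out accuracy on noisy data is typically noticeably lower. A gap of this kind is one common reading of a model that reports 100% accuracy with a poor cross-validation score.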