Abstract: The healthcare domain is often described as "information rich" yet "knowledge poor". An unprecedented effort is now underway to apply business intelligence techniques to this problem. Heart disease (HD) is a major cause of mortality in modern society. This paper analyzes the risk factors that have been identified in cardiovascular disease (CVD) surveillance systems. The Heart Care study identifies attributes related to CVD risk (gender, age, smoking habit, etc.) and dependent variables that indicate specific conditions (diabetes, hypertension, cardiac disease, etc.). We combine Clustering, Association Rules, and Neural Networks to assess heart-event-related risk factors, with the aim of reducing CVD risk. First, the K-means algorithm is used to find significant groups of patients. Then, the Apriori algorithm is applied to uncover the relations between attributes, first across the whole dataset and then within the subsets defined by the clusters, which refines the results. Finally, both sets of results allow us to better define patients' characteristics and to predict CVD risk with a Multilayer Perceptron Neural Network. The results obtained with this hybrid information mining approach indicate that it is an effective strategy for knowledge discovery concerning chronic diseases, particularly for CVD risk.
Keywords: information mining; knowledge discovery; data exploration; data mining; big and rich data
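
As a minimal sketch of the association-rule step described in the abstract, the Python code below runs a level-wise Apriori search for frequent itemsets, first over all records and then within each cluster. The records, attribute names, cluster labels, the `frequent_itemsets` helper, and the 0.5 support threshold are all hypothetical illustrations and not the study's data or implementation. In the described pipeline the cluster labels would come from K-means; here they are hard-coded.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori: map each frequent itemset to its support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: cnt / n for c, cnt in counts.items()
                    if cnt / n >= min_support}
        result.update(frequent)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        candidates = {a | b for a, b in combinations(list(frequent), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in frequent
                          for s in combinations(c, k - 1))}
    return result


# Hypothetical patient records: (set of attributes, assumed cluster label).
records = [
    ({"male", "smoker", "hypertension"}, 0),
    ({"male", "smoker", "hypertension"}, 0),
    ({"male", "smoker"}, 0),
    ({"female", "non_smoker"}, 1),
    ({"female", "non_smoker", "diabetes"}, 1),
    ({"female", "smoker", "hypertension"}, 1),
]

# Pass 1: mine the whole dataset.
whole = frequent_itemsets([items for items, _ in records], min_support=0.5)

# Pass 2: mine each cluster separately to refine the result.
clusters = defaultdict(list)
for items, label in records:
    clusters[label].append(items)
by_cluster = {label: frequent_itemsets(ts, min_support=0.5)
              for label, ts in clusters.items()}
```

On this toy data, the itemset {male, smoker, hypertension} falls below the threshold over the whole dataset but is frequent within cluster 0, which is the kind of pattern that per-cluster mining is meant to expose.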