Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Background: Assessing liver fibrosis is vital for guiding therapeutic decisions
and prognostic evaluation in chronic hepatitis. Liver biopsy is considered the definitive
investigation for staging liver fibrosis, but it has several limitations. FIB-4 and
APRI also have limited accuracy. The National Committee for Control of Viral Hepatitis
(NCCVH) in Egypt has supplied a valuable pool of electronic patient data that data mining
techniques can analyze to disclose hidden patterns and trends, leading to the development of
predictive algorithms.
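For reference, the two indices named in the Background, FIB-4 and APRI, are simple formulas over routine laboratory values. The sketch below uses their standard published definitions; the function names, the patient values, and the AST upper limit of normal (40 U/L) are illustrative assumptions and do not come from the study.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = (age * AST) / (platelet count * sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, ast_uln_u_l, platelets_10e9_l):
    """APRI = (AST / upper limit of normal AST) / platelet count * 100."""
    return (ast_u_l / ast_uln_u_l) / platelets_10e9_l * 100


# Hypothetical patient, for illustration only (not study data).
patient = {"age": 50, "ast": 80.0, "alt": 64.0, "platelets": 150.0}

fib4_score = fib4(patient["age"], patient["ast"], patient["alt"], patient["platelets"])
apri_score = apri(patient["ast"], 40.0, patient["platelets"])  # assumed AST ULN = 40 U/L

print(f"FIB-4 = {fib4_score:.2f}, APRI = {apri_score:.2f}")
```

Both scores depend on only three or four inputs, which makes them cheap to compute but also limits how much information they can carry.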
Aim: To collaborate with physicians to develop a novel, reliable, easy-to-comprehend, noninvasive
model that predicts the stage of liver fibrosis from the routine workup, without imposing extra costs
for additional examinations, especially in resource-limited settings such as Egypt.
Methods: This multicenter retrospective study included baseline demographic, laboratory,
and histopathological data of 69,106 patients with chronic hepatitis C. We began with data
collection, preprocessing, cleansing, and formatting for knowledge discovery from
electronic health records (EHRs). Data mining was used to build a decision tree
(Reduced Error Pruning tree, REPTree) with 10-fold internal cross-validation. Histopathology
results served as the reference for assessing the accuracy of predicted fibrosis stages. Machine learning
feature selection and reduction (CfsSubsetEval / BestFirst) reduced the initial number of input features (N=15) to the
most relevant ones (N=6) for developing the prediction model.
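The feature-reduction step in the Methods can be illustrated with a minimal sketch of correlation-based feature selection (CFS). It scores a subset of k features with the CFS merit, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-to-class correlation and r_ff the mean feature-to-feature correlation. Several things here are assumptions made for illustration: Pearson correlation stands in for the correlation measure Weka uses internally, a plain greedy forward search stands in for best-first search, and the feature names and values are toy data, not the study's attributes.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cfs_merit(subset, features, target):
    """CFS merit: rewards correlation with the class, penalizes redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(features, target):
    """Greedy forward search (a simplification of best-first search)."""
    selected, best = [], 0.0
    remaining = sorted(features)
    while remaining:
        merit, name = max((cfs_merit(selected + [f], features, target), f)
                          for f in remaining)
        if merit <= best:  # stop when no candidate improves the merit
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best


# Toy data with hypothetical feature names (not the study's attributes).
target = [1, 2, 3, 4, 5, 6]
features = {
    "strong": [1, 2, 3, 4, 5, 7],
    "weak": [2, 1, 4, 3, 6, 5],
    "noise": [3, 1, 4, 1, 5, 9],
}
selected, merit = greedy_cfs(features, target)
print(selected, round(merit, 3))
```

Because the merit's denominator grows with feature-to-feature correlation, a feature that merely duplicates one already selected adds little, which is how a search of this kind can shrink a feature set such as 15 inputs down to a handful.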
Keywords: Liver Fibrosis; Data Mining; Weka; Decision Tree; Attribute Reduction; Tree Pruning.