Abstract: Data mining is a technique in which historical data are explored in search of systematic relationships or consistent patterns among variables. The identified patterns are then validated by applying them to new subsets of data. This paper compares three predictive data-mining techniques, namely multiple linear regression, principal component regression, and partial least squares, on a unique dataset. The dataset is unique in that it combines outliers, highly collinear variables, and highly redundant variables among its predictor variables. In the first step, after preprocessing the data, a minimal number of variables that can fully predict the response variable is selected. The three techniques, each of which can be applied in different ways, were then implemented on the complete dataset; the best-performing method within each technique was identified and used for an overall comparison with the other techniques on the same data.
Keywords: Multiple linear regression; Principal component regression; Partial least squares