Basic article information

Title: Applying PCA to Deep Learning Forecasting Models for Predicting PM2.5
Authors: Sang Won Choi; Brian H. S. Kim
Journal: Sustainability
Print ISSN: 2071-1050
Publication year: 2021
Volume: 13
Issue: 7
Pages: 3726
DOI: 10.3390/su13073726
Language: English
Publisher: MDPI, Open Access Journal
Abstract: Fine particulate matter (PM2.5) is one of the main air pollution problems that occur in major cities around the world. A country’s PM2.5 can be affected not only by country factors but also by the neighboring country’s air quality factors. Therefore, forecasting PM2.5 requires collecting data from outside the country as well as from within which is necessary for policies and plans. The data set of many variables with a relatively small number of observations can cause a dimensionality problem and limit the performance of the deep learning model. This study used daily data for five years in predicting PM2.5 concentrations in eight Korean cities through deep learning models. PM2.5 data of China were collected and used as input variables to solve the dimensionality problem using principal components analysis (PCA). The deep learning models used were a recurrent neural network (RNN), long short-term memory (LSTM), and bidirectional LSTM (BiLSTM). The performance of the models with and without PCA was compared using root-mean-square error (RMSE) and mean absolute error (MAE). As a result, the application of PCA in LSTM and BiLSTM, excluding the RNN, showed better performance: decreases of up to 16.6% and 33.3% in RMSE and MAE values. The results indicated that applying PCA in deep learning time series prediction can contribute to practical performance improvements, even with a small number of observations. It also provides a more accurate basis for the establishment of PM2.5 reduction policy in the country.
Keywords: principal components analysis (PCA); PM2.5; recurrent neural network (RNN); long short-term memory (LSTM); bidirectional LSTM (BiLSTM); deep learning
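
Illustrative sketch (not taken from the article): the abstract describes reducing a wide set of input variables with PCA before forecasting, then scoring forecasts with RMSE and MAE. The NumPy code below shows one way those two steps could look. The function names, the 95% explained-variance threshold, and the synthetic data are assumptions made for illustration only; the article's actual preprocessing, component count, and network architecture are not stated in this record.

```python
import numpy as np


def pca_reduce(X, var_threshold=0.95):
    """Project X (n_samples, n_features) onto the fewest principal
    components whose cumulative explained variance reaches var_threshold.
    The 0.95 default is an assumed value, not one from the article."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    k = min(k, len(S))
    components = Vt[:k]
    return Xc @ components.T, components, mean


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


# Synthetic stand-in for "many variables, few observations":
# 200 daily rows, 20 correlated features driven by 3 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 20))

Z, components, mean = pca_reduce(X)
print("original features:", X.shape[1], "-> retained components:", Z.shape[1])
```

The reduced matrix `Z` would then replace the raw features as input to a sequence model such as an LSTM, and `rmse` and `mae` would be computed on held-out predictions to compare runs with and without PCA.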