Abstract: Under the global health crisis of COVID-19, timely and accurate epidemic data are important for observing, monitoring, analyzing, modeling, and predicting the pandemic and for mitigating its impacts. Viral case data can be jointly analyzed with relevant factors for various applications in the context of the pandemic. Current COVID-19 case data are scattered across a variety of data sources, which may suffer from low data quality and inconsistent data structures. To address this shortcoming, a multi-scale spatiotemporal data product is proposed as a public repository platform; it is based on a spatiotemporal cube and integrates different data sources by adopting various data standards. Within the spatiotemporal cube, a comprehensive data processing workflow gathers disparate COVID-19 epidemic datasets at the global, national, provincial/state, county, and city levels. The proposed framework is supported by an automatic update every 2 hours and a crowdsourcing validation team to produce and update data on a daily time step. This rapid-response dataset allows the integration of other relevant socio-economic and environmental factors for spatiotemporal analysis. The data are available on the Harvard Dataverse platform (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/8HGECN) and in a GitHub open-source repository (https://github.com/stccenter/COVID-19-Data).
Keywords: COVID-19 pandemic; public health; semi-automatic validation; spatiotemporal data set