Abstract

Objective
The purpose of this study was to investigate the use of large-scale medical claims data for local surveillance of under-immunization for childhood infections in the United States, to develop a statistical framework for integrating disparate data sources on surveillance of vaccination behavior, and to identify the determinants of vaccine hesitancy behavior.

Introduction
In the United States, surveillance of vaccine uptake for childhood infections is limited in scope and spatial resolution. The National Immunization Survey (NIS), the gold standard tool for monitoring vaccine uptake among children aged 19-35 months, is typically constrained to producing coarse state-level estimates [1]. In recent years, vaccine hesitancy (i.e., a desire to delay or refuse vaccination, despite availability of vaccination services) [2] has resurged in the United States, challenging the maintenance of herd immunity. In December 2014, foreign importation of the measles virus to Disney theme parks in Orange County, California resulted in an outbreak of 111 measles cases, 45% of which were among unvaccinated individuals [3]. Digital health data offer new opportunities to study the social determinants of vaccine hesitancy in the United States and to identify clusters of under-immunization at finer spatial resolution, using data with greater clinical accuracy and with a recorded rationale for hesitancy [4].

Methods
Our U.S. medical claims data comprised monthly reports of diagnosis codes for under-immunization and vaccine refusal (Figure 1). These claims were aggregated to five-digit ZIP codes by patient age group from 2012 to 2015. Spatial generalized linear mixed models were used to generate county-level maps for surveillance of under-immunization and to identify the determinants of vaccine hesitancy, such as income, education, household size, religious group representation, and healthcare access.
We developed a Bayesian modeling framework that separates the observation of vaccine hesitancy in our data from true underlying rates of vaccine hesitancy in the community. Our model structure also enabled us to borrow information from neighboring counties, which improves prediction of vaccine hesitancy in areas with missing or minimal data. Estimates of the posterior distributions of model parameters were generated via Markov chain Monte Carlo (MCMC) methods.

Results
Our modeling framework enabled the production of county-level maps of under-immunization and vaccine refusal in the United States from 2012 to 2015, the identification of geographic clusters of under-immunization, and the quantification of the association between various epidemiological factors and vaccination status. In addition, we found that our model structure enabled us to account for spatial variation in the reporting of vaccine hesitancy, which improved our estimation.

Conclusions
Our work demonstrates the utility of large-scale medical claims data for improving surveillance systems for vaccine uptake and for assessing the social and ecological determinants of vaccine hesitancy. We describe a flexible, hierarchical modeling framework for integrating disparate data sources, particularly for data collected through different measurement processes or at different spatial scales. Our findings will enhance our understanding of the causes of under-immunization, inform the design of vaccination policy, and aid in the development of targeted public health strategies for optimizing vaccine uptake.

Figure 1. Instances of vaccine refusal (per 100,000 population) for United States counties in 2014 as observed in medical claims data.
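
The two ideas named in Methods, separating an observation process from the true underlying rate and borrowing information from neighboring counties, can be illustrated with a minimal sketch. The abstract does not specify the likelihood or the spatial prior, so everything below is an assumption made for illustration: an intrinsic conditional autoregressive (ICAR) precision matrix stands in for the spatial structure, a Poisson draw stands in for true cases, and binomial thinning stands in for incomplete reporting. The function names, the four-county layout, and all numbers are hypothetical.

```python
import numpy as np


def icar_precision(adjacency):
    """Return Q = D - W, the precision matrix of an intrinsic CAR prior.

    W is a symmetric 0/1 county adjacency matrix and D holds each county's
    neighbor count on its diagonal. (Assumed prior, not taken from the abstract.)
    """
    W = np.asarray(adjacency, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def simulate_claims(log_rate, population, reporting_prob, rng):
    """Draw true hesitancy cases, then thin them by a reporting probability."""
    # Process layer: latent cases driven by the true underlying rate.
    true_cases = rng.poisson(population * np.exp(log_rate))
    # Observation layer: only a fraction of true cases appear in claims data.
    observed = rng.binomial(true_cases, reporting_prob)
    return true_cases, observed


# Hypothetical example: four counties arranged in a line (1-2-3-4).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
Q = icar_precision(W)

rng = np.random.default_rng(0)
population = np.array([50_000, 120_000, 8_000, 30_000])
log_rate = np.log(np.array([2e-4, 3e-4, 5e-4, 1e-4]))  # made-up true rates
reporting_prob = np.array([0.9, 0.6, 0.3, 0.8])        # made-up reporting levels
true_cases, observed = simulate_claims(log_rate, population, reporting_prob, rng)
```

Under this assumed prior, the quadratic form x^T Q x equals the sum of squared differences between neighboring counties' effects, which is what pulls estimates for sparsely observed counties toward their neighbors. In a full analysis the latent rates and reporting probabilities would be inferred by MCMC rather than fixed as they are here.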