Abstract: Automatic classification methods applied to sky surveys have revolutionized the astronomical target selection process. Most surveys generate vast numbers of time series, or "lightcurves," that record the brightness variability of stellar objects over time. Unfortunately, the observations of a lightcurve take several years to complete, so surveys produce truncated time series that are generally not run through automatic classifiers until they are finished. This happens because state-of-the-art methods rely on a variety of statistical descriptors, or features, whose dispersion increases as the number of observations decreases, which reduces classification precision. In this paper, we propose a novel method that improves the performance of automatic classifiers of variable stars by accounting for the deviations that a scarcity of observations produces in the features. Our method uses Gaussian process regression to form a probabilistic model of each lightcurve's observations. Then, based on this model, bootstrapped samples of the time series features are generated. Finally, a bagging approach is used to improve the overall performance of the classification. We perform tests on the MAssive Compact Halo Object (MACHO) and Optical Gravitational Lensing Experiment (OGLE) catalogs. The results show that our method effectively classifies some variability classes using a small fraction of the original observations. For example, we found that RR Lyrae stars can be classified with ~80% accuracy using only the first 5% of each lightcurve's observations in the MACHO and OGLE catalogs. We believe these results show that, when studying lightcurves, it is important to consider the error of the features and how the measurement process affects it.
Keywords: methods: data analysis; stars: statistics; surveys
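
The three steps named in the abstract (Gaussian process model, bootstrapped feature samples, bagging) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the squared-exponential kernel and its fixed hyperparameters, the two toy features (standard deviation and amplitude), the function names, and the simplified majority vote, which stands in for whatever bagging scheme the authors actually use.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    # Squared-exponential covariance between two sets of times (assumed kernel).
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(t_obs, y_obs, t_new, noise=0.1, length_scale=1.0, variance=1.0):
    # Standard GP regression posterior mean and covariance at t_new.
    K = rbf_kernel(t_obs, t_obs, length_scale, variance)
    K = K + noise ** 2 * np.eye(len(t_obs))
    K_s = rbf_kernel(t_obs, t_new, length_scale, variance)
    K_ss = rbf_kernel(t_new, t_new, length_scale, variance)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    v = np.linalg.solve(L, K_s)
    return K_s.T @ alpha, K_ss - v.T @ v

def bootstrap_features(t_obs, y_obs, n_samples=200, seed=0):
    # Draw lightcurves from the GP posterior and compute toy features on each
    # draw, giving an empirical distribution of features for one object.
    rng = np.random.default_rng(seed)
    offset = y_obs.mean()
    mean, cov = gp_posterior(t_obs, y_obs - offset, t_obs)
    cov = 0.5 * (cov + cov.T) + 1e-8 * np.eye(len(t_obs))  # numerical jitter
    draws = rng.multivariate_normal(mean + offset, cov, size=n_samples)
    std = draws.std(axis=1)
    amplitude = draws.max(axis=1) - draws.min(axis=1)
    return np.column_stack([std, amplitude])

def bagged_vote(predict, feats):
    # Simplified bagging step: classify every feature sample with a
    # user-supplied (hypothetical) classifier and return the majority label.
    labels = np.array([predict(f) for f in feats])
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]
```

A truncated lightcurve with few observations yields a wide spread in `bootstrap_features`, and the vote in `bagged_vote` aggregates over that spread instead of trusting a single point estimate of each feature.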