Abstract: We retrieve news stories and earnings announcements for the S&P 100 constituents from two professional news providers, along with ten macroeconomic indicators. We also gather Google Trends data on these firms’ assets as a proxy for retail investors’ attention. The result is an extensive and novel database with precise information for analyzing the link between news and asset price dynamics. We detect the sentiment of news stories using a dictionary of sentiment-related words and negations, and we propose a set of more than five thousand information-based variables that serve as natural proxies for the information used by heterogeneous market players. We first examine the impact of these information measures on daily realized volatility and select among them by penalized regression. We then perform a forecasting exercise and show that the model augmented with news-related variables provides superior volatility forecasts.
Keywords: volatility; news; Google Trends; sentiment analysis; big data; lasso; regularization