Abstract: In big data settings, the large number of features and the complexity of their data types leave traditional feature extraction and knowledge reasoning methods unable to adapt. To address this, the study proposes a feature extraction method for museum big data based on a similarity mapping algorithm. Text information about museums is first collected with web crawler technology; the crawler indexes website content across the Internet so that museum websites can appear in search engine results. The collected text is then denoised and smoothed with a Gaussian filter, and a mapping matrix is constructed for the processed text information set. Semantic similarity is computed from the concepts of the words in the text. Based on these results, features of the museum text information are extracted using term frequency and inverse document frequency weights. Simulation results show that the proposed method achieves high accuracy and a short extraction time. Comparative analysis indicates that the method both resolves the problems of traditional methods and lays a foundation for analyzing massive museum data.
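The abstract does not give the weighting formula, so the following is only a minimal sketch of the final step, assuming the common definition in which a term's weight is its relative frequency in a document multiplied by the logarithm of the inverse document frequency. The function names `tf_idf` and `top_features` and the toy documents are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. This assumes the common
    tf * log(N / df) weighting, not necessarily the study's exact formula.
    """
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_features(doc_weights, k=3):
    """Return the k terms with the highest weight in one document."""
    return sorted(doc_weights, key=doc_weights.get, reverse=True)[:k]


# Hypothetical, already tokenized museum texts.
docs = [
    ["bronze", "vase", "museum"],
    ["painting", "museum"],
    ["bronze", "mirror", "museum"],
]
weights = tf_idf(docs)
```

Under this weighting, a term that occurs in every document, such as "museum" in the toy data, receives a weight of zero, while a term confined to a single document ranks highest for that document.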