Article information

  • Title: Model-based clustering with envelopes
  • Authors: Wenjing Wang; Xin Zhang; Qing Mai
  • Journal: Electronic Journal of Statistics
  • Print ISSN: 1935-7524
  • Year of publication: 2020
  • Volume: 14
  • Issue: 1
  • Pages: 82-109
  • DOI: 10.1214/19-EJS1652
  • Language: English
  • Publisher: Institute of Mathematical Statistics
  • Abstract: Clustering analysis is an important unsupervised learning technique in multivariate statistics and machine learning. In this paper, we propose a set of new mixture models called CLEMM (short for Clustering with Envelope Mixture Models) that is based on the widely used Gaussian mixture model assumptions and the nascent research area of envelope methodology. Formulated mostly for regression models, envelope methodology aims for simultaneous dimension reduction and efficient parameter estimation, and includes a very recent formulation of envelope discriminant subspace for classification and discriminant analysis. Motivated by the envelope discriminant subspace pursuit in classification, we consider parsimonious probabilistic mixture models where the cluster analysis can be improved by projecting the data onto a latent lower-dimensional subspace. The proposed CLEMM framework and the associated envelope-EM algorithms thus provide foundations for envelope methods in unsupervised and semi-supervised learning problems. Numerical studies on simulated data and two benchmark data sets show significant improvement of our proposed methods over classical methods such as Gaussian mixture models, K-means and hierarchical clustering algorithms. An R package is available at https://github.com/kusakehan/CLEMM.
  • Keywords: Clustering; computational statistics; dimension reduction; envelope methods; Gaussian mixture models