
Article Information

  • Title: Minimum Spanning Tree-based Clustering Applied to Protein Sequences in Early Cancer Diagnosis
  • Authors: Dr. T. Karthikeyan ; S. John Peter ; B. Praburaj
  • Journal: International Journal of Computer Science & Technology
  • Print ISSN: 2229-4333
  • Online ISSN: 0976-8491
  • Year: 2012
  • Volume: 3
  • Issue: 1Ver4
  • Publisher: Ayushmaan Technologies
  • Abstract: Efficient discovery of molecular patterns in cancer is essential in molecular diagnostics. The number of amino acid sequences in protein databases is increasing very rapidly, but structures are available in the Protein Data Bank for only some of them. Thus an important problem in genomics is automatically clustering homogeneous protein sequences when only sequence information is available. The characteristics of protein expression data challenge traditional unsupervised classification algorithms. In this paper we use a Minimum Spanning Tree based clustering algorithm to cluster amino acid sequences. A similarity graph is defined, and a cluster in that graph corresponds to a connected subgraph. Cluster analysis seeks to group amino acid sequences into subsets based on the Euclidean distance between pairs of sequences. Our goal is to find disjoint subsets, called clusters, such that two criteria are satisfied: homogeneity (sequences in the same cluster are highly similar to each other) and separation (sequences in different clusters have low similarity to each other). A thorough understanding of genes depends on having adequate information about proteins. Solving protein-related problems has become one of the most important challenges in bioinformatics. The number of known protein sequences exceeds half a million, and it is necessary to find a meaningful partition of them in order to detect their functions. A method that can enhance the structural recognition, classification and interpretation of proteins will be advantageous. Many methods have been adopted to solve such bioinformatics problems. Our Minimum Spanning Tree based clustering algorithm is a useful and efficient method for the collective study of protein subsets. The key feature of the algorithm is its ability to predict the 3D structure of an unknown protein sequence.
  • Keywords: Euclidean Minimum Spanning Tree; Subtree; Eccentricity; Center; Hierarchical Clustering; Cluster Validity; Cluster Separation
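
The abstract describes clustering sequences by building a Euclidean minimum spanning tree and treating connected subgraphs as clusters. The record does not give the paper's actual procedure, so the following is only a minimal sketch of the general technique under stated assumptions: sequences are mapped to 20-dimensional amino acid composition vectors (the `composition` helper is hypothetical, not taken from the paper), the tree is built with Prim's algorithm, and the k-1 longest tree edges are cut to leave k clusters. The subtree, eccentricity, center and cluster validity steps named in the keywords are not modeled here.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Hypothetical feature map: fraction of each amino acid in the sequence."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the tree as a list of (weight, i, j) edges.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from the tree to each vertex
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = euclidean(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, k):
    """Cut the k-1 longest MST edges; return one cluster label per point."""
    if k < 1:
        raise ValueError("k must be at least 1")
    edges = sorted(minimum_spanning_tree(points))
    kept = edges[: max(len(edges) - (k - 1), 0)]

    # Union-find over the kept edges gives the connected components.
    root = list(range(len(points)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]
```

For example, `mst_clusters([composition(s) for s in sequences], k=2)` would split a list of sequence strings into two groups. Cutting the longest edges is one common way to obtain well-separated clusters from a spanning tree; the choice of k and of the feature map are left to the user in this sketch.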