Article Information

  • Title: Adaptive Dictionary-based Compression of Protein Sequences
  • Authors: Akash Nag; Sunil Karforma
  • Journal: International Journal of Education and Management Engineering (IJEME)
  • Print ISSN: 2305-3623
  • Online ISSN: 2305-8463
  • Year: 2017
  • Volume: 7
  • Issue: 5
  • Pages: 1-6
  • DOI: 10.5815/ijeme.2017.05.01
  • Publisher: MECS Publisher
  • Abstract: This paper introduces CAD, a simple and fast lossless algorithm for compressing protein sequences. The proposed algorithm is especially suited to compressing proteomes, a proteome being the collection of all proteins expressed by an organism. The algorithm maintains a changing dictionary of actively used amino-acid residues and combines this adaptive dictionary with Huffman coding to achieve an average compression rate of 3.25 bits per symbol, better than most other existing protein-compression and general-purpose compression algorithms known to us. With an average compression ratio of 2.46:1 and an average throughput of 1.32 million residues/sec, our algorithm outperforms every other compression algorithm for protein sequences in terms of the balance between compression time and compression rate.
  • Keywords: Protein sequence compression; dictionary-based compression; Huffman encoding
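
The two headline figures in the abstract (3.25 bits per symbol and a 2.46:1 ratio) can be related by simple arithmetic. The sketch below assumes the uncompressed input stores one residue per 8-bit character; that baseline is an assumption that happens to reproduce the reported ratio, not something the record states. The constant and function names are hypothetical. For comparison it also prints the cost of a plain fixed-length code over the 20 standard amino acids.

```python
import math

# Assumption: each residue occupies one 8-bit character in the uncompressed file.
BASELINE_BITS_PER_RESIDUE = 8
# Average compression rate reported in the abstract.
REPORTED_BITS_PER_SYMBOL = 3.25


def compression_ratio(bits_per_symbol, baseline_bits=BASELINE_BITS_PER_RESIDUE):
    """Return the uncompressed-to-compressed size ratio for a given rate."""
    return baseline_bits / bits_per_symbol


ratio = compression_ratio(REPORTED_BITS_PER_SYMBOL)

# Bits per residue needed by a fixed-length code over 20 equally likely symbols.
fixed_length_bits = math.log2(20)

print(f"ratio at {REPORTED_BITS_PER_SYMBOL} bits/symbol: {ratio:.2f}:1")
print(f"fixed-length code over 20 residues: {fixed_length_bits:.2f} bits/symbol")
```

Under the 8-bit assumption the computed ratio rounds to 2.46:1, matching the abstract, and the reported rate sits roughly one bit per residue below the fixed-length figure.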