Article information

  • Title: Statistical-based database fingerprint: chemical space dependent representation of compound databases
  • Authors: Norberto Sánchez-Cruz ; José L. Medina-Franco
  • Journal: Journal of Cheminformatics
  • Print ISSN: 1758-2946
  • Electronic ISSN: 1758-2946
  • Publication year: 2018
  • Volume: 10
  • Issue: 1
  • Page: 55
  • DOI: 10.1186/s13321-018-0311-x
  • Language: English
  • Publisher: BioMed Central
  • Abstract: Simplified representation of compound databases has several applications in cheminformatics. Herein, we introduce an alternative and general method to build single fingerprint representations of compound databases. The approach is inspired by the previously published modal fingerprints, which aim to capture the most significant bits of a fingerprint representation for a compound data set. The novelty of the proposed statistical-based database fingerprint (SB-DFP) is that it is generated from binomial proportion comparisons, taking as reference the distribution of “1” bits in a large representative set of the chemical space. To illustrate the method, SB-DFPs were constructed for 28 epigenetic target data sets retrieved from a recently published epigenomics database of interest in probe and drug discovery. For each target data set, the SB-DFPs were built from two representative fingerprints of different design, using as reference a data set with more than 15 million compounds from ZINC. The application of SB-DFP was illustrated and compared to other methods through association relationships of the 28 epigenetic data sets and through similarity searching. Overall, SB-DFPs captured both the common features between data sets and the distinct features of each set. In similarity searching, SB-DFP equaled or outperformed other approaches for at least 20 of the 28 sets. SB-DFP is a general approach based on binomial proportion comparisons to represent a compound data set with a single fingerprint. SB-DFP can be developed, at least in principle, from any fingerprint and reference data set. SB-DFP is a good alternative for exploring relationships between targets through their associated compound data sets and for performing similarity searching.
  • Keywords: Chemical space ; Epi-informatics ; Molecular fingerprints ; Representation ; Similarity searching
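
The abstract describes the SB-DFP only at a high level: each bit is decided by comparing the proportion of “1” bits in the data set against the proportion in a large reference set. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a one-sided, one-sample z-test for a binomial proportion and a critical value of 2.576; the function name `sb_dfp`, its parameters, and the toy data are hypothetical and do not come from the paper.

```python
import math


def sb_dfp(fingerprints, ref_proportions, z_crit=2.576):
    """Sketch of a statistical-based database fingerprint.

    fingerprints    : list of equal-length 0/1 lists, one per compound.
    ref_proportions : per-bit fraction of "1" bits in the reference set.
    z_crit          : critical z value (assumed here, not from the paper).

    A bit is set to 1 when its "1" proportion in the data set is
    significantly higher than in the reference set.
    """
    n = len(fingerprints)
    if n == 0:
        raise ValueError("at least one fingerprint is required")

    result = []
    for j, p0 in enumerate(ref_proportions):
        # Observed proportion of "1" bits at position j in the data set.
        p_hat = sum(fp[j] for fp in fingerprints) / n
        # Standard error under the null hypothesis p = p0.
        se = math.sqrt(p0 * (1.0 - p0) / n)
        if se == 0.0:
            # Reference bit is constant; fall back to a direct comparison.
            result.append(1 if p_hat > p0 else 0)
            continue
        z = (p_hat - p0) / se
        result.append(1 if z > z_crit else 0)
    return result


# Toy example: 4 compounds, 3-bit fingerprints (illustrative data only).
toy_set = [[1, 1, 1], [1, 0, 1], [1, 1, 0], [1, 0, 1]]
toy_ref = [0.1, 0.5, 0.9]
print(sb_dfp(toy_set, toy_ref))  # [1, 0, 0]
```

In the toy example only the first bit is set: it is on in every compound of the set but rare in the reference. The third bit is common in the set yet even more common in the reference, so it is not considered characteristic.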