Journal: Journal of Computer Science & Systems Biology
Print ISSN: 0974-7230
Year: 2009
Volume: 2
Issue: 3
Pages: 180-185
DOI: 10.4172/jcsb.1000030
Publisher: OMICS Publishing Group
Abstract: Tandem repeats (TRs) are the most abundant repeats in the extragenic regions of genomes. Biologists have already found a large number of regulatory elements in these regions. These elements may profoundly affect chromatin structure formation in the nucleus and also hold important clues for the study of genetic evolution and phylogeny. This study attempts to mine rules describing how combinations of individual binding sites are distributed across tandem repeats in the human genome (http://www.trbase2.cn). The mined association rules would facilitate efforts to identify gene classes regulated by similar mechanisms and to predict regulatory elements accurately. Combinations of transcription factor binding sites in the tandem repeats are first obtained, and data mining techniques are then applied to mine association rules from these combinations. The discovered associations are further pruned to remove insignificant ones, yielding a final set of significant associations.
Keywords: Human tandem repeats; TRANSFAC database; Transcription factor binding sites; Data mining; Association rules
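
The mining and pruning steps summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the algorithm, the thresholds, or the significance test used, so everything below is an assumption: each tandem repeat is treated as a "transaction" holding the set of transcription factors with a binding site in it, pairwise rules are scored by support and confidence, and a rule is pruned when its lift does not exceed 1. The function name `mine_rules`, the threshold values, and the toy data are hypothetical.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Mine pairwise rules lhs -> rhs from sets of binding-site labels.

    Hypothetical helper: thresholds and the lift-based pruning step are
    illustrative assumptions, not values taken from the paper.
    """
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of repeats that contain every item in the itemset.
        return sum(1 for s in sets if itemset <= s) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(frozenset((a, b)))
        if pair_support < min_support:
            continue  # the pair co-occurs too rarely to be considered
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset((lhs,)))
            lift = confidence / support(frozenset((rhs,)))
            # Pruning: keep a rule only if it is confident and the two
            # sites co-occur more often than independence would predict.
            if confidence >= min_confidence and lift > 1.0:
                rules.append((lhs, rhs, pair_support, confidence, lift))
    return rules


# Toy input: one set of factor names per tandem repeat (made-up data).
repeats = [
    {"SP1", "AP2", "NFKB"},
    {"SP1", "AP2"},
    {"SP1", "AP2", "GATA1"},
    {"GATA1", "NFKB"},
    {"SP1", "GATA1"},
]

rules = mine_rules(repeats)
for lhs, rhs, sup, conf, lift in rules:
    print(f"{lhs} -> {rhs}  support={sup:.2f}  confidence={conf:.2f}  lift={lift:.2f}")
```

On this toy input only the SP1/AP2 pair survives both the support threshold and the pruning step. A real analysis would likely need larger itemsets (for example via an Apriori-style search) and a statistical test in place of the simple lift cutoff.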