Article information

Title: Bayesian Unsupervised Learning of DNA Regulatory Binding Regions
Authors: Jukka Corander; Magnus Ekdahl; Timo Koski et al.
Journal: Advances in Artificial Intelligence
Print ISSN: 1687-7470
Electronic ISSN: 1687-7489
Publication year: 2009
Volume: 2009
DOI: 10.1155/2009/219743
Publisher: Hindawi Publishing Corporation
Abstract: Identification of regulatory binding motifs, that is, short specific words, within DNA sequences is a commonly occurring problem in computational bioinformatics. A wide variety of probabilistic approaches have been proposed in the literature to either scan for previously known motif types or to attempt de novo identification of a fixed number (typically one) of putative motifs. Most approaches assume the existence of reliable biodatabase information to build a probabilistic a priori description of the motif classes. Examples of attempts to do probabilistic unsupervised learning about the number of putative de novo motif types and their positions within a set of DNA sequences are very rare in the literature. Here we show how such a learning problem can be formulated using a Bayesian model that aims to simultaneously maximize the marginal likelihood of sequence data arising under multiple motif types as well as under the background DNA model, which is a variable-length Markov chain. It is demonstrated how the adopted Bayesian modelling strategy combined with recently introduced nonstandard stochastic computation tools yields a more tractable learning procedure than is possible with the standard Monte Carlo approaches. Improvements and extensions of the proposed approach are also discussed.