Abstract: A key parameter in population genetics is the scaled mutation rate θ = 4Nμ, where N is the effective haploid population size and μ is the mutation rate per haplotype per generation. While exact likelihood inference is notoriously difficult in population genetics, we propose a novel approach to compute a first-order accurate likelihood of θ that is based on dynamic programming under the infinite sites model without recombination. The parameter θ may be either constant, i.e., time-independent, or time-dependent, which allows for changes in demography and deviations from neutral equilibrium. For time-independent θ, the performance is compared to the approach in Griffiths and Tavaré’s work “Simulating Probability Distributions in the Coalescent” (Theor. Popul. Biol. 1994, 46, 131–159), which is based on importance sampling and implemented in the “genetree” program. Roughly, the proposed method is computationally fast when n × θ < 100, where n is the sample size. For time-dependent θ(t), we analyze a simple demographic model with a single change in θ(t). In this case, the ancestral and current θ need to be estimated, as well as the time of change. To our knowledge, this is the first accurate computation of a likelihood in the infinite sites model with non-equilibrium demography.
Keywords: likelihood inference; population genetics; dynamic programming; scaled mutation rate; population demography