Abstract: Huntington's disease (HD) is a progressive neurodegenerative disorder caused by an expansion of CAG repeats in the IT15 gene. The age-at-onset (AAO) of HD is inversely related to the CAG repeat length, and the minimum length thought to cause HD is 36 repeats. Accurate estimation of the AAO distribution based on CAG repeat length is important for genetic counseling and for the design of clinical trials. In the Cooperative Huntington's Observational Research Trial (COHORT) study, the CAG repeat length is known for the proband participants. However, whether a family member shares the huntingtin gene status (CAG expanded or not) with the proband is unknown. In this
work, we use the expectation-maximization (EM) algorithm to handle the missing huntingtin gene information in first-degree family members in COHORT, assuming that a family member who carries a huntingtin gene mutation has the same CAG repeat length as the proband. We perform simulation studies to examine the performance of the proposed method and apply it to the combined COHORT proband and family data. Our analyses reveal that the estimated cumulative risk of HD symptom onset obtained from the combined data is slightly lower than the risk estimated from the proband data alone.
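
To illustrate how an EM algorithm can handle unknown carrier status, the following is a minimal sketch under strong simplifying assumptions and is not the estimator used in this work. It assumes a prior carrier probability of 0.5 for each relative (as autosomal dominant inheritance would suggest for a sibling or child of a carrier), an exponential onset distribution among carriers with a single rate, no onset among non-carriers, and no dependence on CAG repeat length. The function names `posterior_carrier_prob` and `em_onset_rate` are hypothetical.

```python
import math


def posterior_carrier_prob(age, onset_observed, rate, prior=0.5):
    """E-step: probability that a relative is a carrier given the data.

    Assumption: only carriers can develop HD, so an observed onset
    implies carrier status with certainty.
    """
    if onset_observed:
        return 1.0
    # Probability that a carrier is still onset-free at this age
    # under the assumed exponential onset model.
    surv = math.exp(-rate * age)
    # Bayes' rule: non-carriers are onset-free with probability 1.
    return prior * surv / (prior * surv + (1.0 - prior))


def em_onset_rate(ages, onset_observed, prior=0.5, rate=0.01,
                  n_iter=200, tol=1e-10):
    """Estimate the carrier onset rate when carrier status is missing.

    ages: positive age at onset (if observed) or at last follow-up.
    onset_observed: True if onset was observed at that age.
    """
    events = sum(1 for d in onset_observed if d)
    for _ in range(n_iter):
        # E-step: posterior carrier weight for every relative.
        weights = [posterior_carrier_prob(a, d, rate, prior)
                   for a, d in zip(ages, onset_observed)]
        # M-step: weighted exponential MLE (events / weighted exposure).
        exposure = sum(w * a for w, a in zip(weights, ages))
        new_rate = events / exposure
        converged = abs(new_rate - rate) < tol
        rate = new_rate
        if converged:
            break
    return rate


# Usage with made-up ages: three observed onsets, two censored relatives.
ages = [42.0, 55.0, 38.0, 70.0, 61.0]
onset = [True, True, True, False, False]
estimated_rate = em_onset_rate(ages, onset)
```

In this sketch a relative who remains onset-free at an advanced age receives a small carrier weight, so that person's follow-up time contributes little to the exposure used in the rate estimate.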