Journal title: Proceedings of the National Academy of Sciences
Print ISSN: 0027-8424
Online ISSN: 1091-6490
Publication year: 2014
Volume: 111
Issue: 46
Pages: 16262-16267
DOI: 10.1073/pnas.1314814111
Language: English
Publisher: The National Academy of Sciences of the United States of America
Abstract: Significance: The use of big data is becoming a central way of discovering knowledge in modern science. Large amounts of potential findings are screened to discover the few real ones. To verify these discoveries, a follow-up study is often conducted, wherein only the promising discoveries are followed up. Such follow-up studies are common in genomics, in proteomics, and in other areas where high-throughput methods are used. We show how to decide whether promising findings from the preliminary study are replicated by the follow-up study, keeping in mind that the preliminary study involved an extensive search for rare true signal in a vast amount of noise. The proposal computes a number, the r value, to quantify the strength of replication. We propose a formal method to declare that findings from a primary study have been replicated in a follow-up study. Our proposal is appropriate for primary studies that involve large-scale searches for rare true positives (i.e., needles in a haystack). Our proposal assigns an r value to each finding; this is the lowest false discovery rate at which the finding can be called replicated. Examples are given and software is available.
Keywords: false discovery rate; genome-wide association studies; metaanalysis; multiple comparisons; r value