Abstract: Journal editors and conference organizers frequently rely on peer review to assess the quality of submissions. Peer review is a procedure in which independent colleagues with expertise in the same area of research rate the submission. The present study investigates the reliability of review ratings given by different reviewers. To that end, we studied the reviews written for the five most recent editions of the general conference of the German Communication Association (DGPuK) and of the annual conferences of its five largest divisions (Fachgruppen). Based on 3,537 reviews from 23 conferences, we analyze inter-rater reliability (Krippendorff’s α and Brennan and Prediger’s κ) and rating ranges, for both criteria-based scores (fit with the conference theme, innovativeness, relevance, theory, method, clarity of presentation) and overall scores. The study shows substantial disagreement between reviewers, for overall scores as well as for criteria-based scores. Calculating mean or sum scores across criteria leads to higher agreement between reviewers. We discuss potential modifications to optimize review procedures.
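For illustration only, the following minimal sketch shows how the two agreement coefficients named above could be computed for a single pair of reviewers rating submissions on an ordinal scale. It is not the authors’ analysis code; the ratings are invented, and the `krippendorff` Python package is an assumed dependency.

```python
import numpy as np
import krippendorff  # assumed dependency: pip install krippendorff

# Hypothetical ratings of 8 submissions by 2 reviewers (1 = poor ... 5 = excellent);
# np.nan marks a submission not rated by that reviewer.
ratings = np.array([
    [4, 2, 5, 3, 1, 4, np.nan, 2],
    [3, 2, 4, 5, 1, 4, 3,      2],
], dtype=float)

# Krippendorff's alpha, treating the review scores as ordinal data.
alpha = krippendorff.alpha(reliability_data=ratings,
                           level_of_measurement="ordinal")

def brennan_prediger_kappa(r1, r2, n_categories):
    """Brennan & Prediger's kappa for two raters: observed agreement
    corrected by the uniform chance agreement 1 / n_categories."""
    mask = ~np.isnan(r1) & ~np.isnan(r2)
    observed = np.mean(r1[mask] == r2[mask])
    chance = 1.0 / n_categories
    return (observed - chance) / (1.0 - chance)

kappa = brennan_prediger_kappa(ratings[0], ratings[1], n_categories=5)
print(f"Krippendorff's alpha: {alpha:.2f}, Brennan & Prediger's kappa: {kappa:.2f}")
```

Both coefficients range up to 1 for perfect agreement; values near 0 indicate agreement no better than chance, which is the kind of disagreement pattern the study examines.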