Matched case-control studies are commonly employed in epidemiological research for their convenience and efficiency. Analysis of secondary outcomes can yield valuable insights into biological pathways and help identify important genetic variants. Naive analysis using standard statistical methods, such as least-squares regression for quantitative traits, can be misleading because these methods fail to account for the unequal sampling induced by the case-control design and matching. In this paper, we propose novel statistical methods that appropriately reflect the study design and sampling scheme in the analysis of secondary outcome data. The new methods provide consistent estimators and confidence intervals with accurate coverage probabilities. We demonstrate the advantages of the new methods through simulation studies and a real-data application involving diabetes patients. R code implementing the proposed methods is publicly available.