A generalization of the classical concordance correlation coefficient (CCC) is considered under a three-level design in which multiple raters rate every subject over time, and each rater rates every subject multiple times at each measurement time point. The ratings can be discrete or continuous. A methodology is developed for interval estimation of the CCC, based on a suitable linearization of the model together with an adaptation of the fiducial inference approach. The resulting confidence intervals have satisfactory coverage probabilities and shorter expected widths than the interval based on the Fisher Z-transformation, even under moderate sample sizes. Two real applications available in the literature are discussed. The first is based on a clinical trial to determine whether various treatments are more effective than a placebo for treating knee pain associated with osteoarthritis. There, the CCC was used to assess agreement among the manual measurements of joint space width made by two raters on plain radiographs and the computer-generated measurements from digitized radiographs. The second is a corticospinal tractography study, in which the CCC was again applied to evaluate the agreement between a well-trained technologist and a neuroradiologist on measurements of fiber number in both the right and left corticospinal tracts. Other relevant applications of our general approach are highlighted in many areas, including artificial intelligence.
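For orientation, the baseline that the abstract compares against can be sketched for the simplest case only: the classical two-rater CCC with one rating per subject, and a confidence interval obtained through the Fisher Z-transformation. This is not the three-level generalization or the fiducial procedure described above. The function names `lin_ccc` and `ccc_fisher_z_interval` are hypothetical, and the asymptotic variance expression is the one commonly attributed to Lin's estimator, used here as an assumption rather than taken from this paper.

```python
import math
from statistics import NormalDist, fmean


def _moments(x, y):
    """Return n, both means, and the 1/n variances and covariance of paired ratings."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return n, mx, my, sxx, syy, sxy


def lin_ccc(x, y):
    """Classical two-rater CCC: 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2)."""
    _, mx, my, sxx, syy, sxy = _moments(x, y)
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


def ccc_fisher_z_interval(x, y, level=0.95):
    """Return (ccc, lower, upper) using the Z-transformation Z = atanh(ccc).

    Assumption: the large-sample variance of Z below is the commonly cited
    form for Lin's estimator. It is undefined when the Pearson correlation
    is 0 or when |ccc| = 1, and those cases are not handled here.
    """
    n, mx, my, sxx, syy, sxy = _moments(x, y)
    r = sxy / math.sqrt(sxx * syy)            # Pearson correlation
    ccc = 2.0 * sxy / (sxx + syy + (mx - my) ** 2)
    u = (mx - my) / (sxx * syy) ** 0.25       # location shift scaled by sqrt(sd_x * sd_y)
    var_z = (
        (1 - r**2) * ccc**2 / ((1 - ccc**2) * r**2)
        + 2 * ccc**3 * (1 - ccc) * u**2 / (r * (1 - ccc**2) ** 2)
        - ccc**4 * u**4 / (2 * r**2 * (1 - ccc**2) ** 2)
    ) / (n - 2)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) * math.sqrt(var_z)
    z = math.atanh(ccc)
    # Back-transform the symmetric interval on the Z scale to the CCC scale.
    return ccc, math.tanh(z - half_width), math.tanh(z + half_width)


# Illustrative data only (made up for this sketch, not from either application).
rater_a = [1, 2, 3, 4, 5, 6, 7, 8]
rater_b = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.4]
estimate, lower, upper = ccc_fisher_z_interval(rater_a, rater_b)
```

Because the interval is built on the Z scale and then mapped back with `tanh`, its endpoints always stay inside (-1, 1), but it is asymmetric around the point estimate on the CCC scale.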