A generalization of the classical concordance correlation coefficient (CCC) is considered under a three-level design in which multiple raters rate every subject over time, and each rater rates every subject multiple times at each measurement time point. The ratings can be discrete or continuous. A methodology is developed for interval estimation of the CCC, based on a suitable linearization of the model together with an adaptation of the fiducial inference approach. Even under moderate sample sizes, the resulting confidence intervals have satisfactory coverage probabilities and shorter expected widths than the interval based on the Fisher Z-transformation. Two real applications available in the literature are discussed. The first is based on a clinical trial to determine whether various treatments are more effective than a placebo for treating knee pain associated with osteoarthritis. There, the CCC was used to assess agreement among the manual measurements of joint space width on plain radiographs by two raters and the computer-generated measurements from digitized radiographs. The second example concerns corticospinal tractography, where the CCC was again applied to evaluate the agreement between a well-trained technologist and a neuroradiologist in their measurements of fiber number in both the right and left corticospinal tracts. Other relevant applications of our general approach are highlighted in many areas, including artificial intelligence.
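For orientation, the classical two-rater CCC that the abstract generalizes can be sketched in a few lines. This is only a minimal illustration of the standard point estimate, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), computed with 1/n moment estimates; it is not the three-level generalization or the fiducial interval described above. The function name `lin_ccc` and the sample ratings are hypothetical and made up for the example.

```python
from statistics import fmean


def lin_ccc(x, y):
    """Classical two-rater CCC point estimate (1/n moment estimates).

    Illustrative sketch only: one rating per subject per rater,
    no repeated measures and no time dimension.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both raters are constant and equal")
    return 2.0 * cov_xy / denom


# Made-up ratings: rater_b is rater_a shifted by a constant bias of 0.5.
rater_a = [2.1, 3.4, 4.0, 5.2, 6.8]
rater_b = [a + 0.5 for a in rater_a]

ccc_same = lin_ccc(rater_a, rater_a)     # perfect agreement
ccc_shifted = lin_ccc(rater_a, rater_b)  # perfectly correlated, but biased
print(ccc_same, ccc_shifted)
```

The shifted example shows why the CCC is used for agreement rather than the Pearson correlation: the two rating vectors are perfectly correlated, yet the constant bias pulls the CCC below 1.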