Accuracy, precision, and agreement statistical tests for Bland-Altman method

Background: The Bland and Altman plot is a widely cited graphical approach for assessing the equivalence of quantitative measurement techniques. Although widely applied, it is often misinterpreted because it lacks inferential statistical support. We aim to develop and distribute a statistical method in R that adds robust and suitable inferential statistics of equivalence. Methods: Three nested tests based on structural regressions are proposed to assess the equivalence of structural means (accuracy), the equivalence of structural variances (precision), and concordance with the structural bisector line (agreement between paired measurements obtained from the same subject), thereby providing statistical support for the equivalence of measurement techniques. Graphical outputs illustrating these three tests were added, following Bland and Altman's principle of easy communication. Results: Statistical p-values and a robust bootstrap approach, with corresponding graphs, provide objective, robust measures of equivalence. Five pairs of data sets were analyzed to critically reassess previously published articles that applied Bland and Altman's principles, thereby showing the suitability of the present statistical approach. One case demonstrated strict equivalence, three cases showed partial equivalence, and one case showed poor equivalence. A package containing open-source code and data is freely available on GitHub, with installation instructions. Conclusions: Statistical p-values and the robust approach assess the equivalence of accuracy, precision, and agreement for measurement techniques. Decomposing the assessment into three tests helps locate any disagreement, as a means of fixing a new technique.
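
The three-part decomposition described in the Methods can be illustrated with a minimal sketch. This is not the authors' R package, and it does not reproduce their nested tests or p-values. It rests on several assumptions: Deming regression with an error-variance ratio of 1 stands in for the structural regression, observed sample variances stand in for the structural variances, and each question is answered separately by checking whether a percentile bootstrap confidence interval contains its null value. The names `deming_fit`, `percentile_ci`, and `bootstrap_equivalence` are hypothetical and chosen only for this illustration.

```python
import random


def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x; delta is the assumed ratio of the
    error variance in y to the error variance in x (1.0 = orthogonal)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    root = ((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2) ** 0.5
    slope = (syy - delta * sxx + root) / (2 * sxy)
    return my - slope * mx, slope  # (intercept, slope)


def percentile_ci(values, level=0.95):
    """Percentile bootstrap interval taken from the sorted replicates."""
    v = sorted(values)
    lo = int((1 - level) / 2 * (len(v) - 1))
    hi = int((1 + level) / 2 * (len(v) - 1))
    return v[lo], v[hi]


def bootstrap_equivalence(x, y, n_boot=2000, seed=0):
    """Resample subjects (pairs) and check three null values:
    mean difference 0, variance difference 0, and the line y = x."""
    rng = random.Random(seed)
    n = len(x)
    bias, var_diff, intercepts, slopes = [], [], [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [x[i] for i in idx]
        by = [y[i] for i in idx]
        try:
            b0, b1 = deming_fit(bx, by)
        except ZeroDivisionError:
            continue  # degenerate resample with zero covariance
        mx, my = sum(bx) / n, sum(by) / n
        vx = sum((a - mx) ** 2 for a in bx) / (n - 1)
        vy = sum((b - my) ** 2 for b in by) / (n - 1)
        bias.append(my - mx)
        var_diff.append(vy - vx)
        intercepts.append(b0)
        slopes.append(b1)

    def contains(ci, null):
        return ci[0] <= null <= ci[1]

    bias_ci = percentile_ci(bias)
    var_ci = percentile_ci(var_diff)
    b0_ci = percentile_ci(intercepts)
    b1_ci = percentile_ci(slopes)
    return {
        "accuracy": {"ci": bias_ci, "ok": contains(bias_ci, 0.0)},
        "precision": {"ci": var_ci, "ok": contains(var_ci, 0.0)},
        "agreement": {
            "intercept_ci": b0_ci,
            "slope_ci": b1_ci,
            "ok": contains(b0_ci, 0.0) and contains(b1_ci, 1.0),
        },
    }


if __name__ == "__main__":
    # Simulated data (an assumption, not from the paper): two techniques
    # measuring the same subjects with independent noise of equal size.
    rng = random.Random(42)
    truth = [rng.uniform(50, 150) for _ in range(40)]
    method_a = [t + rng.gauss(0, 3) for t in truth]
    method_b = [t + rng.gauss(0, 3) for t in truth]
    for name, result in bootstrap_equivalence(method_a, method_b).items():
        print(name, result)
```

Reporting the three checks separately mirrors the abstract's point that the decomposition helps locate a disagreement: a constant offset between techniques would fail only the accuracy and agreement checks, whereas unequal noise would show up in the precision check.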
