AI image generation, particularly with diffusion models, has advanced rapidly. However, the lack of an established framework for quantifying the reliability of AI-generated images hinders their use in critical decision-making tasks such as medical image diagnosis. In this study, we address the task of detecting anomalous regions in medical images using diffusion models and propose a statistical method to quantify the reliability of the detected anomalies. The core of our method is a selective inference framework, in which statistical tests are conducted conditional on the fact that the images are produced by a diffusion model. With our approach, the statistical significance of an anomaly detection result is quantified as a $p$-value, enabling decision-making with controlled error rates, as is standard in medical practice. We demonstrate the theoretical soundness and practical effectiveness of our statistical test through numerical experiments on both synthetic and brain image datasets.