Out-of-distribution (OOD) detection is essential for improving the reliability of machine learning models: it identifies samples that do not belong to the training distribution. In some tasks, effective OOD detection is challenging because of substantial heterogeneity within the in-distribution (ID) data and high structural similarity between ID and OOD classes. For instance, when detecting heart views in fetal ultrasound videos, the heart is structurally similar to other anatomies such as the abdomen, and the in-distribution variance is large because the heart has five distinct views, with structural variation within each view. To detect OOD samples in this setting, a model should generalise across the intra-anatomy variations while rejecting structurally similar OOD samples. In this paper, we introduce dual-conditioned diffusion models (DCDM), in which we condition the model on in-distribution class information and on latent features of the input image for reconstruction-based OOD detection. This conditioning constrains the generative manifold of the model so that it generates images structurally and semantically similar to those within the in-distribution. The proposed model outperforms reference methods with a 12% improvement in accuracy, 22% higher precision, and an 8% better F1 score.
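To illustrate the reconstruction-based decision rule in general terms, the following is a minimal sketch rather than the DCDM implementation. It assumes a hypothetical `reconstruct` callable standing in for any generative model constrained to the in-distribution, and it assumes mean squared error as the discrepancy measure and a quantile of in-distribution errors as the threshold. None of these choices are specified in the abstract.

```python
import numpy as np


def reconstruction_error(image, reconstruction):
    """Mean squared error between an image and its reconstruction.

    MSE is an assumption made for illustration; other similarity
    measures could be substituted.
    """
    image = np.asarray(image, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    return float(np.mean((image - reconstruction) ** 2))


def fit_threshold(id_errors, quantile=0.95):
    """Pick a threshold as a quantile of errors on held-out ID samples.

    The 0.95 default is an arbitrary illustrative value.
    """
    return float(np.quantile(np.asarray(id_errors, dtype=np.float64), quantile))


def is_ood(image, reconstruct, threshold):
    """Flag an input as OOD when its reconstruction error exceeds the threshold.

    `reconstruct` is a hypothetical callable (image -> reconstructed image).
    A model constrained to the in-distribution is expected to reconstruct
    ID inputs well and OOD inputs poorly.
    """
    return reconstruction_error(image, reconstruct(image)) > threshold
```

In use, the threshold would be fitted once on the reconstruction errors of held-out in-distribution images, and `is_ood` would then be applied to each incoming frame.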