Generative virtual staining (VS) models for high-throughput screening (HTS) can provide, for each input and cell, an estimated posterior distribution over possible biological feature values. However, when evaluating a VS model, the true posterior is unavailable. Existing evaluation protocols check only the accuracy of the marginal distribution over the dataset, not that of the predicted posteriors. We introduce information gain (IG) as a cell-wise evaluation framework that enables direct assessment of predicted posteriors. IG is a strictly proper scoring rule with a sound theoretical motivation, which makes it interpretable and allows results to be compared across models and features. We evaluate diffusion- and GAN-based models on an extensive HTS dataset using IG and other metrics, and show that IG can reveal substantial performance differences that other metrics cannot.
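To make the idea of a cell-wise score concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not state the formula, so the sketch assumes that IG for a cell is the log-score gain, in bits, of the predicted posterior over a dataset-wide marginal baseline, both evaluated at the measured feature value. The function names, the histogram binning, the additive smoothing, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def binned_probs(samples, edges, alpha=1.0):
    """Histogram probabilities over `edges`, with additive smoothing
    (an assumption of this sketch) so that no bin has zero mass.
    Samples outside the range of `edges` are ignored by np.histogram."""
    counts, _ = np.histogram(samples, bins=edges)
    return (counts + alpha) / (counts.sum() + alpha * len(counts))


def information_gain_bits(posterior_samples, true_values, edges):
    """Assumed per-cell IG: log2 q_i(y_i) - log2 p0(y_i), where q_i is the
    binned predicted posterior for cell i and p0 is the binned marginal
    of the true values (estimated in-sample here, for brevity)."""
    true_values = np.asarray(true_values, dtype=float)
    # Bin index of each true value; out-of-range values map to the edge bins.
    idx = np.clip(np.digitize(true_values, edges) - 1, 0, len(edges) - 2)
    baseline = binned_probs(true_values, edges)
    ig = np.empty(len(true_values))
    for i, samples in enumerate(posterior_samples):
        q = binned_probs(samples, edges)
        ig[i] = np.log2(q[idx[i]]) - np.log2(baseline[idx[i]])
    return ig


# Synthetic illustration: 200 "cells" with one scalar feature each.
rng = np.random.default_rng(0)
y_true = rng.normal(0.0, 1.0, size=200)
edges = np.linspace(-5.0, 5.0, 21)

# Hypothetical informative model: posterior samples concentrate near the truth.
good = [rng.normal(y, 0.3, size=100) for y in y_true]
# Hypothetical uninformative model: every cell gets samples from the marginal.
flat = [rng.normal(0.0, 1.0, size=100) for _ in y_true]

ig_good = information_gain_bits(good, y_true, edges)
ig_flat = information_gain_bits(flat, y_true, edges)
print(f"mean IG, informative model:   {ig_good.mean():.2f} bits")
print(f"mean IG, uninformative model: {ig_flat.mean():.2f} bits")
```

Under these assumptions, both hypothetical models reproduce the dataset-wide marginal about equally well, yet only the first concentrates its per-cell posterior near the measured value, and a per-cell log score separates the two. This is the kind of difference a marginal-only check would not register.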