Modern metrics for generative learning, such as the Fr\'echet Inception Distance (FID) and the DINOv2-Fr\'echet Distance (FD-DINOv2), demonstrate impressive performance. However, they suffer from several shortcomings, including a bias towards specific generators and datasets. To address this problem, we propose the Fr\'echet Wavelet Distance (FWD), a domain-agnostic metric based on the Wavelet Packet Transform ($W_p$). FWD provides a high-resolution view across a broad spectrum of image frequencies while preserving both spatial and textural information. Specifically, we use $W_p$ to project generated and real images into the packet coefficient space. We then compute the Fr\'echet distance on the resulting coefficients to evaluate the quality of a generator. The metric is general-purpose and dataset-domain agnostic because it does not rely on any pre-trained network, and it is more interpretable because the Fr\'echet distance can be computed per packet. An extensive evaluation of a wide variety of generators across various datasets shows that FWD generalizes and is more robust to domain shifts and various corruptions than other metrics.
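The pipeline described in the abstract (project both image sets with $W_p$, then compute a Fr\'echet distance per packet) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several choices are assumptions made for illustration only: the Haar wavelet, two decomposition levels, single-channel images, averaging the per-packet distances into one score, and the hypothetical function names `haar_step`, `wavelet_packets`, `frechet_distance`, and `fwd_sketch`.

```python
import numpy as np


def haar_step(x):
    """One level of the orthonormal 2D Haar transform on the last two axes.

    Returns four half-resolution sub-bands: one low-pass and three detail bands.
    """
    a = x[..., 0::2, 0::2]
    b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]
    d = x[..., 1::2, 1::2]
    return [(a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2]


def wavelet_packets(images, levels=2):
    """Full packet tree: every sub-band is split again at each level.

    images: shape (n, height, width), with height and width divisible
    by 2**levels. Returns an array of shape (n, 4**levels, h, w).
    """
    bands = [np.asarray(images, dtype=np.float64)]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_step(band)]
    return np.stack(bands, axis=1)


def frechet_distance(x, y):
    """Frechet distance between Gaussians fitted to the rows of x and y."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    # tr((cov_x cov_y)^(1/2)) is the sum of the square roots of the
    # eigenvalues of cov_x @ cov_y, which are real and non-negative for
    # covariance matrices; clip guards against small negative round-off.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_x - mu_y) ** 2)
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)


def fwd_sketch(real, fake, levels=2):
    """Per-packet Frechet distances and their mean (averaging is an assumption)."""
    real_p = wavelet_packets(real, levels)
    fake_p = wavelet_packets(fake, levels)
    per_packet = np.array([
        frechet_distance(real_p[:, k].reshape(len(real_p), -1),
                         fake_p[:, k].reshape(len(fake_p), -1))
        for k in range(real_p.shape[1])
    ])
    return float(per_packet.mean()), per_packet


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(64, 8, 8))   # stand-in for real images
    fake = real + 1.0                    # stand-in for generated images
    score, per_packet = fwd_sketch(real, fake)
    print(score, per_packet.argmax())
```

Comparing a set with itself yields a score of zero up to floating-point error. The `per_packet` array is what makes the per-packet interpretation mentioned above possible: in the toy example, a constant brightness shift affects only the lowest-frequency packet, so that packet alone accounts for the distance.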