Modern metrics for generative learning, such as the Fréchet Inception Distance (FID), demonstrate impressive performance. However, they suffer from various shortcomings, such as a bias towards specific generators and datasets. To address this problem, we propose the Fréchet Wavelet Distance (FWD), a domain-agnostic metric based on the Wavelet Packet Transform ($W_p$). FWD provides a view across a broad spectrum of frequencies in images with a high resolution, while preserving both spatial and textural aspects. Specifically, we use $W_p$ to project generated and dataset images into the packet coefficient space. We then compute the Fréchet distance on the resulting coefficients to evaluate the quality of a generator. This metric is general-purpose and dataset-domain agnostic, as it does not rely on any pre-trained network, and it is more interpretable because of its frequency band transparency. Through an extensive evaluation of a wide variety of generators across various datasets, we conclude that the proposed FWD generalizes and is more robust to domain shift and various corruptions than other metrics.
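The two steps described above (project images into wavelet packet coefficients, then compute a Fréchet distance on those coefficients) can be illustrated with a minimal NumPy sketch. It is not the reference implementation. The Haar wavelet, the decomposition depth `levels`, the choice to fit one Gaussian per packet and average the per-packet distances, and all function names are assumptions made for illustration only. A practical version would likely use a dedicated wavelet library and a more robust matrix square root.

```python
import numpy as np


def haar_packet_transform(images, levels=2):
    """2D Haar wavelet packet transform (assumed wavelet, for illustration).

    images: array of shape (N, H, W), with H and W divisible by 2**levels.
    Returns an array of shape (N, 4**levels, H / 2**levels, W / 2**levels).
    """
    packets = images[:, None, :, :].astype(np.float64)
    for _ in range(levels):
        # Split every packet into the four pixels of each 2x2 block.
        a = packets[:, :, 0::2, 0::2]
        b = packets[:, :, 0::2, 1::2]
        c = packets[:, :, 1::2, 0::2]
        d = packets[:, :, 1::2, 1::2]
        # Orthonormal 2D Haar filters. They are applied to all packets,
        # not only the low-pass one, which makes this a packet transform.
        ll = (a + b + c + d) / 2.0
        lh = (a - b + c - d) / 2.0
        hl = (a + b - c - d) / 2.0
        hh = (a - b - c + d) / 2.0
        packets = np.concatenate([ll, lh, hl, hh], axis=1)
    return packets


def frechet_distance(x, y):
    """Squared Fréchet distance between Gaussians fitted to the rows of x and y."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    # tr((cov_x cov_y)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_x @ cov_y; clip small negative numerical noise.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)


def frechet_wavelet_distance(real, fake, levels=2):
    """Mean per-packet Fréchet distance (the averaging is an assumption)."""
    pr = haar_packet_transform(real, levels)
    pf = haar_packet_transform(fake, levels)
    per_packet = np.array([
        frechet_distance(pr[:, p].reshape(len(pr), -1),
                         pf[:, p].reshape(len(pf), -1))
        for p in range(pr.shape[1])
    ])
    return per_packet.mean(), per_packet


# Toy usage with synthetic 8x8 "images" standing in for dataset and generator samples.
rng = np.random.default_rng(0)
real = rng.normal(size=(256, 8, 8))
fake = real + rng.normal(scale=0.5, size=real.shape)
score, per_packet = frechet_wavelet_distance(real, fake)
print(score, per_packet.shape)
```

Returning the per-packet distances alongside their mean reflects the interpretability argument in the abstract: each entry corresponds to one frequency band, so a large value points to the band in which the generated images deviate from the dataset.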