With growing interest in, and rapid development of, methods for Ultra-High Resolution (UHR) segmentation, the field urgently needs a large-scale benchmark that covers a wide range of scenes with fully fine-grained dense annotations. To this end, we introduce the URUR dataset, short for Ultra-High Resolution dataset with Ultra-Rich Context. As the name suggests, URUR contains a large number of high-resolution images (3,008 images of size 5,120×5,120), a wide range of complex scenes (from 63 cities), rich context (1 million instances across 8 categories), and fine-grained annotations (about 80 billion manually annotated pixels), far surpassing all existing UHR datasets, including DeepGlobe, Inria Aerial, and UDD. Moreover, we propose WSDNet, a more efficient and effective framework for UHR segmentation, especially with ultra-rich context. Specifically, a multi-level Discrete Wavelet Transform (DWT) is integrated to reduce the computational burden while preserving more spatial details, along with a Wavelet Smooth Loss (WSL) that reconstructs the original structured context and texture under a smoothness constraint. Experiments on several UHR datasets demonstrate its state-of-the-art performance. The dataset is available at https://github.com/jankyee/URUR.
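To illustrate why a multi-level DWT can reduce computation while keeping spatial detail, the following is a minimal sketch of a 2D Haar transform in plain Python. It is not the WSDNet implementation: the choice of the Haar wavelet and the names `haar_dwt2`, `haar_idwt2`, and `multi_level_ll` are assumptions made for illustration, since the abstract does not state which wavelet or code structure is used. Each level halves the height and width of the low-frequency band, while the three detail bands retain the information needed to reconstruct the input.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2D Haar DWT (illustrative sketch).

    `image` is a list of rows with even height and width. Returns the
    low-frequency band and three detail bands, each half the size.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    ll, d1, d2, d3 = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, d1_row, d2_row, d3_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block being transformed:
            #   a b
            #   d e
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 2)  # local average (low frequency)
            d1_row.append((a - b + d - e) / 2)  # left/right differences
            d2_row.append((a + b - d - e) / 2)  # top/bottom differences
            d3_row.append((a - b - d + e) / 2)  # diagonal differences
        ll.append(ll_row)
        d1.append(d1_row)
        d2.append(d2_row)
        d3.append(d3_row)
    return ll, d1, d2, d3


def haar_idwt2(ll, d1, d2, d3):
    """Invert haar_dwt2, recovering the original image from the four bands."""
    out = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, x, y, z = ll[r][c], d1[r][c], d2[r][c], d3[r][c]
            top.extend([(s + x + y + z) / 2, (s - x + y - z) / 2])
            bottom.extend([(s + x - y - z) / 2, (s - x - y + z) / 2])
        out.append(top)
        out.append(bottom)
    return out


def multi_level_ll(image, levels):
    """Apply haar_dwt2 repeatedly to the low-frequency band only."""
    current = image
    for _ in range(levels):
        current = haar_dwt2(current)[0]
    return current


# Small demonstration on an 8x8 ramp image.
demo = [[r * 8 + c for c in range(8)] for r in range(8)]
low = multi_level_ll(demo, 2)
print(len(demo), "x", len(demo[0]), "->", len(low), "x", len(low[0]))
```

Under these assumptions, a network that processes only the low-frequency band after two levels sees 16 times fewer pixels than the input, and the detail bands remain available to restore fine structure.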