Recent deep image matting methods rely on high-quality, pixel-level human annotations for training. Such annotation is costly and hard to scale, which significantly holds back progress in the field. In this work, we make the first attempt to address this problem by proposing a self-supervised pre-training approach that can leverage an effectively unlimited amount of unlabelled data to boost matting performance. The pre-training task is designed to resemble image matting: random trimaps and alpha mattes are generated to achieve an image disentanglement objective. The pre-trained model is then used to initialise the downstream matting network for fine-tuning. Extensive experimental evaluations show that the proposed approach outperforms both state-of-the-art matting methods and alternative self-supervised initialisation approaches by a large margin. We also show that the approach is robust across different backbone architectures. The code and models will be made publicly available.
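To make the pre-training idea concrete, the following is a minimal sketch of how one self-supervised training sample might be synthesised. The abstract does not specify how the random trimap and alpha matte are generated, so everything here is an assumption: the soft-ellipse matte, the helper names `random_alpha`, `alpha_to_trimap` and `composite`, and the choice to blend two unlabelled images with the standard compositing equation I = αF + (1 − α)B. A network could then be asked to recover the two source images (or the matte) from the blended image and the trimap, which is one possible reading of "image disentanglement".

```python
import numpy as np

def random_alpha(h, w, rng):
    """Hypothetical random matte: a soft-edged ellipse with values in [0, 1]."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = rng.uniform(0.3, 0.7) * h, rng.uniform(0.3, 0.7) * w
    ry, rx = rng.uniform(0.15, 0.3) * h, rng.uniform(0.15, 0.3) * w
    # Normalised elliptical distance: below 1 inside the ellipse, above 1 outside.
    d = np.sqrt(((ys - cy) / ry) ** 2 + ((xs - cx) / rx) ** 2)
    # Linear ramp from 1 (d <= 0.8) down to 0 (d >= 1.2) gives a soft boundary.
    return np.clip((1.2 - d) / 0.4, 0.0, 1.0)

def alpha_to_trimap(alpha):
    """Trimap derived from the matte: 255 = foreground, 0 = background, 128 = unknown."""
    trimap = np.full(alpha.shape, 128, dtype=np.uint8)
    trimap[alpha == 0.0] = 0
    trimap[alpha == 1.0] = 255
    return trimap

def composite(fg, bg, alpha):
    """Standard compositing equation I = alpha * F + (1 - alpha) * B, per pixel."""
    a = alpha[..., None]  # add a channel axis so alpha broadcasts over RGB
    return a * fg + (1.0 - a) * bg

# Two random arrays stand in for two unlabelled images (assumption for illustration).
rng = np.random.default_rng(0)
h, w = 64, 64
img_a = rng.random((h, w, 3))
img_b = rng.random((h, w, 3))

alpha = random_alpha(h, w, rng)
trimap = alpha_to_trimap(alpha)
mixed = composite(img_a, img_b, alpha)  # network input, together with the trimap
```

Because the matte and both source images are known by construction, no human annotation is needed to form the training target, which is the property the abstract relies on.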