Deep quantization methods have proven highly efficient for large-scale image retrieval. However, current models rely heavily on ground-truth labels, which hinders the application of quantization in label-scarce scenarios. A more realistic setting is to learn from the vast supply of uploaded images that come with informal tags provided by amateur users. Although such rough tags do not directly reveal the labels, they contain useful semantic information for supervising deep quantization. To this end, we propose Weakly-Supervised Deep Hyperspherical Quantization (WSDHQ), which is the first work to learn deep quantization from weakly tagged images. Specifically, 1) we use word embeddings to represent the tags and enhance their semantic information based on a tag correlation graph. 2) To better preserve semantic information in the quantization codes and reduce quantization error, we jointly learn semantics-preserving embeddings and a supervised quantizer on a hypersphere by employing a well-designed fusion layer and tailor-made loss functions. Extensive experiments show that WSDHQ achieves state-of-the-art performance on weakly-supervised compact coding. Code is available at https://github.com/gimpong/AAAI21-WSDHQ.
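To make the two steps of the abstract concrete, the following is a minimal sketch and not the WSDHQ implementation (the actual code is in the repository linked above). It assumes that tag enhancement can be approximated by a single correlation-weighted averaging step over the tag graph, and that quantization on the hypersphere can be approximated by assigning an L2-normalized embedding to the codeword with the highest cosine similarity in a single codebook. The function names `l2_normalize`, `enhance_tags`, and `quantize`, the mixing weight `alpha`, and the toy vectors are all hypothetical and chosen only for illustration; the fusion layer and the loss functions are not modeled.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit hypersphere (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0.0 else [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))


def enhance_tags(tag_vectors, correlation, alpha=0.5):
    """One propagation step over a tag correlation graph (illustrative assumption).

    Each tag keeps (1 - alpha) of its own embedding and takes alpha of the
    correlation-weighted mean of the other tags, then is re-normalized.
    """
    enhanced = []
    for i, own in enumerate(tag_vectors):
        # Ignore self-correlation so that only neighboring tags contribute.
        weights = [w if j != i else 0.0 for j, w in enumerate(correlation[i])]
        total = sum(weights)
        neighbor = [0.0] * len(own)
        if total > 0.0:
            for j, w in enumerate(weights):
                for d in range(len(own)):
                    neighbor[d] += (w / total) * tag_vectors[j][d]
        mixed = [(1.0 - alpha) * own[d] + alpha * neighbor[d] for d in range(len(own))]
        enhanced.append(l2_normalize(mixed))
    return enhanced


def quantize(embedding, codebook):
    """Return the index of the codeword with the highest cosine similarity."""
    scores = [cosine(embedding, codeword) for codeword in codebook]
    return max(range(len(codebook)), key=scores.__getitem__)


# Toy data: three tags with hypothetical 2-D word embeddings.
tags = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical symmetric correlation matrix: tags 0 and 1 co-occur, tag 2 is isolated.
corr = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
enhanced = enhance_tags(tags, corr, alpha=0.5)

# Hypothetical codebook of three unit-norm codewords.
codebook = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
code = quantize([0.9, 0.1], codebook)
```

Because both the embeddings and the codewords are compared by direction only, the magnitude of an embedding does not affect its code in this sketch, which is the basic motivation for quantizing on a hypersphere.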