Domain-Aware Continual Zero-Shot Learning

Modern visual systems have many potential applications in natural science research, such as aiding species discovery and monitoring animals in the wild. However, real-world vision tasks may be subject to changing environmental conditions, which shift the appearance of captured images. To address this issue, we introduce Domain-Aware Continual Zero-Shot Learning (DACZSL), the task of recognizing images of unseen categories in continuously changing domains. Accordingly, we propose a Domain-Invariant Network (DIN) that learns factorized features for shifting domains and improved textual representations for unseen classes. DIN continually learns a globally shared network for domain-invariant and task-invariant features, and per-task private networks for task-specific features. Furthermore, we enhance this dual network with class-wise learnable prompts to improve class-level text representations, thereby improving zero-shot prediction of future unseen classes. To evaluate DACZSL, we introduce two benchmarks, DomainNet-CZSL and iWildCam-CZSL. Our results show that DIN outperforms existing baselines by over 5% in harmonic accuracy and over 1% in backward transfer, achieving a new state of the art (SoTA).
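The abstract reports results in harmonic accuracy and backward transfer without defining them. The sketch below shows how these two metrics are commonly computed in the zero-shot and continual learning literature. It assumes that harmonic accuracy is the harmonic mean of seen-class and unseen-class accuracy, and that backward transfer is the average change in accuracy on earlier tasks after training on the final task. The function names and the `acc[t][i]` matrix layout are hypothetical, and the paper's exact definitions may differ.

```python
from typing import Sequence


def harmonic_accuracy(seen_acc: float, unseen_acc: float) -> float:
    """Harmonic mean of seen and unseen accuracy (assumed definition)."""
    total = seen_acc + unseen_acc
    if total == 0:
        return 0.0  # avoid division by zero when both accuracies are 0
    return 2 * seen_acc * unseen_acc / total


def backward_transfer(acc: Sequence[Sequence[float]]) -> float:
    """Average change on earlier tasks after learning the last task.

    acc[t][i] is assumed to hold the accuracy on task i measured after
    training on task t (a hypothetical layout). Negative values indicate
    forgetting; positive values indicate that later training helped.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0  # backward transfer is undefined for a single task
    last = num_tasks - 1
    deltas = [acc[last][i] - acc[i][i] for i in range(last)]
    return sum(deltas) / len(deltas)


if __name__ == "__main__":
    # Illustrative numbers only; these are not results from the paper.
    print(harmonic_accuracy(0.6, 0.3))
    print(backward_transfer([[0.8, 0.0], [0.7, 0.9]]))
```

The harmonic mean rewards models that balance seen and unseen performance: a model that scores highly on seen classes but near zero on unseen classes receives a harmonic accuracy near zero.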

