Self-supervised learning (SSL) has emerged as a powerful paradigm for learning representations without labeled data. Most SSL approaches rely on strong, well-established, handcrafted data augmentations to generate diverse views for representation learning. However, designing such augmentations requires domain-specific knowledge and implicitly imposes representational invariances on the model, which can limit generalization. In this work, we propose an unsupervised representation learning method that replaces augmentations with views generated by projecting samples onto orthonormal bases and overcomplete frames. We show that embeddings learned from orthonormal and overcomplete spaces reside on distinct manifolds, shaped by the geometric biases introduced by representing samples in different spaces. By jointly leveraging the complementary geometry of these distinct manifolds, our approach achieves superior performance without artificially increasing data diversity through strong augmentations. We demonstrate the effectiveness of our method on nine datasets spanning five temporal sequence tasks, where signal-specific characteristics make data augmentations particularly challenging. Without relying on augmentation-induced diversity, our method achieves performance gains of up to 15--20\% over existing self-supervised approaches. Source code: https://github.com/eth-siplab/Learning-with-FrameProjections
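The specific bases and frames used by the method are defined in the linked repository; as a minimal illustration of the underlying idea, the sketch below generates two views of the same signal, one by analysis with an orthonormal basis (which preserves the signal's norm) and one with an overcomplete frame built as the union of two orthonormal bases (a tight frame with frame bound 2, yielding redundant coefficients). The random-QR bases here are placeholders, not the transforms used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x = rng.standard_normal(n)  # a toy 1-D "temporal sequence" sample

# View 1: orthonormal basis (columns of Q). Analysis is an isometry,
# so the coefficient vector has the same Euclidean norm as x.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
view_ortho = Q.T @ x                      # n coefficients
assert np.isclose(np.linalg.norm(view_ortho), np.linalg.norm(x))

# View 2: overcomplete frame formed by stacking two orthonormal bases.
# This is a tight frame with frame bound 2: ||F x||^2 = 2 ||x||^2,
# so the 2n coefficients are redundant rather than norm-preserving.
P, _ = np.linalg.qr(rng.standard_normal((n, n)))
F = np.vstack([Q.T, P.T])                 # (2n, n) analysis operator
view_frame = F @ x                        # 2n redundant coefficients
assert np.isclose(np.linalg.norm(view_frame) ** 2,
                  2 * np.linalg.norm(x) ** 2)

print(view_ortho.shape, view_frame.shape)  # (8,) (16,)
```

The two coefficient vectors play the role of the "views" that augmentations would otherwise supply: same sample, different geometric biases from the representing space.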