Beyond Isotropy in JEPAs: Hamiltonian Geometry and Symplectic Prediction

JEPAs often regularize one-view embeddings toward an isotropic Gaussian, implicitly baking Euclidean symmetry into the representation. We show that this is not merely a benign default. For a known structured downstream geometry $H\succ0$, the minimax and maximum-entropy covariance under a Hamiltonian energy budget is $(c/d)H^{-1}$, and Euclidean isotropy incurs a price of isotropy that admits a closed form. More importantly, when the downstream geometry is unknown, no geometry-independent fixed marginal target is canonical: every fixed covariance shape can be maximally misaligned for some structured geometry. We further show that even oracle one-view marginals do not identify the JEPA view-to-view predictive coupling. These results suggest that the structural bias in JEPAs should enter the cross-view coupling rather than a fixed encoder marginal. We instantiate this principle with \textbf{HamJEPA}, which encodes each view as a phase-space state $(q,p)$ and predicts view-to-view transitions with a learned Hamiltonian leapfrog map, while non-isotropic scale and spectral floors prevent collapse. In a deliberately headless token protocol, HamJEPA improves over SIGReg on CIFAR-100 by $+4.89$ kNN@20 and $+3.52$ linear-probe points at 30 epochs, and by $+6.45$ kNN@20 and $+10.64$ linear-probe points at 80 epochs. A matched MLP-predictor ablation shows that the symplectic coupling is the ingredient driving the neighborhood-geometry gain. On ImageNet-100, HamJEPA-$q$ improves over the same baseline by $+4.82$ kNN@20 and $+7.52$ linear-probe points at 45 epochs.
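To make the phrase "Hamiltonian leapfrog map" concrete, the following is a minimal NumPy sketch of a kick-drift-kick leapfrog update on a phase-space state $(q,p)$, together with a numerical check that the resulting map is symplectic. It is an illustration under stated assumptions, not the HamJEPA predictor itself: the separable Hamiltonian $V(q)+\tfrac12\|p\|^2$, the quadratic potential with a hypothetical stiffness matrix `K`, and the names `leapfrog`, `flow`, `step`, and `n_steps` are all chosen here for illustration, whereas the abstract describes a learned Hamiltonian.

```python
import numpy as np

def leapfrog(q, p, grad_V, step=0.1, n_steps=4):
    """Kick-drift-kick leapfrog for the assumed H(q, p) = V(q) + 0.5 * ||p||^2."""
    q, p = q.copy(), p.copy()
    for _ in range(n_steps):
        p = p - 0.5 * step * grad_V(q)  # half kick: momentum update
        q = q + step * p                # full drift: position update
        p = p - 0.5 * step * grad_V(q)  # half kick: momentum update
    return q, p

d = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((d, d))
K = A @ A.T + np.eye(d)   # hypothetical SPD matrix, V(q) = 0.5 * q^T K q
grad_V = lambda q: K @ q  # stand-in for the gradient of a learned potential

def flow(z):
    """Apply the leapfrog map to a stacked state z = (q, p)."""
    q, p = leapfrog(z[:d], z[d:], grad_V)
    return np.concatenate([q, p])

# With a quadratic potential the map is linear, so its Jacobian columns
# are the images of the standard basis vectors.
J = np.stack([flow(e) for e in np.eye(2 * d)], axis=1)

# Canonical symplectic form; a map is symplectic iff J^T Omega J = Omega.
Omega = np.block([[np.zeros((d, d)), np.eye(d)],
                  [-np.eye(d), np.zeros((d, d))]])
symplectic_error = np.abs(J.T @ Omega @ J - Omega).max()
det_J = np.linalg.det(J)
print(symplectic_error, det_J)
```

The check holds for any step size, because each kick and each drift is a shear of phase space and shears preserve the symplectic form. This volume-preserving, invertible structure is what distinguishes such a coupling from an unconstrained MLP predictor of the same size.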
