Spatial covariance matrices of EEG signals are Symmetric Positive Definite (SPD) and lie on a Riemannian manifold, yet the theoretical connection between embedding geometry and optimization dynamics remains unexplored. We provide a formal analysis linking embedding choice to gradient conditioning and numerical stability on SPD manifolds, establishing three theoretical results: (1) BWSPD's $\sqrt{\kappa}$ gradient conditioning (vs $\kappa$ for Log-Euclidean), derived via Daleckiĭ–Kreĭn matrices, yields better-conditioned gradients on high-dimensional inputs ($d \geq 22$), an advantage that shrinks on low-dimensional inputs ($d \leq 8$), where eigendecomposition overhead dominates; (2) Embedding-Space Batch Normalization (BN-Embed) approximates Riemannian normalization up to $O(\varepsilon^2)$ error, yielding $+26\%$ accuracy on 56-channel ERP data but a negligible effect on 8-channel SSVEP data, matching the channel-count-dependent prediction; (3) bi-Lipschitz bounds prove that BWSPD tokens preserve manifold distances with distortion governed solely by the condition number $\kappa$. We validate these predictions with a unified Transformer framework that compares BWSPD, Log-Euclidean, and Euclidean embeddings within an identical architecture across 1,500+ runs on three EEG paradigms (motor imagery, ERP, SSVEP; 36 subjects). Our Log-Euclidean Transformer achieves state-of-the-art performance on all datasets, substantially outperforming classical Riemannian classifiers and recent SPD baselines, while BWSPD offers competitive accuracy with similar training time.
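The $\sqrt{\kappa}$-vs-$\kappa$ claim in result (1) can be checked numerically on the spectral level. The sketch below is illustrative only: it assumes, as is standard, that the Log-Euclidean embedding applies $\log$ to the eigenvalues of the covariance, and that the BWSPD embedding applies the square root (the Bures–Wasserstein square-root factorization). The diagonal of the Daleckiĭ–Kreĭn matrix is then $f'(\lambda_i)$, so the worst-case gradient amplification ratio over the spectrum is $\kappa$ for $f = \log$ and $\sqrt{\kappa}$ for $f = \sqrt{\cdot}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy SPD "spatial covariance" with a prescribed eigenvalue spread.
# d = 22 matches the paper's high-dimensional regime; kappa = 1e3 is arbitrary.
d = 22
target_eigs = np.geomspace(1e-3, 1.0, d)          # condition number 1e3
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal basis
C = Q @ np.diag(target_eigs) @ Q.T

lam = np.linalg.eigvalsh(C)                       # ascending eigenvalues
kappa = lam[-1] / lam[0]

# Diagonal Daleckii-Krein entries are f'(lambda_i); the ratio of the largest
# to smallest entry bounds the gradient conditioning of the embedding map:
#   Log-Euclidean: f(x) = log(x),  f'(x) = 1/x            -> ratio kappa
#   BWSPD (sqrt):  f(x) = sqrt(x), f'(x) = 1/(2*sqrt(x))  -> ratio sqrt(kappa)
log_ratio = (1.0 / lam[0]) / (1.0 / lam[-1])
sqrt_ratio = (1.0 / (2.0 * np.sqrt(lam[0]))) / (1.0 / (2.0 * np.sqrt(lam[-1])))

print(f"kappa           = {kappa:.1f}")
print(f"log grad ratio  = {log_ratio:.1f}")   # grows like kappa
print(f"sqrt grad ratio = {sqrt_ratio:.1f}")  # grows like sqrt(kappa)
```

For $\kappa = 10^3$ the log embedding's amplification ratio is about $10^3$ while the square-root embedding's is about $31.6$, matching the predicted scaling; the full bi-Lipschitz analysis in the paper extends this scalar argument to the off-diagonal Daleckiĭ–Kreĭn entries.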