Semi-algebraic priors are ubiquitous in signal processing and machine learning. Prevalent examples include a) linear models, where the signal lies in a low-dimensional subspace; b) sparse models, where the signal can be represented by only a few coefficients in a suitable basis; and c) a large family of neural network generative models. In this paper, we prove a transversality theorem for semi-algebraic sets in orthogonal or unitary representations of groups: under a suitable dimension bound, a generic translate of any semi-algebraic set is transverse to the orbits of the group action. This, in turn, implies that if a signal lies in a low-dimensional semi-algebraic set, then it can be recovered uniquely from measurements that separate orbits. As an application, we consider the implications of the transversality theorem for the problem of recovering, from their second moment, signals that are translated by random group actions. As a special case, we discuss cryo-EM, a leading technology for determining the spatial structure of biological molecules, which serves as our prime motivation. In particular, we derive explicit bounds for recovering a molecular structure from the second moment under a semi-algebraic prior and deduce information-theoretic implications. We also obtain information-theoretic bounds for three additional applications: factoring Gram matrices, multi-reference alignment, and phase retrieval. Finally, we deduce bounds for designing permutation-invariant separators in machine learning.