Semi-algebraic priors are ubiquitous in signal processing and machine learning. Prevalent examples include a) linear models, where the signal lies in a low-dimensional subspace; b) sparse models, where the signal can be represented by only a few coefficients in a suitable basis; and c) a large family of neural network generative models. In this paper, we prove a transversality theorem for semi-algebraic sets in orthogonal or unitary representations of groups: under a suitable dimension bound, a generic translate of any semi-algebraic set is transverse to the orbits of the group action. This, in turn, implies that if a signal lies in a low-dimensional semi-algebraic set, then it can be recovered uniquely from measurements that separate orbits. As an application, we consider the implications of the transversality theorem for the problem of recovering signals that are translated by random group elements from their second moment. As a special case, we discuss cryo-EM, a leading technology for determining the spatial structure of biological molecules, which serves as our prime motivation. In particular, we derive explicit bounds for recovering a molecular structure from the second moment under a semi-algebraic prior and deduce information-theoretic implications. We also obtain information-theoretic bounds for three additional applications: factoring Gram matrices, multi-reference alignment, and phase retrieval. Finally, we deduce bounds for designing permutation-invariant separators in machine learning.
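To make the second-moment problem concrete, the following is a minimal sketch for one simple setting, assumed here for illustration only: the cyclic group acting on real vectors of length 8 by circular shifts, as in multi-reference alignment. The helper `second_moment`, the test signal, and the choice of phase change are all hypothetical and not taken from the paper. The sketch shows that, without a prior, two signals in different orbits can share the same second moment, which is the ambiguity a semi-algebraic prior is meant to resolve.

```python
import numpy as np

def second_moment(x):
    """Average of (g.x)(g.x)^T over all circular shifts g of x."""
    n = len(x)
    shifts = np.stack([np.roll(x, s) for s in range(n)])  # row s is x shifted by s
    return shifts.T @ shifts / n

n = 8
x = np.arange(1.0, n + 1)  # illustrative signal, not from the paper

# Change the phase of a single Fourier coefficient. The Fourier magnitudes,
# and hence the second moment, are unchanged, but the new phases do not
# correspond to any circular shift.
X = np.fft.rfft(x)
phases = np.ones(X.shape, dtype=complex)
phases[1] = 1j
y = np.fft.irfft(X * phases, n=n)

same_moment = np.allclose(second_moment(x), second_moment(y))
same_orbit = any(np.allclose(np.roll(x, s), y) for s in range(n))
print("same second moment:", same_moment)
print("same orbit:", same_orbit)
```

In this setting the second moment carries exactly the Fourier magnitudes of the signal, so the example is also an instance of the phase retrieval ambiguity mentioned in the abstract.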