Finite mixture models are widely used for unsupervised learning, but maximum likelihood estimation via EM suffers from degeneracy as components collapse. We introduce transcendental regularization, a penalized likelihood framework with analytic barrier functions that prevent degeneracy while maintaining asymptotic efficiency. The resulting Transcendental Algorithm for Mixtures of Distributions (TAMD) offers strong theoretical guarantees: identifiability, consistency, and robustness. Empirically, TAMD successfully stabilizes estimation and prevents collapse, yet achieves only modest improvements in classification accuracy, highlighting fundamental limits of mixture models for unsupervised learning in high dimensions. Our work provides both a novel theoretical framework and an honest assessment of practical limitations, implemented in an open-source R package.
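To make the degeneracy problem and the barrier idea concrete, the sketch below runs EM on a one-dimensional Gaussian mixture with a generic log-barrier penalty on the component variances. This is an illustrative stand-in under our own assumptions, not the paper's transcendental penalty or its TAMD algorithm: the penalty `lam * (s2/var + log(var))` diverges as a variance shrinks to zero, so the penalized M-step keeps every variance strictly positive and a component cannot collapse onto a single observation.

```python
import numpy as np

def penalized_em_1d(x, k=2, lam=0.1, iters=200):
    """EM for a 1-D Gaussian mixture with a barrier penalty on variances.

    Per-component penalty lam * (s2/var_k + log(var_k)) diverges as
    var_k -> 0, blocking the classic likelihood degeneracy where one
    component collapses onto a single data point. (Illustrative only;
    not the paper's exact transcendental barrier.)
    """
    n = len(x)
    s2 = x.var()  # overall variance anchors the barrier
    mu = np.quantile(x, np.linspace(0.25, 0.75, k))  # spread-out init
    var = np.full(k, s2)
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibilities (log-space for stability)
        d = (x[:, None] - mu) ** 2
        logp = -0.5 * (d / var + np.log(2 * np.pi * var)) + np.log(w)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        nk = r.sum(axis=0)
        # Penalized M-step: setting the derivative of the penalized
        # objective w.r.t. var to zero gives the closed form
        #   var = (weighted SS + 2*lam*s2) / (nk + 2*lam),
        # which is bounded away from zero even when nk -> 0.
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        ss = (r * (x[:, None] - mu) ** 2).sum(axis=0)
        var = (ss + 2 * lam * s2) / (nk + 2 * lam)
    return w, mu, var
```

With `lam = 0` the update reduces to ordinary EM, whose variances can shrink to zero; any `lam > 0` floors them near a fraction of the overall variance, mirroring the stabilizing (but accuracy-neutral) effect the abstract reports.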