The Normalized Maximum Likelihood for Regular Non-Smooth Models: Measure-Theoretic Foundations and Geometric Sampling

The Normalized Maximum Likelihood (NML) codelength, or stochastic complexity, is a principled criterion for universal coding. While recent coarea-based formulations provide a way to compute it for smooth models, this framework breaks down for the non-smooth estimators ubiquitous in modern machine learning (e.g., Lasso, sparse SVMs). In this work, we provide a rigorous framework for computing the NML for regular path-differentiable Lipschitz (PDL) estimators. By applying classical geometric measure theory and bridging the coarea formula with conservative Jacobians, we prove that the stochastic complexity of non-smooth models is well-posed and theoretically consistent with the outputs of modern automatic differentiation. To compute this quantity exactly, we introduce the Propose-and-Project Metropolis-Hastings (PDL-PPMH) sampler, a geometric MCMC algorithm capable of traversing the non-differentiable level sets of the maximum likelihood estimator. We theoretically justify its components, including a stochastic tangent-space proposal and a provably convergent non-smooth projection solver. We demonstrate the method's robustness by sampling from a high-dimensional Lasso posterior ($P=2000$) and quantify the computational scaling that governs the trade-off between exactness and mixing time. Crucially, we show empirically that our exact NML criterion is a highly data-efficient alternative to cross-validation, achieving statistically indistinguishable predictive optima without requiring data splitting. Altogether, our work paves the way for the theoretical analysis of the NML codelength for regular non-smooth models.
