Parameter-Expanded ECME Algorithms for Logistic and Penalized Logistic Regression

Parameter estimation in logistic regression is a well-studied problem, with the Newton-Raphson method being one of the most prominent optimization techniques used in practice. A number of monotone optimization methods, including minorization-maximization (MM) algorithms, expectation-maximization (EM) algorithms, and related variational Bayes approaches, offer a family of useful alternatives guaranteed to increase the logistic regression likelihood at every iteration. In this article, we propose a modified version of a logistic regression EM algorithm which can substantially improve computational efficiency while preserving the monotonicity of EM and the simplicity of the EM parameter updates. By introducing an additional latent parameter and selecting this parameter to maximize the penalized observed-data log-likelihood at every iteration, our iterative algorithm can be interpreted as a parameter-expanded expectation-conditional maximization either (ECME) algorithm, and we demonstrate how to use the parameter-expanded ECME with an arbitrary choice of weights and penalty function. In addition, we describe a generalized version of our parameter-expanded ECME algorithm that can be tailored to the challenges encountered in specific high-dimensional problems, and we study several interesting connections between this generalized algorithm and other well-known methods. Performance comparisons between our method, the EM algorithm, and several other optimization methods are presented using a series of simulation studies based upon both real and synthetic datasets.
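
To make the idea of an extra parameter chosen to maximize the observed-data log-likelihood concrete, the following is a minimal sketch and not the authors' algorithm. It assumes an unpenalized model with unit weights, uses the standard Pólya-Gamma (equivalently Jaakkola-Jordan) EM update for logistic regression, and assumes that the expansion parameter acts as a scalar rescaling `alpha` of the EM update. All function names (`log_lik`, `pg_weights`, `em_step`, `best_scale`, `fit`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def log_lik(beta, X, y):
    # Observed-data log-likelihood of logistic regression, y in {0, 1}.
    eta = X @ beta
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def pg_weights(eta):
    # E-step weights tanh(eta/2) / (2*eta), with limiting value 1/4 at eta = 0.
    w = np.full(eta.shape, 0.25)
    nz = np.abs(eta) > 1e-8
    w[nz] = np.tanh(eta[nz] / 2.0) / (2.0 * eta[nz])
    return w

def em_step(beta, X, y):
    # M-step: solve the weighted least-squares system (X' W X) u = X' (y - 1/2).
    w = pg_weights(X @ beta)
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - 0.5))

def best_scale(u, X, y, n_newton=25):
    # ASSUMPTION of this sketch: the extra parameter is a scalar alpha that
    # rescales the EM update u. g(alpha) = log_lik(alpha * u) is concave in
    # alpha, so a few one-dimensional Newton steps are used to maximize it.
    z = X @ u
    alpha = 1.0
    for _ in range(n_newton):
        p = 0.5 * (1.0 + np.tanh(0.5 * alpha * z))  # stable sigmoid
        grad = np.sum((y - p) * z)
        hess = -np.sum(p * (1.0 - p) * z * z)
        if hess > -1e-12:
            break
        alpha -= grad / hess
    # Fall back to the plain EM update (alpha = 1) unless alpha is at least
    # as good, so the likelihood never ends up below that of the EM update.
    if log_lik(alpha * u, X, y) >= log_lik(u, X, y):
        return alpha
    return 1.0

def fit(X, y, n_iter=50, expand=True):
    # Returns the final coefficients and the log-likelihood after each step.
    beta = np.zeros(X.shape[1])
    trace = [log_lik(beta, X, y)]
    for _ in range(n_iter):
        u = em_step(beta, X, y)
        beta = best_scale(u, X, y) * u if expand else u
        trace.append(log_lik(beta, X, y))
    return beta, trace
```

Calling `fit(X, y, expand=False)` runs the plain EM iteration, while `expand=True` adds the rescaling step. Because `alpha = 1` is always an admissible choice, the rescaled iterate never has a lower log-likelihood than the EM update it is built from, which is the sense in which monotonicity is retained in this sketch.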

