Exponential families form the backbone of modern statistics and machine learning, but textbooks seldom derive them from first principles in an accessible way. Although minimal sufficiency and the principle of maximum entropy, the latter originating in physics, provide the core motivation, they are often presented as technical topics requiring advanced prerequisites. Here, a short, self-contained derivation of exponential families from maximum entropy is presented that is straightforward to carry out, requires only a modest background in information entropy, and avoids technicalities such as constrained optimisation. Two propositions are demonstrated in this fashion: i) exponential families with a general base measure maximise information entropy relative to that base subject to fixed expectations of the canonical statistics, and ii) exponential families with a uniform base maximise standard information entropy under the same constraints. Maximum entropy therefore provides a principled foundation for exponential families with minimal prerequisites, highlighting the value of teaching entropy in statistics courses.
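In notation not fixed by the abstract itself (the base $h$, canonical statistics $T$, natural parameter $\theta$, and log-normaliser $A$ are introduced here only for illustration), the two propositions may be sketched as follows. An exponential family with base $h$ has densities
\[
p_\theta(x) \;=\; h(x)\,\exp\!\bigl(\theta^\top T(x) - A(\theta)\bigr),
\qquad
A(\theta) \;=\; \log \int h(x)\, e^{\theta^\top T(x)} \,\mathrm{d}x ,
\]
and proposition i) asserts that, among all densities $p$ satisfying the moment constraints $\mathbb{E}_p[T(X)] = \mathbb{E}_{p_\theta}[T(X)]$, the member $p_\theta$ maximises the entropy relative to the base,
\[
H_h(p) \;=\; -\int p(x)\,\log \frac{p(x)}{h(x)} \,\mathrm{d}x ,
\]
i.e. the negative Kullback–Leibler divergence from $h$. When $h$ is uniform, $H_h(p)$ equals the standard information entropy $-\int p(x)\log p(x)\,\mathrm{d}x$ up to an additive constant, which yields proposition ii).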