Flexible modeling of how an entire distribution changes with covariates is an important yet challenging generalization of mean-based regression that has seen growing interest over the past decades in both the statistics and machine learning literature. This review outlines selected state-of-the-art statistical approaches to distributional regression, complemented with alternatives from machine learning. Topics covered include the similarities and differences between these approaches, extensions, properties and limitations, estimation procedures, and the availability of software. In view of the increasing complexity and availability of large-scale data, this review also discusses the scalability of traditional estimation methods, current trends, and open challenges. Illustrations are provided using data on childhood malnutrition in Nigeria and Australian electricity prices.