Many statistical problems involve optimization over a discrete parameter space of unknown dimension. In such settings, gradient-based methods often fail because the objective function is non-differentiable, or because the search space is massive or non-convex and the objective has many local optima. This paper presents GAReg, a unified genetic algorithm package for discrete optimization problems in regression that works well when standard algorithms are not justified. GAReg provides a compact chromosome representation that supports optimal knot placement for regression splines, best-subset variable selection in regression, and related problems. The package provides uniform initialization, constraint-preserving crossover and mutation, steady-state replacement, and optional island-model parallelization. GAReg efficiently searches high-dimensional model spaces, providing near-optimal solutions in settings where exhaustive enumeration, integer programming, or dynamic programming is infeasible.
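To make the ingredients named above concrete, the following is a minimal sketch of a steady-state genetic algorithm over variable-length knot chromosomes. It is not GAReg's API or implementation: every function name, the minimum-spacing constraint, the repair step, the tournament selection, and the BIC-like penalty are assumptions chosen for illustration. To stay within the Python standard library, a piecewise-constant (degree-0) fit is swapped in for a regression spline.

```python
import math
import random

def fitness(knots, y):
    """BIC-like cost of a piecewise-constant fit with breaks at `knots` (lower is better)."""
    n = len(y)
    bounds = [0] + list(knots) + [n]
    rss = 0.0
    for a, b in zip(bounds[:-1], bounds[1:]):
        seg = y[a:b]
        mean = sum(seg) / len(seg)
        rss += sum((v - mean) ** 2 for v in seg)
    n_params = 2 * len(knots) + 1  # assumed penalty: segment means plus knot locations
    return n * math.log(max(rss, 1e-12) / n) + n_params * math.log(n)

def is_valid(knots, n, min_gap):
    """True if every segment between consecutive boundaries has at least min_gap points."""
    bounds = [0] + list(knots) + [n]
    return all(b - a >= min_gap for a, b in zip(bounds[:-1], bounds[1:]))

def repair(candidates, n, min_gap):
    """Greedily keep sorted, distinct knots that respect the minimum spacing."""
    kept = []
    for k in sorted(set(candidates)):
        last = kept[-1] if kept else 0
        if k - last >= min_gap and n - k >= min_gap:
            kept.append(k)
    return tuple(kept)

def random_chromosome(n, min_gap, max_knots, rng):
    """Draw a knot count, then knot locations uniformly, then enforce the constraint."""
    m = rng.randint(0, max_knots)
    return repair(rng.sample(range(1, n), m), n, min_gap)

def crossover(p1, p2, n, min_gap, rng):
    """The child inherits each parental knot with probability 1/2, then is repaired."""
    pool = [k for k in sorted(set(p1) | set(p2)) if rng.random() < 0.5]
    return repair(pool, n, min_gap)

def mutate(knots, n, min_gap, rng):
    """Add, drop, or shift one knot; repair keeps the result feasible."""
    knots = list(knots)
    move = rng.choice(("add", "drop", "shift"))
    if move == "add" or not knots:
        knots.append(rng.randrange(1, n))
    elif move == "drop":
        knots.remove(rng.choice(knots))
    else:
        i = rng.randrange(len(knots))
        knots[i] += rng.choice((-1, 1))
    return repair(knots, n, min_gap)

def steady_state_ga(y, pop_size=40, iters=3000, min_gap=3, max_knots=10, seed=0):
    rng = random.Random(seed)
    n = len(y)
    pop = [random_chromosome(n, min_gap, max_knots, rng) for _ in range(pop_size)]
    cost = [fitness(c, y) for c in pop]

    def pick():
        # Binary tournament: the better of two random members becomes a parent.
        i, j = rng.sample(range(pop_size), 2)
        return pop[i] if cost[i] < cost[j] else pop[j]

    for _ in range(iters):
        child = crossover(pick(), pick(), n, min_gap, rng)
        if rng.random() < 0.3:
            child = mutate(child, n, min_gap, rng)
        c = fitness(child, y)
        # Steady-state replacement: one child per step may replace the worst member.
        worst = max(range(pop_size), key=cost.__getitem__)
        if c < cost[worst] and child not in pop:
            pop[worst], cost[worst] = child, c
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

# Synthetic series whose mean shifts at indices 30 and 70.
noise = random.Random(1)
y = [noise.gauss(0.0, 1.0) + (5.0 if 30 <= i < 70 else 0.0) for i in range(100)]
best_knots, best_cost = steady_state_ga(y)
print(best_knots, round(best_cost, 2))
```

Because the chromosome is just the sorted tuple of knot positions, its length encodes the model dimension, and the repair step is what keeps crossover and mutation constraint-preserving in this sketch. An island-model variant would run several such populations independently and exchange their best members periodically; that is omitted here.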