Models with fewer parameters are often easier to interpret and more robust. Parsimony can be achieved by optimizing information criteria such as the AIC or BIC, which depend on the number of free parameters in the model. Because that count is integer-valued, these objectives are discrete, and optimizing them is challenging and often relies on discrete optimization. We construct smooth functions whose optima coincide with those of these objectives, permitting continuous rather than discrete optimization and relieving some of the model selection burden. We provide proofs of convergence, and a novel clustering method based on explicit overparameterization shows promising results.
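For reference, the two criteria named above have the following standard definitions. The symbols are notation assumed here for illustration and are not taken from the paper: $k$ is the number of free parameters, $n$ the number of observations, and $\hat{L}$ the maximized likelihood.

```latex
\begin{align}
\mathrm{AIC} &= 2k - 2\ln\hat{L}, \\
\mathrm{BIC} &= k\ln n - 2\ln\hat{L}.
\end{align}
```

Since $k$ takes only integer values, both criteria change in jumps as parameters enter or leave the model, so gradient-based methods cannot be applied to them directly.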