The quality of generalized linear models (GLMs), which are widely used by insurance companies, depends on the choice of interacting variables. The search for interactions is time-consuming, especially for data sets with many variables; it relies heavily on the expert judgement of actuaries and often on visual performance indicators. We therefore present an approach that automates the search for interactions that should be added to a GLM to improve its predictive power. Our approach relies on neural networks and a model-specific interaction detection method, which is computationally faster than traditionally used methods such as Friedman's H-statistic or SHAP values. In numerical studies, we report the results of our approach on artificially generated data as well as open-source data.
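To make the computational cost of the traditional baseline concrete, the following is a minimal sketch of Friedman's pairwise H-statistic built from centered partial dependence functions. It is not the detection method proposed here; the function names, the toy model, and the sample size are illustrative assumptions. Each partial dependence curve requires one model evaluation on the full data set per data point, so a single pair already costs on the order of n² predictions.

```python
import numpy as np

def partial_dependence(predict, X, cols, points):
    """Centered partial dependence of `predict` on columns `cols`,
    evaluated at each row of `points` (hypothetical helper)."""
    pd_values = np.empty(len(points))
    for i, p in enumerate(points):
        X_mod = X.copy()
        X_mod[:, cols] = p  # fix the chosen features, keep the others as observed
        pd_values[i] = predict(X_mod).mean()
    return pd_values - pd_values.mean()

def h_statistic(predict, X, j, k):
    """Squared Friedman H-statistic for the feature pair (j, k)."""
    pd_jk = partial_dependence(predict, X, [j, k], X[:, [j, k]])
    pd_j = partial_dependence(predict, X, [j], X[:, [j]])
    pd_k = partial_dependence(predict, X, [k], X[:, [k]])
    # Share of the joint effect's variance not explained by the two main effects
    return np.sum((pd_jk - pd_j - pd_k) ** 2) / np.sum(pd_jk ** 2)

# Toy model (an assumption for illustration): features 0 and 2 interact,
# features 0 and 1 do not.
def toy_model(X):
    return X[:, 0] + X[:, 1] + 2.0 * X[:, 0] * X[:, 2]

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))

h_01 = h_statistic(toy_model, X, 0, 1)  # expected to be close to zero
h_02 = h_statistic(toy_model, X, 0, 2)  # expected to be clearly positive
print(f"H^2(0,1) = {h_01:.3g}, H^2(0,2) = {h_02:.3g}")
```

Screening all pairs among p variables repeats this computation p(p-1)/2 times, which is the kind of cost a model-specific detection method aims to avoid.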