Feature interaction selection is a fundamental problem in commercial recommender systems. Most approaches enumerate all features and interactions indiscriminately and model each of them with the same predefined operation, chosen under expert guidance. Their recommendations are sometimes unsatisfactory for two reasons: (1) they cannot guarantee the learning ability of the model, because their architectures adapt poorly to tasks and data; (2) useless features and interactions introduce unnecessary noise and complicate training. In this paper, we aim to adaptively evolve the model so that it selects appropriate operations, features, and interactions under task guidance. Inspired by the evolution and functioning of natural organisms, we propose a novel Cognitive EvoLutionary Learning (CELL) framework, where cognitive ability refers to the property of organisms that allows them to react and survive in diverse environments. CELL consists of three stages: DNA search, genome search, and model functioning. Specifically, if we regard the relationship between models and tasks as the relationship between organisms and natural environments, then interactions of feature pairs are analogous to double-stranded DNA, and the relevant features and interactions among them are analogous to genomes. Along this line, we diagnose the fitness of the model with respect to operations, features, and interactions to simulate the survival rates of organisms under natural selection. We show that CELL can adaptively evolve into different models for different tasks and data, which gives practitioners access to off-the-shelf models. Extensive experiments on four real-world datasets demonstrate that CELL significantly outperforms state-of-the-art baselines. We also conduct synthetic experiments to verify that CELL consistently discovers the predefined interaction patterns for feature pairs.