Goal-Conditioned Supervised Learning for Multi-Objective Recommendation

Multi-objective learning aims to optimize multiple objectives concurrently with a single model, targeting high and balanced performance across them. However, it often involves a more complex optimization problem, particularly when objectives conflict, which leads to solutions with higher memory requirements and computational complexity. This paper introduces a Multi-Objective Goal-Conditioned Supervised Learning (MOGCSL) framework for automatically learning to achieve multiple objectives from offline sequential data. MOGCSL extends the conventional Goal-Conditioned Supervised Learning (GCSL) method to multi-objective scenarios by redefining goals from one-dimensional scalars to multi-dimensional vectors. This naturally eliminates the need for complex architectures and optimization constraints. MOGCSL also benefits from filtering out uninformative or noisy instances that do not achieve desirable long-term rewards, and it incorporates a novel goal-choosing algorithm to model and select "high" achievable goals for inference. While MOGCSL is quite general, we focus on its application to the next action prediction problem in commercial-grade recommender systems. In this context, any viable solution needs to be reasonably scalable and robust to the large amounts of noisy data that are characteristic of this application space. We show that MOGCSL performs well on both counts: extensive experiments on real-world recommendation datasets validate its efficacy and efficiency. We also include analysis and experiments that explain its strength in discounting the noisier portions of training data in recommender systems.
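To make the idea of vector-valued goals concrete, the following is a minimal sketch of how a logged session might be relabeled for goal-conditioned supervised training. It rests on assumptions not stated in the abstract: the goal at each step is taken to be the undiscounted sum of future reward vectors, and the two reward dimensions (click, purchase), the function names `vector_returns_to_go` and `relabel`, and the toy session are all hypothetical. It is not the authors' implementation and does not cover the goal-choosing algorithm used at inference.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def vector_returns_to_go(rewards: Sequence[Vector]) -> List[Vector]:
    """For each step t, sum the reward vectors from step t to the end of the session.

    Assumption: undiscounted sums, one component per objective.
    """
    if not rewards:
        return []
    running = [0.0] * len(rewards[0])
    goals: List[Vector] = []
    # Walk backwards so each step only adds its own reward to the running total.
    for reward in reversed(rewards):
        running = [acc + r for acc, r in zip(running, reward)]
        goals.append(tuple(running))
    goals.reverse()
    return goals


def relabel(
    states: Sequence[str],
    actions: Sequence[str],
    rewards: Sequence[Vector],
) -> List[Tuple[str, Vector, str]]:
    """Build (state, goal, action) tuples; a model would be trained to predict
    the action from the state and the vector goal with a standard supervised loss."""
    goals = vector_returns_to_go(rewards)
    return list(zip(states, goals, actions))


# Hypothetical session: each reward vector is (click, purchase).
states = ["s0", "s1", "s2"]
actions = ["item_a", "item_b", "item_c"]
rewards = [(1.0, 0.0), (0.0, 0.0), (1.0, 1.0)]

examples = relabel(states, actions, rewards)
for example in examples:
    print(example)
```

Under these assumptions, interactions that lead to little future reward on every objective receive low goal vectors. This is one way to read the abstract's remark that conditioning on "high" goals at inference discounts the noisier portions of the training data.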
