In many multi-objective machine learning applications, such as multi-task learning with conflicting objectives and multi-objective reinforcement learning, it is desirable to find a Pareto solution that matches a decision maker's given preference. These problems are often large-scale and come with gradient information, yet existing algorithms do not handle them well. To address this issue, this paper proposes a novel predict-and-correct framework for locating a Pareto solution that fits the decision maker's preference. The framework introduces a constraint function into the search process to align the solution with a user-specified preference; this constraint can be optimized simultaneously with the multiple objective functions. Experimental results show that the proposed method efficiently finds a particular Pareto solution demanded by a decision maker on standard multi-objective benchmarks, multi-task learning problems, and multi-objective reinforcement learning problems with thousands of decision variables. Code is available at: https://github.com/xzhang2523/pmgda. Our code is currently provided in the attached pgmda.rar file and will be open-sourced after publication.
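To make the idea of aligning a Pareto solution with a preference more concrete, the following is a minimal sketch on a toy problem. It is not the paper's algorithm: the two quadratic objectives, the constraint h(x) = p2*f1(x) - p1*f2(x), the closed-form two-gradient min-norm direction used as the "predict" step, the damped Gauss-Newton "correct" step, and all function names and step sizes are assumptions chosen purely for illustration.

```python
# Illustrative sketch only; every name and formula here is an assumption,
# not taken from the paper or its code release.
# Toy bi-objective problem:
#   f1(x) = (x0 - 1)^2 + x1^2,   f2(x) = (x0 + 1)^2 + x1^2
# Its Pareto set is the segment x1 = 0, -1 <= x0 <= 1.


def objectives(x):
    return [(x[0] - 1.0) ** 2 + x[1] ** 2, (x[0] + 1.0) ** 2 + x[1] ** 2]


def gradients(x):
    return [[2.0 * (x[0] - 1.0), 2.0 * x[1]],
            [2.0 * (x[0] + 1.0), 2.0 * x[1]]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def min_norm_direction(g1, g2):
    """Negative of the min-norm convex combination of two gradients."""
    diff = [a - b for a, b in zip(g1, g2)]  # g1 - g2
    denom = dot(diff, diff)
    if denom == 0.0:
        alpha = 0.5
    else:
        # alpha minimizes ||alpha*g1 + (1 - alpha)*g2||^2 over [0, 1]
        alpha = min(1.0, max(0.0, -dot(diff, g2) / denom))
    return [-(alpha * a + (1.0 - alpha) * b) for a, b in zip(g1, g2)]


def preference_constraint(x, pref):
    """Assumed constraint: h(x) = 0 when (f1, f2) is proportional to pref."""
    f1, f2 = objectives(x)
    g1, g2 = gradients(x)
    h = pref[1] * f1 - pref[0] * f2
    grad_h = [pref[1] * a - pref[0] * b for a, b in zip(g1, g2)]
    return h, grad_h


def predict_and_correct(x, pref, steps=500, lr=0.1, eta=0.5):
    x = list(x)
    for _ in range(steps):
        # Predict: move along a common descent direction of both objectives.
        g1, g2 = gradients(x)
        d = min_norm_direction(g1, g2)
        x = [xi + lr * di for xi, di in zip(x, d)]
        # Correct: damped Gauss-Newton step that shrinks |h(x)|.
        h, grad_h = preference_constraint(x, pref)
        scale = eta * h / (dot(grad_h, grad_h) + 1e-12)
        x = [xi - scale * gi for xi, gi in zip(x, grad_h)]
    return x


# Preference (1, 4): look for the Pareto point where f2 = 4 * f1.
x_star = predict_and_correct([-0.5, 1.0], pref=[1.0, 4.0])
print(x_star, objectives(x_star))
```

On this toy problem the preference-aligned Pareto point can be derived by hand: on the Pareto set, f2 = 4 f1 gives x = (1/3, 0). Alternating the two steps is only one simple way to combine descent with a preference constraint; the abstract states that the constraint is optimized simultaneously with the objectives, which this sketch does not attempt to reproduce.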