Pareto front profiling in multi-objective optimization (MOO), i.e., finding a diverse set of Pareto-optimal solutions, is challenging, especially with expensive objectives like neural network training. Typically, in MOO neural architecture search (NAS), we aim to balance performance and hardware metrics across devices. Prior NAS approaches simplify this task by incorporating hardware constraints into the objective function, but profiling the Pareto front then necessitates a computationally expensive search for each constraint. In this work, we propose a novel NAS algorithm that encodes user preferences for the trade-off between performance and hardware metrics, and yields representative and diverse architectures across multiple devices in just one search run. To this end, we parameterize the joint architectural distribution across devices and multiple objectives via a hypernetwork that can be conditioned on hardware features and preference vectors, enabling zero-shot transferability to new devices. Extensive experiments with up to 19 hardware devices and 3 objectives showcase the effectiveness and scalability of our method. Finally, we show that, without extra costs, our method outperforms existing MOO NAS methods across a broad range of qualitatively different search spaces and datasets, including MobileNetV3 on ImageNet-1k, an encoder-decoder transformer space for machine translation, and a decoder-only transformer space for language modelling.
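The preference-conditioned hypernetwork described above can be sketched minimally as follows. This is an illustrative toy, not the paper's actual architecture: the dimensions, the single hidden layer, and all names (`PreferenceHypernetwork`, `arch_distribution`) are assumptions made for exposition. The key idea it demonstrates is that one shared network maps a device embedding plus a user preference vector to a distribution over architecture choices, so sweeping the preference vector traces the Pareto front without re-running the search.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(x - x.max())
    return e / e.sum()

class PreferenceHypernetwork:
    """Toy hypernetwork: maps (hardware features, preference vector)
    to a categorical distribution over architecture choices.
    Sizes and the single tanh hidden layer are illustrative only."""

    def __init__(self, hw_dim, pref_dim, hidden_dim, num_choices):
        in_dim = hw_dim + pref_dim
        self.W1 = rng.normal(scale=0.1, size=(hidden_dim, in_dim))
        self.W2 = rng.normal(scale=0.1, size=(num_choices, hidden_dim))

    def arch_distribution(self, hw_features, preference):
        # Condition jointly on the device embedding and the user's
        # performance-vs-hardware trade-off preference.
        z = np.concatenate([hw_features, preference])
        h = np.tanh(self.W1 @ z)
        return softmax(self.W2 @ h)

# A preference vector on the 2-simplex: e.g. 0.7 weight on accuracy,
# 0.3 on latency. Varying it profiles the trade-off for one device.
hyper = PreferenceHypernetwork(hw_dim=4, pref_dim=2,
                               hidden_dim=16, num_choices=5)
hw = rng.normal(size=4)  # stand-in embedding of one target device
probs = hyper.arch_distribution(hw, np.array([0.7, 0.3]))
assert probs.shape == (5,) and np.isclose(probs.sum(), 1.0)
```

Because the device embedding is an input rather than a trained per-device parameter, feeding the embedding of an unseen device yields an architectural distribution zero-shot, which is the transferability property the abstract claims.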