In many machine learning applications, a model provider must further refine a previously trained model to meet the specific needs of local users. If the target data may be fed to the model, this reduces to the standard model tuning paradigm. In a wide range of practical cases, however, the target data is not shared with the model provider, although some evaluations of the model are commonly accessible. In this paper, we formally set up a challenge named \emph{Earning eXtra PerformancE from restriCTive feEDbacks} (EXPECTED) to describe this form of model tuning problem. Concretely, EXPECTED allows a model provider to query the operational performance of a candidate model multiple times via feedback from a local user (or a group of users). The goal of the model provider is to eventually deliver a satisfactory model to the local user(s) by using this feedback. Unlike existing model tuning methods, where the target data is always available for computing model gradients, the model provider in EXPECTED sees only feedback, which could be as simple as a scalar such as inference accuracy or usage rate. To enable tuning under this restriction, we propose to characterize the geometry of model performance with respect to the model parameters by exploring the distribution of the parameters. In particular, for deep models whose parameters are distributed across multiple layers, we further design a more query-efficient algorithm that conducts layerwise tuning and pays more attention to the layers that pay off better. Our theoretical analyses justify the proposed algorithms in terms of both efficacy and efficiency. Extensive experiments on different applications demonstrate that our work provides a sound solution to the EXPECTED problem.
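To make the setting concrete, the following is a minimal sketch of tuning from scalar feedback alone. It is not the algorithm proposed in the paper, which the abstract does not specify in detail; it swaps in a generic evolution-strategy update that keeps a Gaussian search distribution over the parameters and moves its mean using only queried feedback values. The names `tune_with_scalar_feedback` and `make_toy_feedback`, the hidden `target` vector, and all hyperparameters are hypothetical choices made for illustration.

```python
import random


def tune_with_scalar_feedback(feedback, init_params, n_rounds=200,
                              pop_size=10, sigma=0.1, lr=0.05, seed=0):
    """Evolution-strategy-style tuning that only queries scalar feedback.

    `feedback` is a hypothetical callable standing in for the local user:
    it takes a list of parameters and returns one scalar (higher is better).
    No target data and no model gradients are used.
    """
    rng = random.Random(seed)
    mean = list(init_params)  # mean of the Gaussian search distribution
    dim = len(mean)
    for _ in range(n_rounds):
        grad = [0.0] * dim
        for _ in range(pop_size):
            noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            plus = [m + sigma * e for m, e in zip(mean, noise)]
            minus = [m - sigma * e for m, e in zip(mean, noise)]
            # Antithetic pair: two feedback queries per sampled direction.
            diff = feedback(plus) - feedback(minus)
            for i in range(dim):
                grad[i] += diff * noise[i]
        # Monte Carlo estimate of the gradient of the expected feedback
        # with respect to the mean, followed by a gradient-ascent step.
        scale = lr / (2.0 * sigma * pop_size)
        mean = [m + scale * g for m, g in zip(mean, grad)]
    return mean


def make_toy_feedback(target):
    """Toy stand-in for a user: the score is higher closer to a hidden target."""
    def feedback(params):
        return -sum((p - t) ** 2 for p, t in zip(params, target))
    return feedback


target = [1.0, -2.0, 0.5]            # hidden from the provider (assumption)
feedback = make_toy_feedback(target)
start = [0.0, 0.0, 0.0]              # the previously trained model (toy)
tuned = tune_with_scalar_feedback(feedback, start)
print("feedback before:", feedback(start))
print("feedback after: ", feedback(tuned))
```

With these defaults the provider issues `n_rounds * pop_size * 2` feedback queries in total, which is why query efficiency matters in this setting. A layerwise variant in the spirit of the abstract would spend more of that budget on the parameter groups whose updates improve the feedback most, but how the paper does this is not stated here.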