ML APIs have largely relieved application developers of the burden of designing and training their own neural network models: classifying objects in an image can now take as little as one line of Python code that calls an API. However, these APIs serve the same pre-trained models regardless of how different applications use their output. This can be suboptimal, because not all ML inference errors cause application failures, and which inference errors do cause failures varies greatly across applications. To tackle this problem, we first study 77 real-world applications, which collectively use six ML APIs from two providers, to reveal common patterns in how ML API output affects applications' decision processes. Inspired by the findings, we propose ChameleonAPI, an optimization framework for ML APIs that takes effect without changing the application source code. ChameleonAPI provides application developers with a parser that automatically analyzes the application to produce an abstract of its decision process, which is then used to devise an application-specific loss function that penalizes only the API output errors critical to the application. ChameleonAPI uses this loss function to efficiently train a neural network model customized for each application and deploys it to serve API invocations from the respective application via the existing interface. Compared to a baseline that selects the best of all commercial ML APIs, we show that ChameleonAPI reduces incorrect application decisions by 43%.
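To make the distinction between critical and non-critical inference errors concrete, the following is a minimal sketch and not ChameleonAPI's actual parser or loss function. It assumes a hypothetical waste-sorting application that maps the labels returned by an image-classification API to one of three branches; the label sets and the names `app_decision`, `label_errors`, and `critical_error` are invented for illustration. The sketch contrasts a generic label-level error count with a 0/1 error that fires only when the application's decision changes.

```python
# Hypothetical application logic: all names and label sets below are
# illustrative assumptions, not taken from ChameleonAPI or any real API.
RECYCLE = {"bottle", "can", "paper"}
COMPOST = {"food", "fruit", "vegetable"}


def app_decision(labels):
    """Return the branch the application takes for a set of predicted labels."""
    if labels & RECYCLE:
        return "recycle"
    if labels & COMPOST:
        return "compost"
    return "trash"


def label_errors(predicted, truth):
    """Generic error: number of labels that are missing or spurious."""
    return len(predicted ^ truth)


def critical_error(predicted, truth):
    """Application-specific error: 1 only if the resulting decision differs."""
    return int(app_decision(predicted) != app_decision(truth))


truth = {"bottle", "plastic"}

# Three label-level mistakes, yet the application still chooses "recycle".
print(label_errors({"can"}, truth), critical_error({"can"}, truth))

# A wrong label that flips the branch to "compost" is a critical error.
print(label_errors({"fruit"}, truth), critical_error({"fruit"}, truth))
```

Under these assumptions, an API output can differ substantially from the ground truth without affecting the application, which is the kind of error an application-specific loss would leave unpenalized. A trainable loss would additionally need a differentiable form, which this sketch does not attempt.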