Real-time perception requires planned resource utilization. Computational planning in real-time perception is governed by two considerations: accuracy and latency. There exist runtime decisions (e.g., the choice of input resolution) that induce tradeoffs affecting performance on given hardware, arising from intrinsic (content, e.g., scene clutter) and extrinsic (system, e.g., resource contention) characteristics. Earlier runtime execution frameworks employed rule-based decision algorithms and operated with a fixed algorithm latency budget to balance these concerns, which is sub-optimal and inflexible. We propose Chanakya, a learned approximate execution framework that derives naturally from the streaming perception paradigm and instead learns the decisions induced by these tradeoffs automatically. Chanakya is trained via novel rewards that balance accuracy and latency implicitly, without approximating either objective. Chanakya simultaneously considers intrinsic and extrinsic context and predicts decisions in a flexible manner. Designed with low overhead in mind, Chanakya outperforms state-of-the-art static and dynamic execution policies on public datasets on both server GPUs and edge devices.