Traditional machine learning training is a static process in which hyperparameters cannot be adapted in real time. Popular runtime tuning solutions rely on checkpoints and schedulers. Adjusting hyperparameters usually requires restarting the program, which wastes time and compute utilization and places unnecessary strain on memory and processors. We present LiveTune, a new framework that allows real-time parameter tuning during training through LiveVariables. LiveVariables enable a continuous training session by storing parameters on designated ports on the system, allowing them to be dynamically adjusted. Extensive evaluations of our framework show savings of up to 60 seconds and 5.4 kilojoules of energy per hyperparameter change.
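To make the idea of a parameter stored on a designated port more concrete, the following is a minimal sketch and not LiveTune's actual API. It assumes a plain TCP listener on localhost and a one-line text protocol; the names `PortBackedVariable`, `send_update`, and `toy_training` are hypothetical and chosen only for illustration. The training loop re-reads the value on every step, so an update sent from another process could take effect without restarting the program.

```python
import socket
import socketserver
import threading


class PortBackedVariable:
    """Hypothetical stand-in for a live variable: a float updatable over TCP."""

    def __init__(self, initial, port=0, host="127.0.0.1"):
        self._value = float(initial)
        self._lock = threading.Lock()
        variable = self  # captured by the request handler below

        class Handler(socketserver.StreamRequestHandler):
            def handle(self):
                # Assumed protocol: one line of text holding the new value.
                line = self.rfile.readline().decode("utf-8").strip()
                try:
                    variable._set(float(line))
                    self.wfile.write(b"ok\n")
                except ValueError:
                    self.wfile.write(b"error\n")

        # port=0 asks the operating system for any free port.
        self._server = socketserver.ThreadingTCPServer((host, port), Handler)
        self._server.daemon_threads = True
        self.port = self._server.server_address[1]
        threading.Thread(target=self._server.serve_forever, daemon=True).start()

    def _set(self, value):
        with self._lock:
            self._value = value

    def __call__(self):
        # Read the current value; called by the training loop on each step.
        with self._lock:
            return self._value

    def close(self):
        self._server.shutdown()
        self._server.server_close()


def send_update(port, value, host="127.0.0.1"):
    """Send a new value to the variable listening on `port`; return its reply."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(f"{value}\n".encode("utf-8"))
        return conn.makefile("r").readline().strip()


def toy_training(lr, steps=5):
    """Gradient descent on f(w) = w**2, re-reading the learning rate each step."""
    w = 1.0
    for _ in range(steps):
        grad = 2.0 * w
        w -= lr() * grad
    return w
```

In this sketch, a session would create `lr = PortBackedVariable(0.1)`, pass `lr` to the training loop, and change it from a second terminal or script with `send_update(lr.port, 0.01)`. A real framework would also need to handle concerns this sketch ignores, such as authentication, non-numeric types, and discovery of which port holds which variable.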