Hybrid electric vehicles (HEVs) are becoming increasingly popular because they combine the complementary operating characteristics of internal combustion engines and electric motors. However, the minimum fuel consumption an HEV can achieve under battery electrical balance, for a given powertrain configuration and a given speed profile, has not yet been established in either academia or industry. To address this problem, this work formulates constrained optimal fuel consumption (COFC) mathematically from the perspective of constrained reinforcement learning (CRL) for the first time. Two mainstream CRL approaches, constrained variational policy optimization (CVPO) and a Lagrangian-based approach, are also applied for the first time to obtain the vehicle's minimum fuel consumption under the battery electrical balance condition. We conduct case studies on the well-known Toyota Hybrid System (THS) of the Prius under the New European Driving Cycle (NEDC), describe the key steps for implementing the CRL approaches, and compare the performance of CVPO and the Lagrangian-based approach. The case studies show that both approaches achieve low fuel consumption while satisfying the state-of-charge (SOC) balance constraint. CVPO converges stably, whereas the Lagrangian-based approach reaches the lowest fuel consumption, 3.95 L/100 km, at the cost of larger oscillations. These results verify the effectiveness of the proposed CRL approaches for the COFC problem.
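To give some intuition for the Lagrangian-based approach mentioned above, the following is a minimal sketch of a primal-dual update on a toy scalar problem. It is not the method or the vehicle model used in this work, and it involves no reinforcement learning: the quadratic "fuel" cost, the `min_engine_share` constraint standing in for SOC balance, the step sizes, and the function name `primal_dual` are all illustrative assumptions. The sketch only shows the general mechanism, in which the decision variable descends the Lagrangian while the multiplier ascends on the constraint violation.

```python
def primal_dual(steps=2000, lr_x=0.05, lr_lam=0.05, min_engine_share=0.6):
    """Toy primal-dual iteration (all quantities are hypothetical).

    x   : share of demanded power supplied by the engine, kept in [0, 1]
    f(x): x**2, a stand-in for fuel consumption (assumption)
    g(x): min_engine_share - x <= 0, a stand-in for the SOC balance
          constraint: the battery may not be drained on net (assumption)
    lam : Lagrange multiplier, kept non-negative
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the Lagrangian L = x**2 + lam * (min_engine_share - x)
        grad_x = 2.0 * x - lam
        # Positive when the constraint is violated
        violation = min_engine_share - x
        # Primal step: gradient descent on x, projected onto [0, 1]
        x = min(1.0, max(0.0, x - lr_x * grad_x))
        # Dual step: gradient ascent on lam, projected onto lam >= 0
        lam = max(0.0, lam + lr_lam * violation)
    return x, lam


if __name__ == "__main__":
    x, lam = primal_dual()
    print(f"engine share x = {x:.4f}, multiplier lam = {lam:.4f}")
```

In this toy setting the unconstrained minimum (x = 0) violates the constraint, so the multiplier grows until the iterate settles on the constraint boundary. With larger step sizes the same iteration overshoots and oscillates around that boundary, which is one common intuition for the oscillations often seen in Lagrangian-based CRL training.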