This study focuses on optimizing path planning for unmanned ground vehicles (UGVs) in precision agriculture using deep reinforcement learning (DRL) techniques in continuous action spaces. The research begins with a review of traditional grid-based methods, such as the A* and Dijkstra algorithms, and discusses their limitations in dynamic agricultural environments, highlighting the need for adaptive learning strategies. The study then explores DRL approaches, including Deep Q-Networks (DQN), which demonstrate improved adaptability and performance in two-dimensional simulations. Enhancements such as Double Q-Networks and Dueling Networks are evaluated to further improve decision-making. Building on these results, the focus shifts to continuous action space models, specifically Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3), which are tested in increasingly complex environments. Experiments conducted in a three-dimensional environment using ROS and Gazebo confirm the effectiveness of continuous DRL algorithms in navigating dynamic agricultural scenarios. Notably, the pretrained TD3 agent achieves a 95 percent success rate in dynamic environments, underscoring the robustness of the proposed approach in handling moving obstacles while ensuring safety for both crops and the robot.
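To make the core of TD3 concrete, the sketch below illustrates its clipped double-Q target computation, the mechanism that distinguishes it from DDPG. This is a minimal illustration, not the study's implementation: the network sizes, noise parameters, and the UGV state/action dimensions are assumptions chosen for the example.

```python
# A minimal, illustrative sketch of TD3's clipped double-Q target
# (not the authors' implementation). Dimensions and hyperparameters
# below are assumptions chosen for the example.
import torch
import torch.nn as nn

obs_dim, act_dim, act_limit = 8, 2, 1.0  # assumed UGV state/action sizes

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

actor_target = nn.Sequential(mlp(obs_dim, act_dim), nn.Tanh())  # actions in [-1, 1]
critic1_target = mlp(obs_dim + act_dim, 1)  # twin critics: taking their min
critic2_target = mlp(obs_dim + act_dim, 1)  # curbs Q-value overestimation

def td3_target(reward, next_obs, done, gamma=0.99, noise_std=0.2, noise_clip=0.5):
    """Compute the TD3 Bellman target y = r + gamma * (1 - done) * min(Q1', Q2')."""
    with torch.no_grad():
        # Target policy smoothing: add clipped noise to the target action.
        next_act = actor_target(next_obs) * act_limit
        noise = torch.clamp(noise_std * torch.randn_like(next_act),
                            -noise_clip, noise_clip)
        next_act = torch.clamp(next_act + noise, -act_limit, act_limit)
        # Clipped double-Q: take the smaller of the two target critics' estimates.
        q_in = torch.cat([next_obs, next_act], dim=-1)
        q_min = torch.min(critic1_target(q_in), critic2_target(q_in))
        return reward + gamma * (1.0 - done) * q_min

# Example call on a dummy batch of transitions.
batch = 4
y = td3_target(torch.randn(batch, 1), torch.randn(batch, obs_dim),
               torch.zeros(batch, 1))
print(y.shape)  # torch.Size([4, 1])
```

In a full training loop, both critics would be regressed toward this single target, with delayed actor and target-network updates completing the TD3 recipe.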