Multi-agent deep reinforcement learning in the path planning problem
DOI:
https://doi.org/10.31548/energiya1(83).2026.101
Keywords:
unmanned aerial vehicles, precision agriculture, multi-agent systems, artificial intelligence
Abstract
This paper addresses multi-agent deep reinforcement learning for the path planning problem. It justifies the use of UAV swarms in precision agriculture and shows that operating drone swarms requires artificial intelligence, in particular reinforcement learning. The path planning task is formulated for conditions in which GPS navigation is of poor quality or unavailable. The Multi-Agent Proximal Policy Optimization (MAPPO) method is proposed for this task. The results show high-quality path planning in the presence of obstacles and with poor or absent GPS navigation.
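The abstract names Multi-Agent Proximal Policy Optimization but does not give its training objective. As a minimal sketch, assuming the standard PPO clipped surrogate objective applied to samples pooled over all agents, the core computation might look like the following. The function name, the argument layout, and the default clipping value of 0.2 are illustrative assumptions and are not taken from the paper; in typical MAPPO setups the advantages would come from a centralized critic, which is not shown here.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """PPO clipped surrogate objective, averaged over all (agent, step) samples.

    Hypothetical layout: each argument is a list of per-agent lists, so
    new_logp[i][t] is the log-probability of agent i's action at step t
    under the current policy, old_logp[i][t] the same under the policy
    that collected the data, and advantages[i][t] the advantage estimate.
    """
    total, count = 0.0, 0
    for agent_new, agent_old, agent_adv in zip(new_logp, old_logp, advantages):
        for lp_new, lp_old, adv in zip(agent_new, agent_old, agent_adv):
            # Probability ratio between the current and the data-collecting policy.
            ratio = math.exp(lp_new - lp_old)
            # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
            clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
            # Pessimistic (lower) bound of the unclipped and clipped terms.
            total += min(ratio * adv, clipped * adv)
            count += 1
    return total / count
```

The objective is maximized during training. Taking the minimum of the two terms removes the incentive to move the policy far from the one that gathered the trajectories, which is the usual motivation for PPO-style updates.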
Received 2025-12-27
Received 2026-02-02
Accepted 2026-02-11
License
Copyright (c) 2026 Energy and Automation

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
All materials are disseminated under the terms of the Creative Commons Attribution 4.0 International Public License, which permits others to distribute the manuscript with proper acknowledgement of the authorship and the original publication in this journal.