Situation:

Our client wanted to understand how AI could enable the autonomous coordination of a heterogeneous fleet of vehicles (both manned and unmanned) in complex and dynamic environments. The main communication and coordination challenges were a lack of shared data between vehicles and the need to capture and integrate local information, a complex and time-consuming task for human operators.

To evaluate AI as a solution, our client commissioned DSL to develop a proof of concept (PoC) that leveraged reinforcement learning to enable autonomous coordination of the fleet. The PoC included a simulator environment, which would provide a fast and realistic platform for strategy and planning purposes. This environment would enable the evaluation and benchmarking of the AI's performance in complex and dynamic environments, as well as the testing of different coordination and communication strategies.

Ultimately, the PoC would enable the fleet to assist or act independently in critical operations such as search and rescue, interception of targets, and monitoring of large areas. The end goal was to improve operational efficiency and effectiveness, while reducing the need for human intervention in high-risk situations.

Task:

DSL developed a simulation environment that handles a large number of agents (individual vehicles that must learn and make decisions within an environment) while still simulating complex and dynamic environments quickly. The environment included multiple types of agents, such as ships, aeroplanes, helicopters, submarines and ground stations, and used realistic dynamics and sensor models, such as radar and vision.
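As a rough illustration of what such an environment involves, the sketch below steps a small mixed fleet forward in time and applies a simple range-based sensor check. It is not DSL's simulator: the `Agent` class, its fields, the straight-line motion model and the circular sensor range are all assumptions made for this example, and a realistic radar or vision model would be considerably more detailed.

```python
import math
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical agent record; all field names are illustrative."""
    name: str
    kind: str            # free-text label, e.g. "ship" or "helicopter"
    x: float
    y: float
    speed: float         # distance units per second
    heading: float       # radians, 0 points along the +x axis
    sensor_range: float  # radius of an idealised circular sensor

    def step(self, dt: float) -> None:
        # Straight-line kinematics: an assumed stand-in for real dynamics.
        self.x += self.speed * math.cos(self.heading) * dt
        self.y += self.speed * math.sin(self.heading) * dt


def detections(agents):
    """Map each agent's name to the names of the other agents it can sense."""
    seen = {}
    for a in agents:
        seen[a.name] = [
            b.name
            for b in agents
            if b is not a
            and math.hypot(a.x - b.x, a.y - b.y) <= a.sensor_range
        ]
    return seen


def simulate(agents, steps, dt=1.0):
    """Advance every agent `steps` times, then report who senses whom."""
    for _ in range(steps):
        for a in agents:
            a.step(dt)
    return detections(agents)


fleet = [
    Agent("ship-1", "ship", 0.0, 0.0, 10.0, 0.0, 50.0),
    Agent("heli-1", "helicopter", 100.0, 0.0, 0.0, 0.0, 80.0),
    Agent("station-1", "ground_station", 0.0, 200.0, 0.0, 0.0, 30.0),
]
print(simulate(fleet, steps=3))
```

After three steps the ship has moved to within the helicopter's sensor range but not the other way round, which shows how agents with different sensors can hold different local views of the same scene.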

The PoC used multi-agent reinforcement learning with a graph-based representation of the state to capture the complexity of the problem and enable efficient, scalable computation. Through extensive experimentation and optimisation, DSL demonstrated that the RL algorithm could enable the emergence of cooperative behaviour between agents, allowing them to achieve a broad range of objectives.
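One way to picture a graph-based state is to treat each agent as a node, link agents that are close enough to exchange information, and let each agent's observation combine its own features with a summary of its neighbours' features. The sketch below shows that idea in plain Python. The function names, the fixed communication range and the use of mean aggregation are assumptions chosen for illustration; the source does not describe the representation or network that DSL actually used.

```python
import math


def build_graph(positions, comm_range):
    """Adjacency list linking agents i and j when they are within comm_range."""
    n = len(positions)
    neighbours = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= comm_range:
                neighbours[i].append(j)
                neighbours[j].append(i)
    return neighbours


def aggregate(features, neighbours):
    """One round of mean aggregation over the graph.

    Each agent's observation is its own feature vector concatenated with
    the mean of its neighbours' vectors (zeros if it has no neighbours).
    """
    dim = len(features[0])
    observations = []
    for i, own in enumerate(features):
        nbrs = neighbours[i]
        if nbrs:
            mean = [sum(features[j][k] for j in nbrs) / len(nbrs)
                    for k in range(dim)]
        else:
            mean = [0.0] * dim
        observations.append(list(own) + mean)
    return observations


# Three agents: the first two are 5 units apart, the third is isolated.
positions = [(0.0, 0.0), (3.0, 4.0), (100.0, 100.0)]
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

graph = build_graph(positions, comm_range=10.0)
print(graph)                       # {0: [1], 1: [0], 2: []}
print(aggregate(features, graph))
```

Because each observation has the same length regardless of how many neighbours an agent has, a single policy could in principle be shared across fleets of different sizes, which is one common reason for choosing a graph representation in multi-agent settings.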

Result:

The PoC developed by DSL demonstrated the potential of autonomous coordination of a heterogeneous fleet of vehicles in complex and dynamic environments. The RL algorithm enabled agents to learn cooperative behaviours, improving the overall effectiveness and efficiency of the fleet in various missions. By leveraging the simulation environment, the client was able to test and evaluate different strategies under a broad range of scenarios.

Thanks to the scalability of our approach and design, the developed algorithm can be further improved with training in richer simulation environments and more advanced techniques, and can be adapted to various real-world scenarios, such as agriculture, maritime and aerial surveillance, border security, and disaster response. The successful PoC implementation has provided a strong foundation for future research and development in the field of autonomous fleet coordination and control.

Coordinating multi-vehicle fleets of agents with reinforcement learning
