Leveraging model-based reinforcement learning for intervention and surveying in underwater environments

Aim

To develop novel reinforcement learning algorithms that leverage modelling information.

Objectives

1.   Reduce data dependency of RL methods.
2.   Develop model-based RL algorithms for underwater vehicles.
3.   Evaluate the algorithms on real underwater robotic platforms.

Description

Recent events, such as the Nord Stream 2 incident, have shown the importance of protecting critical marine assets such as underwater pipelines, underwater cables and offshore wind farms. By their nature, these infrastructures are hard to monitor and protect. The research proposed in this project could lead to underwater drones capable of autonomous surveying and monitoring, helping to secure marine assets.

This PhD project focuses on overcoming a critical limitation in underwater robotics: the excessive training data requirements of reinforcement learning (RL) [1]. While RL has proven effective for many robotics tasks, its time-intensive training hinders real-world deployment. The problem is particularly acute in underwater environments, where little data is available and collecting more is time-consuming and resource-intensive: it involves deploying specialized equipment, operating in harsh conditions, and dealing with limited communication and visibility. As a result, traditional RL methods are severely limited in marine applications.

The proposed research aims to overcome this limitation through model-based RL. Traditional (model-free) RL algorithms often rely on extensive trial-and-error interaction with the environment to learn optimal behaviours, which is impractical in underwater settings. Model-based RL [2], by contrast, uses learned models of the environment that can simulate the dynamics of the underwater world. By interacting with these models, the agent can gain experience and learn effective strategies without excessive real-world data collection. This approach promises to drastically reduce the data dependency of RL algorithms, making them more feasible and efficient for real-world underwater applications.
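
To make the idea of learning from a model concrete, the sketch below uses Dyna-Q, a classic tabular model-based method described in [1]. It is an illustration only and not the algorithm this project will develop. The environment is a hypothetical six-state corridor standing in for costly vehicle trials, and every name and parameter (`step`, `dyna_q`, `N_STATES`, the learning rate and so on) is an assumption made for the example. After each real step, the agent stores the observed transition in a learned model and replays several simulated transitions from it, so each real interaction is reused many times.

```python
import random

N_STATES = 6      # hypothetical 1-D corridor; the last state is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy 'real' environment, a stand-in for expensive trials at sea."""
    if action == 0:
        nxt = max(0, state - 1)
    else:
        nxt = min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_update(q, s, a, r, s2, done, alpha, gamma):
    """One-step Q-learning update, used for real and simulated experience."""
    future = 0.0 if done else gamma * max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + future - q[(s, a)])


def dyna_q(real_steps, planning_steps, seed=0, alpha=0.5, gamma=0.9, eps=0.2):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    model = {}  # learned model: (state, action) -> (next_state, reward, done)
    state = 0
    for _ in range(real_steps):
        # Epsilon-greedy action selection with random tie-breaking.
        if rng.random() < eps:
            action = rng.choice(ACTIONS)
        else:
            best = max(q[(state, a)] for a in ACTIONS)
            action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])

        # One real interaction: learn from it and record it in the model.
        nxt, reward, done = step(state, action)
        q_update(q, state, action, reward, nxt, done, alpha, gamma)
        model[(state, action)] = (nxt, reward, done)

        # Planning: replay transitions sampled from the learned model.
        for _ in range(planning_steps):
            (s, a), (s2, r, d) = rng.choice(list(model.items()))
            q_update(q, s, a, r, s2, d, alpha, gamma)

        state = 0 if done else nxt
    return q


if __name__ == "__main__":
    q = dyna_q(real_steps=500, planning_steps=20)
    policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
    print("greedy action per non-goal state:", policy)
```

Setting `planning_steps=0` reduces the sketch to ordinary model-free Q-learning, which gives a simple baseline for the same budget of real steps. A real underwater system would need a learned approximate dynamics model over continuous states rather than a lookup table.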

References

[1] Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction. The MIT Press.
[2] Moerland, T. M., Broekens, J., Plaat, A., & Jonker, C. M. (2023). Model-based reinforcement learning: A survey. Foundations and Trends® in Machine Learning, 16(1), 1-118.

Research theme:

Autonomous Sensing Platforms

Principal supervisor:

Dr Ignacio Carlucho
Heriot-Watt University, School of Engineering & Physical Sciences
ignacio.carlucho@hw.ac.uk

Assistant supervisor:

Professor Yvan Petillot
Heriot-Watt University, School of Engineering & Physical Sciences
y.r.petillot@hw.ac.uk

EPSRC and MoD CDT in Sensing, Processing, and AI for Defence and Security (SPADS)
