Learning to Shape Rewards Using a Game of Two Partners

Mguni, D; Jafferjee, T; Wang, J; Perez-Nieves, N; Song, W; Tong, F; Taylor, ME; ... Yang, Y; (2023) Learning to Shape Rewards Using a Game of Two Partners. In: Williams, B and Chen, Y and Neville, J, (eds.) Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI 2023). (pp. 11604-11612). Association for the Advancement of Artificial Intelligence: Washington, D.C., USA. Green open access

Text: 2103.09159.pdf - Accepted Version (4MB)

Abstract

Reward shaping (RS) is a powerful method in reinforcement learning (RL) for overcoming the problem of sparse or uninformative rewards. However, RS typically relies on manually engineered shaping-reward functions whose construction is time-consuming and error-prone. It also requires domain knowledge, which runs contrary to the goal of autonomous learning. We introduce Reinforcement Learning Optimising Shaping Algorithm (ROSA), an automated reward shaping framework in which the shaping-reward function is constructed in a Markov game between two agents. A reward-shaping agent (Shaper) uses switching controls to determine the states at which to add shaping rewards for more efficient learning, while the other agent (Controller) learns the optimal policy for the task using these shaped rewards. We prove that ROSA, which adopts existing RL algorithms, learns to construct a shaping-reward function that is beneficial to the task, thus ensuring efficient convergence to high-performance policies. We demonstrate ROSA’s properties in three didactic experiments and show its superior performance against state-of-the-art RS algorithms in challenging sparse reward environments.
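
As a rough illustration of the idea in the abstract, the sketch below shows a shaping reward that is added only in states where a switch is active. It is not the authors' implementation. The names `SwitchedShaper` and `shaped_reward`, the toy chain of integer states, and the potential-based form of the shaping term are all assumptions made for this example. In ROSA the Shaper learns where to switch shaping on and what reward to add, whereas here both are fixed by hand.

```python
from dataclasses import dataclass
from typing import Callable

State = int  # toy assumption: states are positions on a 1-D chain


@dataclass
class SwitchedShaper:
    """Hypothetical, hand-coded stand-in for a learned Shaper agent."""

    switch: Callable[[State], bool]      # True in states where shaping is active
    potential: Callable[[State], float]  # assumed potential function over states
    gamma: float = 0.99                  # discount factor

    def shaping_reward(self, s: State, s_next: State) -> float:
        # When the switch is off, no shaping reward is added in this state.
        if not self.switch(s):
            return 0.0
        # Potential-based shaping term, used here only for illustration.
        return self.gamma * self.potential(s_next) - self.potential(s)


def shaped_reward(env_reward: float, shaper: SwitchedShaper,
                  s: State, s_next: State) -> float:
    """Reward seen by the Controller: task reward plus any shaping reward."""
    return env_reward + shaper.shaping_reward(s, s_next)


# Toy setup: goal at state 5, shaping switched on only from state 2 onwards.
GOAL = 5
shaper = SwitchedShaper(
    switch=lambda s: s >= 2,
    potential=lambda s: -float(abs(GOAL - s)),
    gamma=1.0,
)

if __name__ == "__main__":
    for s, s_next, r in [(0, 1, 0.0), (3, 4, 0.0), (4, 5, 1.0)]:
        print(s, s_next, shaped_reward(r, shaper, s, s_next))
```

In this sketch a Controller trained with any standard RL algorithm would consume `shaped_reward(...)` in place of the sparse task reward. The switch keeps shaping confined to the states selected for it.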

Type:	Proceedings paper
Title:	Learning to Shape Rewards Using a Game of Two Partners
Event:	37th AAAI Conference on Artificial Intelligence (AAAI 2023)
Dates:	7 Feb 2023 - 14 Feb 2023
ISBN-13:	9781577358800
Open access status:	An open access version is available from UCL Discovery
Publisher version:	https://ojs.aaai.org/index.php/AAAI/article/view/2...
Language:	English
Additional information:	This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification:	UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI:	https://discovery-pp.ucl.ac.uk/id/eprint/10185765
