UCL Discovery Stage

Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes

Menda, Kunal; Chen, Yi-Chun; Grana, Justin; Bono, James W; Tracey, Brendan D; Kochenderfer, Mykel J; Wolpert, David; (2019) Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes. IEEE Transactions on Intelligent Transportation Systems, 20 (4) pp. 1259-1268. 10.1109/TITS.2018.2848264. Green open access

Text: Akchen_DRL.pdf - Accepted Version (246kB)

Abstract

The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic durations, multiple agents executing decentralized policies in cooperative environments must act asynchronously. We present an algorithm that modifies generalized advantage estimation for temporally extended actions, allowing a state-of-the-art policy optimization algorithm to optimize policies in Dec-POMDPs in which agents act asynchronously. We show that our algorithm is capable of learning optimal policies in two cooperative domains, one involving real-time bus holding control and one involving wildfire fighting with unmanned aircraft. Our algorithm works by framing problems as 'event-driven decision processes,' which are scenarios in which the sequence and timing of actions and events are random and governed by an underlying stochastic process. In addition to optimizing policies with continuous state and action spaces, our algorithm also facilitates the use of event-driven simulators, which do not require time to be discretized into time-steps. We demonstrate the benefit of using event-driven simulation in the context of multiple agents taking asynchronous actions. We show that fixed time-step simulation risks obfuscating the sequence in which closely separated events occur, adversely affecting the policies learned. In addition, we show that arbitrarily shrinking the time-step scales poorly with the number of agents.
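
The idea of adapting generalized advantage estimation to temporally extended actions can be illustrated with a minimal sketch. It is not taken from the paper: it assumes one common semi-Markov generalization in which the discount between consecutive decision points is raised to the power of the elapsed time. The function name `event_driven_gae`, its parameters, and the convention that `rewards[k]` is already the (discounted) reward accumulated while macro-action `k` ran are all hypothetical choices made for illustration.

```python
def event_driven_gae(rewards, values, durations, gamma=0.99, lam=0.95):
    """Advantage estimates for one agent's sequence of macro-actions.

    Assumed (hypothetical) conventions, not the paper's definitions:
      rewards[k]   reward accumulated while macro-action k was executing
      values[k]    critic estimate at decision point k (one more entry
                   than rewards, for the state after the last action)
      durations[k] time elapsed between decision points k and k + 1
    """
    n = len(rewards)
    if len(values) != n + 1 or len(durations) != n:
        raise ValueError("expected len(values) == len(rewards) + 1 "
                         "and len(durations) == len(rewards)")

    advantages = [0.0] * n
    running = 0.0
    # Work backwards so each advantage reuses the one that follows it.
    for k in reversed(range(n)):
        # Discount the next value by the time the macro-action took.
        delta = rewards[k] + gamma ** durations[k] * values[k + 1] - values[k]
        # Decay later residuals by (gamma * lam) over the same elapsed time.
        running = delta + (gamma * lam) ** durations[k] * running
        advantages[k] = running
    return advantages


if __name__ == "__main__":
    # Two macro-actions lasting 2.0 and 1.0 time units respectively.
    print(event_driven_gae([1.0, 1.0], [0.0, 0.0, 0.0], [2.0, 1.0],
                           gamma=0.5, lam=1.0))
```

When every duration equals 1, this sketch reduces to standard generalized advantage estimation over fixed time-steps. Because each agent's decision points occur at its own times, the computation would be run separately on each agent's trajectory.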

Type: Article
Title: Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/TITS.2018.2848264
Publisher version: http://dx.doi.org/10.1109/tits.2018.2848264
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Science & Technology; Technology; Engineering, Civil; Engineering, Electrical & Electronic; Transportation Science & Technology; Engineering; Transportation; Artificial intelligence; autonomous vehicles; discrete event simulation; distributed decision-making; neural networks; multi-agent systems
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > UCL School of Management
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10188334