Huang, Y; Zhang, Z; Che, J; Yang, Z; Yang, Q; Wong, KK; (2023) Self-attention reinforcement learning for multi-beam combining in mmWave 3D-MIMO systems. Science China Information Sciences, 66 (6), Article 162304. 10.1007/s11432-022-3542-6.
Text: SCIS_Self_Attention_Reinforcement_learning_Framework_for_Beam__Combining_in_Millimeter_Wave_MIMO_System.pdf - Accepted Version (7MB)
Abstract
Machine learning (ML) is now used across all aspects of wireless communication system design. Reinforcement learning (RL)-based approaches in particular have attracted considerable research attention because they can interact with the environment directly and learn efficiently from the collected experience. In this paper, we propose a novel and efficient RL-based multi-beam combining scheme for future millimeter-wave (mmWave) three-dimensional (3D) multi-input multi-output (MIMO) communication systems. The proposed scheme requires neither perfect channel state information (CSI) nor precise user location information, both of which are generally difficult to obtain in practice, and it addresses the crucial challenge of computational complexity incurred by the very large state and action spaces associated with multiple users, multiple paths, and multiple 3D beams. In particular, a self-attention deep deterministic policy gradient (DDPG)-based beam selection and combination framework is proposed to adaptively learn the 3D beamforming pattern without CSI. We aim to maximize the sum-rate of the mmWave 3D-MIMO system by optimizing the serving beam set and the corresponding combining weights for each user. To this end, a transformer is incorporated into the DDPG to obtain global information about the input elements and capture the signal directions precisely, which leads to a near-optimal beamformer design. Simulation results verify the superiority of the proposed self-attention DDPG over conventional ML-based beamforming schemes in terms of sum-rate under various scenarios.
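The self-attention step mentioned in the abstract can be illustrated with a minimal sketch of scaled dot-product self-attention. This is not the authors' architecture: it assumes identity query, key, and value projections (a real transformer layer learns these), and the per-beam feature vectors in `beams` are hypothetical values chosen only for illustration. It shows how each output mixes information from all input elements, which is the "global information" property the abstract refers to.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    x is a list of n feature vectors (each a list of d floats), for example
    one vector per candidate beam. Returns (outputs, weights), where
    weights[i][j] is how strongly element i attends to element j.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    # scores[i][j] = <x_i, x_j> / sqrt(d)
    scores = [
        [sum(a * b for a, b in zip(xi, xj)) / scale for xj in x]
        for xi in x
    ]
    weights = [softmax(row) for row in scores]
    # Each output is a weighted average of all input vectors.
    outputs = [
        [sum(w * xj[k] for w, xj in zip(row, x)) for k in range(d)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical per-beam features (assumption, not data from the paper).
beams = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(beams)
```

In this toy input, the first two vectors point in nearly the same direction, so they attend to each other more strongly than to the third. In a DDPG setting, a layer of this kind would typically sit inside the actor or critic network; where the paper places it is not stated in the abstract.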
Type: | Article |
---|---|
Title: | Self-attention reinforcement learning for multi-beam combining in mmWave 3D-MIMO systems |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1007/s11432-022-3542-6 |
Publisher version: | https://doi.org/10.1007/s11432-022-3542-6 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | reinforcement learning (RL), deep deterministic policy gradient (DDPG), self-attention, pre-coding/combining, millimeter-wave (mmWave), multi-input multi-output (MIMO) |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10171234 |