Ramesh, Shyam Sundhar; Hu, Yifan; Chaimalas, Iason; Mehta, Viraj; Sessa, Pier Giuseppe; Ammar, Haitham Bou; Bogunovic, Ilija (2024) Group Robust Preference Optimization in Reward-free RLHF. In: Advances in Neural Information Processing Systems (NeurIPS 2024). NeurIPS (In press).
Text: Robust_DPO_Neurips_CR_version_4.pdf - Accepted Version (1MB)

Type: | Proceedings paper |
---|---|
Title: | Group Robust Preference Optimization in Reward-free RLHF |
Event: | 38th Conference on Neural Information Processing Systems (NeurIPS 2024) |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://papers.nips.cc/ |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10199800 |