UCL Discovery Stage

Comparison of generative AI performance on undergraduate and postgraduate written assessments in the biomedical sciences

Williams, Andrew; (2024) Comparison of generative AI performance on undergraduate and postgraduate written assessments in the biomedical sciences. International Journal of Educational Technology in Higher Education, 21, Article 52. 10.1186/s41239-024-00485-y. Green open access

Abstract

The value of generative AI tools in higher education has received considerable attention. Although there are many proponents of their value as learning tools, many are concerned about academic integrity and the use of these tools by students to compose written assessments. This study evaluates and compares the output of three commonly used generative AI tools: ChatGPT, Bing and Bard. Each AI tool was prompted with an essay question from undergraduate (UG) level 4 (year 1), level 5 (year 2), level 6 (year 3) and postgraduate (PG) level 7 biomedical sciences courses. Anonymised AI-generated output was then evaluated by four independent markers according to specified marking criteria and matched to the Frameworks for Higher Education Qualifications (FHEQ) of UK level descriptors. Percentage scores and ordinal grades were given for each marking criterion across AI-generated papers, inter-rater reliability was calculated using Kendall’s coefficient of concordance, and generative AI performance was ranked. Across all UG and PG levels, ChatGPT performed better than Bing or Bard in areas of scientific accuracy, scientific detail and context. All AI tools performed consistently well at PG level compared to UG level, although only ChatGPT consistently met levels of high attainment at all UG levels. ChatGPT and Bing did not provide adequate references, while Bing falsified references. In conclusion, generative AI tools are useful for providing scientific information consistent with the academic standards required of students in written assignments. These findings have broad implications for the design, implementation and grading of written assessments in higher education.
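
For readers unfamiliar with the inter-rater statistic named in the abstract, the following is a minimal sketch of how Kendall's coefficient of concordance (W) can be computed from a table of marker scores. It is not the author's analysis code. The function names `average_ranks` and `kendalls_w` and the example scores are hypothetical, and the sketch uses the basic formula W = 12S / (m^2 (n^3 - n)) without the correction for tied ranks, which a full analysis may need.

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """ratings[r][i] is the score marker r gave to item i.

    Returns W between 0 (no agreement) and 1 (identical rankings).
    Assumption: no tie correction is applied.
    """
    m = len(ratings)       # number of markers
    n = len(ratings[0])    # number of items being ranked
    ranked = [average_ranks(row) for row in ratings]
    rank_sums = [sum(ranked[r][i] for r in range(m)) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((t - mean_sum) ** 2 for t in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical scores (not data from the study): four markers, three essays.
example = [
    [70, 60, 50],
    [72, 65, 55],
    [68, 62, 48],
    [75, 58, 52],
]
w = kendalls_w(example)
```

In this made-up example every marker orders the three essays the same way, so `w` is 1. Disagreement among markers pulls the value toward 0.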

Type: Article
Title: Comparison of generative AI performance on undergraduate and postgraduate written assessments in the biomedical sciences
Open access status: An open access version is available from UCL Discovery
DOI: 10.1186/s41239-024-00485-y
Publisher version: http://dx.doi.org/10.1186/s41239-024-00485-y
Language: English
Additional information: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Keywords: Assessment, Artificial intelligence, Higher education, Academic writing, ChatGPT, Essay, Biomedical science, Medicine
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Div of Medicine
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10197256
Downloads since deposit: 612