UCL Discovery Stage

A remote fine grained scheduler, with a case study on an Nvidia BlueField DPU device

Legg, Cyrus James Grahame; (2024) A remote fine grained scheduler, with a case study on an Nvidia BlueField DPU device. Doctoral thesis (Ph.D), UCL (University College London). Green open access

Text: Legg_10192280_Thesis. edited version - redacted.pdf (4MB)

Abstract

The QuickSched fine-grained task-based scheduler (used in the Swift astrophysical smoothed particle hydrodynamics code) was modified so that the scheduler is located in a separate process from the computational threads, with the computational threads calling the scheduler functions via an RPC message loop. Thus, the time-divided scheduler of QuickSched on the computation threads was replaced by a dedicated ‘remote’ process. The efficiency of the scheduler was analysed in view of the likely detriment caused by the messaging. The investigation used an existing example, the tiled QR factorisation, and various locations of the remote scheduler were tested: on the same host as the computation, on remote hosts on the same LAN as the computational host, and on the general-purpose Arm processor of BlueField cards located on these hosts. Under optimisation of the tile size, a region of high performance was found, and it was the same region for the original and new remote schedulers. Within that region the new remote scheduler performed as well as the original to within a few percent, and so the new scheduler location is viable despite the additional messaging latency. The possibilities for extra scheduler functions, opened up by the extra resources made available to the scheduler by being in its own process, are discussed. The mechanisms affecting the performance change between the original and new schedulers are complex. In the optimised region, message latency can be insignificant in some cases, and in others a decrease in the time spent in kernels on changing to the new scheduler can partially but significantly compensate for the latency introduced. QuickSched’s scheduler rule of keeping unoccupied threads fed with ready tasks was seen to dilute the effectiveness of the rule that allocates tasks to cores already holding the task’s input data in their cache.

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: A remote fine grained scheduler, with a case study on an Nvidia BlueField DPU device
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2023. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Physics and Astronomy
UCL
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10192280
Downloads since deposit: 784
