Giordano, Mosè; Klöwer, Milan; Churavy, Valentin (2022) Productivity meets Performance: Julia on A64FX. arXiv.org: Ithaca (NY), USA.
Text
Giordano_Productivity meets Performance- Julia on A64FX_pre-print.pdf - Submitted Version (2MB)
Abstract
The Fujitsu A64FX ARM-based processor is used in supercomputers such as Fugaku in Japan and Isambard 2 in the UK. It provides an interesting combination of hardware features, such as the Scalable Vector Extension (SVE) and native support for reduced-precision floating-point arithmetic. The goal of this paper is to explore the performance of the Julia programming language on the A64FX processor, with a particular focus on reduced precision. Here, we present a performance study on axpy to verify the compilation pipeline, demonstrating that Julia can match the performance of tuned libraries. Additionally, we investigate Message Passing Interface (MPI) scalability and throughput on Fugaku, showing next to no significant overhead from Julia or its MPI interface. To explore the usability of Julia for targeting various floating-point precisions, we present results of ShallowWaters.jl, a shallow water model that can be executed at various levels of precision. Even for such complex applications, Julia's type-flexible programming paradigm offers both productivity and performance.
Type: | Working / discussion paper |
---|---|
Title: | Productivity meets Performance: Julia on A64FX |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://arxiv.org/abs/2207.12762 |
Language: | English |
Additional information: | This version is the author manuscript. For information on re-use, please refer to the publisher's terms and conditions. |
Keywords: | Distributed, Parallel, and Cluster Computing |
UCL classification: | UCL |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10177454 |