eprintid: 10069632
rev_number: 19
eprint_status: archive
userid: 608
dir: disk0/10/06/96/32
datestamp: 2019-03-06 16:29:17
lastmod: 2021-10-16 22:08:38
status_changed: 2019-03-06 16:29:17
type: proceedings_section
metadata_visibility: show
creators_name: Syrivelis, D
creators_name: Reale, A
creators_name: Katrinis, K
creators_name: Syrigos, I
creators_name: Bielski, M
creators_name: Theodoropoulos, D
creators_name: Pnevmatikatos, DN
creators_name: Zervas, G
title: A software-defined architecture and prototype for disaggregated memory rack scale systems
ispublished: pub
divisions: UCL
divisions: B04
divisions: C05
divisions: F46
keywords: disaggregation, extended memory, serverless computing, pooled computing, rack scale systems, rack scale datacenters, software-defined systems, cloud datacenters, internetscale computer
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
abstract: Disaggregation and rack-scale systems have the potential of drastically decreasing TCO and increasing utilization of cloud datacenters, while maintaining performance. In this paper, we present a novel rack-scale system architecture featuring software-defined remote memory disaggregation. Our hardware design and operating system extensions enable unmodified applications to dynamically attach to memory segments residing on physically remote memory pools and use such remote segments in a byte-addressable manner, as if they were local to the application. Our system also features a control plane that automates software-defined dynamic matching of compute to memory resources, as driven by datacenter workload needs. We prototyped our system on the commercially available Zynq Ultrascale+ MPSoC platform. To our knowledge, this is the first time a software-defined disaggregated system has been prototyped on commercial hardware and evaluated through industry standard software benchmarks. Our initial results, obtained with benchmarks that are artificially highly adversarial in terms of memory bandwidth, show that disaggregated memory access exhibits a round-trip latency of only 134 clock cycles and a throughput penalty as low as 55% relative to locally attached memory. We also discuss estimations as to how our findings may translate to applications with pragmatically milder memory aggressiveness levels, as well as innovation avenues across the stack opened up by our work.
date: 2018-04-23
date_type: published
publisher: IEEE
official_url: https://doi.org/10.1109/SAMOS.2017.8344644
oa_status: green
full_text_type: other
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 1573812
doi: 10.1109/SAMOS.2017.8344644
isbn_13: 9781538634370
lyricists_name: Zervas, Georgios
lyricists_id: GZERV41
actors_name: Zervas, Georgios
actors_id: GZERV41
actors_role: owner
full_text_status: public
publication: Proceedings - 2017 17th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS 2017
volume: 2017
place_of_pub: Pythagorion, Greece
pagerange: 300-307
event_title: 2017 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)
event_location: Pythagorion, Greece
event_dates: 17-20 July 2017
book_title: 2017 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)
citation: Syrivelis, D; Reale, A; Katrinis, K; Syrigos, I; Bielski, M; Theodoropoulos, D; Pnevmatikatos, DN; Zervas, G; (2018) A software-defined architecture and prototype for disaggregated memory rack scale systems. In: 2017 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS). (pp. 300-307). IEEE: Pythagorion, Greece. Green open access
document_url: https://discovery-pp.ucl.ac.uk/id/eprint/10069632/1/IBM_samos2017.pdf