Computer Science Graduate Seminar

Friday, December 01, 2023, 9:00am

 Reactive Runtimes for Parallel Programming Models on Shared, Distributed, and Heterogeneous Memory Systems

  • Jannis Klinkenberg, M.Sc. - Chair of Computer Science 12
  • Place: IT Center, Kopernikusstraße 6, seminar room 003


Abstract

The persistent drive to enhance the scale and precision of scientific simulations over the past decades has created an ever-growing demand for computational resources. This, in turn, has catalyzed several innovations in the architectural design and manufacturing of computer systems, such as the introduction of multi-core processors and multi-socket shared memory systems, which are typically operated in High Performance Computing (HPC) installations. Historically, scientific applications have primarily been developed under the assumption of a uniform execution environment, in which every compute node and core within an installation operates at a consistent, unchanging speed and execution times can be accurately predicted. In recent decades, however, both hardware and software have become increasingly complex and often exhibit dynamic execution behavior, causing performance fluctuations and runtime variability. Consequently, a priori load balancing within and between compute nodes becomes increasingly challenging in such environments. Applications and runtime systems therefore require new techniques to dynamically react to changing execution conditions and balance the load efficiently.

This thesis presents reactive concepts, runtime implementations, and runtime extensions designed to address the growing complexity of both hardware and software. The objective is to provide portable, vendor-independent solutions that improve application performance, minimize runtime variability, and mitigate impending load imbalances. Locality-aware task scheduling extensions in OpenMP improve data locality on contemporary shared memory NUMA architectures by dynamically identifying physical data locations and reactively adjusting task distribution and scheduling. Further, continuous performance introspection combined with reactively migrating or replicating tasks in distributed memory can efficiently detect and mitigate emerging imbalances at execution time. Lastly, abstracting regular memory allocation allows applications to specify additional requirements or hints about how data is used throughout execution, which can be exploited to dynamically guide data placement on systems with heterogeneous memory. Systematic evaluations demonstrate the effectiveness of the presented concepts and show that placing the burden on runtime systems to proficiently handle these low-level aspects is crucial for achieving high performance on current and future architectures.


The computer science lecturers invite all interested parties to attend.