Friday, 18 August 2023, 09:00
Pattern-based Abstractions for Parallel Programs
- Julian Miller, M.Sc. - Lehrstuhl Informatik 12 (Chair of Computer Science 12)
- Location: IT Center, Kopernikusstraße 6, Seminar Room 001
Computational demands in science and engineering are rising quickly with the increasing complexity of simulations and the availability of extensive data. This demand is met with large clusters of computers and specialized hardware accelerators. However, programming such systems is challenging and time-consuming, and massively concurrent code is error-prone. Software developers must derive a well-scaling solution while preserving correctness. These development challenges are aggravated by the rapidly evolving hardware landscape of high-performance computing (HPC).
This work investigates the key challenges in developing highly productive and performant parallel programs. The analysis is based on extensive human-subject studies with a diverse set of parallel programs and programmers. It uncovers quantitative, measurable productivity metrics, the main impact factors for developing parallel programs efficiently, and cost estimation methods for software development. Based on this analysis, an abstract model of parallel algorithms is proposed to mitigate these challenges. The model is based on a strict separation between the algorithmic structure of a program and its executed functions. Decomposing parallel programs into a hierarchical structure of parallel patterns reveals rich, high-level optimization potential.
A static performance model as well as optimization and scheduling algorithms are introduced to leverage this optimization potential. A proof-of-concept development pipeline is proposed that exposes this pattern-based programming approach to software developers. First, parallel programs may be specified in the proposed Parallel Pattern Language (PPL), which closely follows the mathematical definition of parallel algorithms. Alternatively, existing codes can be translated into the proposed hierarchical pattern structure with pattern-detection methods. Second, the hierarchical pattern structure is extracted and global transformations are applied to minimize the overall runtime for a target hardware architecture. Third, the optimized code and its scheduling are generated in a source-to-source fashion for heterogeneous systems with shared and distributed memory and accelerators. The proposed approach and the proof-of-concept implementation are evaluated on real-world parallel algorithms.
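The separation between algorithmic structure and executed functions described in the abstract can be illustrated with a small sketch. This is not the Parallel Pattern Language and not the implementation presented in the talk; the node types `Map`, `Reduce`, and `Pipeline` and the helpers `run_sequential` and `describe` are hypothetical names chosen for illustration only. The sketch shows a pattern tree that can be inspected without looking at the user functions it carries, which is the property a tool would need in order to transform or schedule the structure.

```python
from dataclasses import dataclass
from functools import reduce as fold
from typing import Any, Callable, List, Sequence, Union


@dataclass(frozen=True)
class Map:
    """Hypothetical pattern node: apply `fn` independently to every element."""
    fn: Callable[[Any], Any]


@dataclass(frozen=True)
class Reduce:
    """Hypothetical pattern node: combine all elements with `op`."""
    op: Callable[[Any, Any], Any]
    init: Any


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical hierarchical node: run child patterns one after another."""
    stages: Sequence[Union[Map, Reduce, "Pipeline"]]


Pattern = Union[Map, Reduce, Pipeline]


def run_sequential(pattern: Pattern, data: Any) -> Any:
    """Reference interpreter; a real backend could parallelize Map and Reduce."""
    if isinstance(pattern, Map):
        return [pattern.fn(x) for x in data]
    if isinstance(pattern, Reduce):
        return fold(pattern.op, data, pattern.init)
    if isinstance(pattern, Pipeline):
        for stage in pattern.stages:
            data = run_sequential(stage, data)
        return data
    raise TypeError(f"unknown pattern node: {pattern!r}")


def describe(pattern: Pattern, depth: int = 0) -> List[str]:
    """List the pattern hierarchy without inspecting the user functions."""
    lines = ["  " * depth + type(pattern).__name__]
    if isinstance(pattern, Pipeline):
        for stage in pattern.stages:
            lines.extend(describe(stage, depth + 1))
    return lines


# Structure (Pipeline of Map then Reduce) is separate from the functions used.
sum_of_squares = Pipeline([
    Map(lambda x: x * x),
    Reduce(lambda a, b: a + b, 0),
])

print(run_sequential(sum_of_squares, [1, 2, 3, 4]))  # prints 30
print("\n".join(describe(sum_of_squares)))
```

Because `describe` only walks the node types, the same tree could in principle be handed to a cost model or a code generator without executing any user code; the sequential interpreter here merely defines the expected result.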
The lecturers of the Department of Computer Science invite you to attend.