Publication Date

2021

Document Type

Dissertation/Thesis

First Advisor

Papka, Michael E.

Degree Name

M.S. (Master of Science)

Legacy Department

Department of Computer Science

Abstract

High-performance computing (HPC) resources at facilities such as Argonne National Laboratory's Leadership Computing Facility (ALCF) enable a wide array of scientific experiments and research applications. In day-to-day operation, these platforms collect copious amounts of system, performance, and debugging logs, capturing data about how jobs, individual tasks, and the system as a whole operate and perform. This thesis builds on previous efforts to examine how these logs can be used to better understand user and application behavior and system resource usage, and it demonstrates machine-learning-based (ML-based) techniques for characterizing applications and predicting job behavior using log data. Five datasets collected from the operation of two supercomputers at the ALCF from 2014 to 2020 were used for the analysis.

We first demonstrate that the usage of ALCF supercomputers is consistent, repetitive, and patterned, suggesting that it is suitable for training ML models. We next investigate the usage of workflow-based HPC jobs compared to "traditional" single-task HPC jobs, as well as the utilization of solid-state drive (SSD)-based cache drives on ALCF's Theta supercomputer. From these analyses, we enumerate potentially advantageous changes and adaptations the ALCF and other facilities might consider in current and future systems. We also show that hardware performance counters provide a viable alternative for application identity verification and resource-intensiveness classification using ML-based approaches, accomplishing near-parity in testing accuracy without the overhead and coverage constraints faced by prior log-based approaches. Finally, we investigate methods to improve an ML-based technique for application runtime estimation, with implications for job scheduling on HPC systems.

Extent

49 pages

Language

eng

Publisher

Northern Illinois University

Rights Statement

In Copyright

Rights Statement 2

NIU theses are protected by copyright. They may be viewed from Huskie Commons for any purpose, but reproduction or distribution in any format is prohibited without the written permission of the authors.

Media Type

Text
