Paper accepted at the Supercomputing Conference (SC22)

Our paper "A Taxonomy of Error Sources in HPC I/O Machine Learning Models" (https://arxiv.org/abs/2204.08180) has been accepted for presentation at SC22 (https://sc22.supercomputing.org/) and publication in the proceedings. This is joint work with Michel Kinsy’s group (https://stamcenter.asu.edu/people/) at Arizona State University and was led by Mihailo Isakov. We propose a taxonomy of five categories of I/O modeling errors: poor application modeling, poor system modeling, inadequate dataset coverage, I/O contention, and I/O noise. We develop litmus tests to quantify each category, allowing researchers to narrow down failure modes, enhance I/O throughput models, and improve future generations of HPC logging and analysis tools.

Sandeep Madireddy
Computer Scientist