Scheduling the I/O of AI applications with a focus on medical imaging
Invited Talk, 16th Scheduling for large-scale systems workshop, Knoxville, Tennessee
My talk focuses on ways to build and automate AI workflows by separating the data and computational planes and attaching tasks to data, offloading the I/O management to specialized processes. Experiments on a whole slide image processing workflow for cancer research show that the framework can leverage data access and pre-processing patterns to fetch the required input data more efficiently, in the needed format and in the order required by each application within the workflow.
Abstract:
Scientists are moving toward the creation of complex autonomous
workflows that rely on machine learning for various tasks.
Machine learning has been widely used for biomedical tasks and has
achieved remarkable success in many medical imaging applications.
However, biomedical imaging presents unique challenges that bring
a shift in the computational and I/O patterns expected by HPC systems.
The experimental nature of the studies, together with the wide variety
of sample types and modalities used for training (which in turn requires
a wide variety of AI methods), frequently leads to sub-optimal
performance when running on HPC systems. This talk looks at a new
paradigm for building and automating AI workflows by moving
tasks to data. The prototype framework we propose separates the
data and computational planes and attaches tasks to data, offloading
the I/O management to specialized processes. Our experiments with
a whole slide image processing workflow for cancer research show
that the framework can leverage data access and pre-processing
patterns to fetch the required input data more efficiently, in the
needed format and in the order required by each application
within the workflow. Results on the Summit supercomputer
show an I/O performance increase of over 10x for a single
application run and a system-wide throughput increase of over 2x
when running multiple applications concurrently.
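
As a rough illustration of the idea of separating the data and computational planes, the sketch below has a dedicated I/O worker fetch and pre-process inputs in the order a task will consume them, while the task itself issues no I/O. It is only a minimal sketch and not the framework presented in the talk: the names fetch_tile, to_needed_format, io_worker and run_task are hypothetical, the tile contents are dummy data, and a thread stands in for the specialized I/O processes.

```python
import queue
import threading

def fetch_tile(tile_id):
    """Hypothetical stand-in for reading one image tile from storage."""
    return {"id": tile_id, "pixels": [tile_id] * 4}

def to_needed_format(tile):
    """Hypothetical pre-processing step performed on the data plane."""
    return (tile["id"], sum(tile["pixels"]))

def io_worker(access_order, out_queue):
    """Data plane: fetch and pre-process tiles in the order the task needs them."""
    for tile_id in access_order:
        out_queue.put(to_needed_format(fetch_tile(tile_id)))
    out_queue.put(None)  # sentinel: no more data

def run_task(access_order, prefetch_depth=2):
    """Compute plane: consume ready-to-use tiles without issuing any I/O."""
    ready = queue.Queue(maxsize=prefetch_depth)  # bounded buffer caps memory use
    worker = threading.Thread(target=io_worker, args=(access_order, ready))
    worker.start()
    results = []
    while True:
        item = ready.get()
        if item is None:
            break
        results.append(item)  # a real task would run its computation here
    worker.join()
    return results
```

Calling run_task([3, 1, 2]) returns the tiles in exactly the requested order, already converted, which is the property that lets the I/O side prefetch ahead of the computation.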
Link to the event: https://icl.utk.edu/workshops/scheduling23
Access my slides here
Related paper:
Profiles of upcoming HPC Applications and their Impact on Reservation Strategies
A Gainaru, B Goglin, V Honoré, G Pallez
IEEE Transactions on Parallel and Distributed Systems 32 (5), 1178-1190
DOI: 10.1109/TPDS.2020.3039728