For a complete list of publications, see this PDF or visit my Google Scholar or DBLP pages.

Latest publications

Selective Protection for Sparse Iterative Solvers to Reduce the Resilience Overhead
Hongyang Sun, Ana Gainaru, Manu Shantharam and Padma Raghavan. [IEEE 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2020] (Presentation: video)

Reservation and Checkpointing Strategies for Stochastic Jobs
Ana Gainaru, Brice Goglin, Valentin Honoré, Guillaume Pallez, Padma Raghavan, Yves Robert, Hongyang Sun. [IPDPS 2020] (Paper: INRIA technical report)

Making Speculative Scheduling Robust to Incomplete Data
Ana Gainaru, Guillaume Pallez. [SCALA@SC 2019] (Paper: INRIA technical report)
Code used for this paper here

Speculative Scheduling Techniques for Stochastic HPC Applications
Ana Gainaru, Guillaume Pallez (Aupy), Hongyang Sun, Padma Raghavan [ICPP 2019]

I/O scheduling strategy for periodic applications
Guillaume Aupy, Ana Gainaru, Valentin Le Fèvre [ACM Transactions on Parallel Computing 2019]

On-the-fly scheduling vs. reservation-based scheduling for unpredictable workflows
Ana Gainaru, Hongyang Sun, Guillaume Aupy, Yuankai Huo, Bennett A. Landman, Padma Raghavan [Special Issue of the IJHPCA 2019]

Reservation Strategies for Stochastic Jobs
Guillaume Aupy, Ana Gainaru, Valentin Honoré, Padma Raghavan, Yves Robert, Hongyang Sun [IPDPS 2019]

Older selected publications

Using InfiniBand Hardware Gather-Scatter Capabilities to Optimize MPI All-to-All
Richard Graham, Ana Gainaru, Artem Polyakov and Gilad Shainer [EuroMPI 2016]

Reducing Waste in Large Scale Systems through Introspective Analysis
Leonardo Bautista Gomez, Ana Gainaru, Swann Perarnau, Franck Cappello, Marc Snir, William Kramer [IPDPS 2016]

Scheduling the I/O of HPC applications under congestion
Ana Gainaru, Guillaume Aupy, Anne Benoit, Franck Cappello, Yves Robert, Marc Snir [IPDPS 2015]

Fault prediction under the microscope: A closer look into HPC systems
Ana Gainaru, Franck Cappello, Marc Snir, William Kramer [SC 2012]

Modeling and Tolerating Heterogeneous Failures in Large Parallel Systems
Eric Heien, Derrick Kondo, Ana Gainaru, Dan LaPine, Bill Kramer, Franck Cappello [SC 2011]