1. Experience Porting a Scientific Code from YAKL to Kokkos - James Foucar, Sandia National Labs (10 minutes)
The DOE climate code E3SM recently ported RRTMGP, a medium-sized scientific code that computes radiative fluxes in planetary atmospheres, from a kernel launcher called YAKL to Kokkos. We'd like to share tips and pain points from this effort, particularly the struggle to reach performance parity with YAKL. We found that a 1:1 port (the YAKL API is very similar to Kokkos) was not nearly sufficient to achieve good performance. The main issues were how to allocate temporary views and how to deal with MDRangePolicy.
2. Benchmarking Lattice QCD Staggered Fermion Kernel Written in Kokkos - Simon Schlepphorst, Forschungszentrum Juelich GmbH (10 minutes)
Lattice quantum chromodynamics (QCD) is a numerical approach to studying the interactions of quarks and gluons, where the fundamental equations governing their interactions are discretized on a four-dimensional spacetime lattice. One of the most costly computations is the inversion of the lattice Dirac operator, a large sparse matrix. Calculating this inversion with iterative solvers requires many applications of that operator. This study builds on previous work where we implemented the staggered fermion Dirac operator as a benchmark in Kokkos. We investigate the effects of the tiling size in combination with the use of a 4D MDRangePolicy and 7D Views.
3. Leveraging Liaisons in Your Network for Software Sustainability - Elaine M. Raybourn, Sandia National Laboratories (10 minutes)
Open source software project sustainability is a sociotechnical endeavor that often extends beyond the efforts of individual projects. HPSF and the Linux Foundation offer rich resources of expertise across communities in industry, academia, and agencies. Leveraging this collective knowledge and experience is vital to enhance project practices, especially in early identification of challenges and potential issues. This lightning talk explores the value of leveraging liaisons, key individuals who actively participate in cross-team networks, to accelerate project sustainability. Liaisons can bridge gaps, share tacit knowledge, and incentivize collaborative efforts across communities to assist in breaking down silos. The value of leveraging liaisons was identified during the DOE Exascale Computing Project to foster strategic project alignment and outreach. Whether in a small team or a larger network of teams of teams, identifying liaisons early on can foster trust and transparency both within and across teams.
4. Vertex-CFD: A Multi-Physics Solver for Fusion Applications - Marc Olivier Delchini & Daniel Arndt, Oak Ridge National Laboratory (10 minutes)
In this talk we will introduce Vertex-CFD, a multiphysics solver being developed in response to Oak Ridge National Laboratory's (ORNL) need for accurate simulation software to model a fusion blanket problem. Vertex-CFD is built upon the Trilinos and Kokkos libraries for compatibility with CPU and GPU platforms. It is designed to generate high-fidelity solutions of multiphysics problems in complex geometries by leveraging state-of-the-art computing methods and technologies. We will describe how we leverage Kokkos and Trilinos to solve the governing equations by employing a finite element method and high-order implicit temporal integrators.
5. Toucan: Revolutionizing Microstructure Prediction - Benjamin Stump, ORNL (10 minutes)
I will describe my code: what it does physically, what I need it to do computationally, and how I achieved that using Kokkos and optimized it algorithmically.
6. Performance-Portable Spectral Ewald Summation with PyKokkos - Gabriel K Kosmacher, Oden Institute, The University of Texas at Austin (10 minutes)
We present a performance-portable implementation of the Spectral Ewald method, employing shared memory and streaming parallelism to rapidly evaluate periodic two-body potentials in Stokes flow. The method splits dense particle evaluation into near-field and far-field components, where the near-field is singular and the far-field decays rapidly in Fourier space. Far-field interactions resemble a Nonuniform Fast Fourier Transform: source potentials are interpolated onto a uniform grid (p2g), an ndFFT is applied, Fourier potentials are scaled, an ndIFFT is applied, and the potentials are interpolated back (g2p). The p2g, g2p, and near-field (p2p) interactions use Kokkos hierarchical parallelism with scratch-pad memory and thread-vector range reductions.
7. Empowering NSM Supercomputers with Kokkos for Scalable HPC - Harsha Ugave & Samir Shaikh, Centre for Development of Advanced Computing (C-DAC) (10 minutes)
Kokkos is transforming how high-performance applications run on National Supercomputing Mission (NSM) systems. With NSM deploying a mix of CPUs, GPUs, and other accelerators, ensuring software runs efficiently across all these platforms can be challenging. Kokkos simplifies this by providing a single, flexible programming model that adapts to different hardware without requiring major code changes. It supports multiple backends like CUDA, HIP, SYCL, and OpenMP, making it easier for developers to write performance-portable applications. For NSM’s large-scale supercomputers, Kokkos ensures better performance and scalability, allowing applications to make full use of processors, GPUs, and memory hierarchies. It also optimizes energy efficiency by improving memory access and reducing unnecessary data movement, helping to make supercomputing more sustainable. Since Kokkos is open-source and backed by an active community, it keeps up with emerging technologies, ensuring seamless adoption of next-generation NSM systems and preparing them for the future of exascale computing.
8. Real-Time Performance Characterization of the ADIOS2 Library When Kokkos Is Enabled - Ana Gainaru, Oak Ridge National Laboratory (10 minutes)
Modern performance analysis tools are increasingly capable of capturing a high volume of metrics at ever-finer granularity. This abundance of information presents an opportunity to move beyond post-mortem analysis and leverage data streaming for real-time performance monitoring and decision-making. By streaming performance data, applications can provide immediate feedback, enabling dynamic adjustments and optimizations during execution. Furthermore, this streamed data can be directed to individual scientist workstations, facilitating on-the-fly health checks and user-driven interventions to steer the application's behavior. We will demonstrate the practical application of these concepts within the ADIOS2 library, showcasing how data streaming enables detailed monitoring and analysis of an HPC application during large-scale runs.
9. Cabana: Particles, Structured Grids, and Extensions to Unstructured with Kokkos - Sam Reeve, ORNL (10 minutes)
We discuss updates to Cabana, a Kokkos+MPI library for building particle applications. Cabana was created through the U.S. Department of Energy Exascale Computing Project to enable particle simulation across methods on current and future exascale supercomputers. Cabana includes particle and structured grid parallelism, data structures, algorithms, communication, and interfaces to additional libraries, all extending and working alongside Kokkos. We focus in particular on recent efforts to integrate Cabana particles within Trilinos unstructured grids for broader support of scientific applications. We will highlight further recent Cabana development, performance and portability, and application-level demonstrations.