During MAX, a number of open-source, domain-specific libraries for materials science have been developed. These libraries, specific to the materials modelling domain, contain functions that perform basic quantum chemistry and materials science operations: computation of Hamiltonian eigenstates, charge density and potential, and atomic forces; Poisson solvers; large-matrix iterative solvers for eigenvalue problems and linear systems; non-linear optimisation solvers at various levels of generality; symmetry and crystal-structure management; and so on. In parallel, mathematical and system libraries have been implemented, which perform general-purpose system tasks (e.g. memory, communication, and mass-storage management) or mathematical tasks (typically, but not limited to, linear algebra and signal analysis such as FFT or wavelet analysis). Appropriate application programming interfaces (APIs) to access the domain-specific libraries have also been developed.

CheSS - LAXlib & FFTXlib - SIRIUS

CheSS

One of the most important tasks in electronic structure codes is the calculation of the density matrix. If not handled properly, this task can easily become a bottleneck that limits the performance of the code or even renders large calculations prohibitively expensive.

CheSS is a library designed with the goal of enabling electronic structure calculations for very large systems. It is capable of working with sparse matrices, which arise naturally when large systems are treated with a localized basis. This makes it possible to calculate the density matrix with O(N) scaling, i.e., the computational cost increases only linearly with the system size.

The CheSS solver uses an expansion based on Chebyshev polynomials to calculate matrix functions (such as the density matrix or the inverse of the overlap matrix), thereby exploiting the sparsity of the input and output matrices. It works best for systems with a finite HOMO-LUMO gap and a small spectral width. CheSS features a two-level parallelization using MPI and OpenMP and can scale to many thousands of cores. It was extracted from the original code base of BigDFT and turned into a stand-alone library. At the moment, it is coupled to two MAX flagship codes, BigDFT and SIESTA.
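
To illustrate the basic idea behind the solver, here is a minimal, dense NumPy sketch of a Chebyshev expansion of a matrix function (a smeared Fermi function playing the role of the density matrix). It is only a toy model of the approach and not CheSS itself, which is written in Fortran, operates on sparse matrices and exploits the sparsity of input and output; all names, sizes and parameters below are illustrative.

    import numpy as np
    from numpy.polynomial import chebyshev as C

    def chebyshev_matrix_function(H, f, eps_min, eps_max, degree=300):
        """Approximate f(H) for a symmetric H whose spectrum lies in [eps_min, eps_max]."""
        n = H.shape[0]
        # linear map sending [eps_min, eps_max] onto [-1, 1], where the T_k are defined
        a = 2.0 / (eps_max - eps_min)
        b = -(eps_max + eps_min) / (eps_max - eps_min)
        Hs = a * H + b * np.eye(n)
        # Chebyshev coefficients of the scalar function f, expressed on [-1, 1]
        c = C.chebinterpolate(lambda x: f((x - b) / a), degree)
        # three-term recurrence: T_0 = I, T_1 = Hs, T_{k+1} = 2 Hs T_k - T_{k-1}
        T_prev, T_curr = np.eye(n), Hs
        F = c[0] * T_prev + c[1] * T_curr
        for k in range(2, degree + 1):
            T_prev, T_curr = T_curr, 2.0 * Hs @ T_curr - T_prev
            F += c[k] * T_curr
        return F

    # toy "density matrix": Fermi function of a random symmetric Hamiltonian
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 20))
    H = 0.5 * (A + A.T)
    eps = np.linalg.eigvalsh(H)
    mu, kT = 0.0, 0.2                                    # chemical potential and smearing
    fermi = lambda e: 1.0 / (1.0 + np.exp((e - mu) / kT))
    D = chebyshev_matrix_function(H, fermi, eps.min(), eps.max())

    # cross-check against the result of an explicit diagonalization
    w, V = np.linalg.eigh(H)
    assert np.allclose(D, (V * fermi(w)) @ V.T, atol=1e-6)

The library exploits the fact that, with a localized basis, all the matrices involved are sparse, which is what makes the O(N) behaviour described above possible.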

The performance of CheSS has been benchmarked against PEXSI and (Sca)LAPACK for the calculation of the density matrix and the inverse of the overlap matrix, respectively. In these benchmarks CheSS proved to be the most efficient method, as demonstrated in more detail, with performance figures, in the publication "Efficient Computation of Sparse Matrix Functions for Large-Scale Electronic Structure Calculations: The CheSS Library".

Download 

CheSS Launchpad

LAXlib & FFTXlib

One of the most important obstacles to keeping the codes up to date with hardware is a programming style based on old, non-object-oriented language features. Programming styles in community codes are often naive and lack modularity and flexibility. Disentangling such codes is therefore essential for implementing new features, or simply for refactoring the application so that it runs efficiently on new architectures. Rewriting one of these codes from scratch is not an option, because the communities behind them would be disrupted. One possible approach that allows the code to evolve is to progressively encapsulate functions and subroutines, breaking the main application into small (possibly weakly dependent) parts.

This strategy was followed by Quantum ESPRESSO: two main types of kernels were isolated in independent directories and proposed as candidate domain-specific libraries for third-party applications.

The first library, called LAXlib, contains all the low-level linear algebra routines of Quantum ESPRESSO, and in particular those used by the Davidson solver (e.g., the Cannon algorithm for the matrix-matrix product). LAXlib also contains a mini-app that makes it possible to evaluate the characteristics of an HPC interconnect by benchmarking the linear algebra routines contained therein.
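
As a reminder of how that algorithm works, the sketch below simulates Cannon's block-shifting scheme for the matrix-matrix product on a p x p process grid, serially and in NumPy. It is not LAXlib code, only an illustration of the algorithm in serial form; in a parallel implementation each block lives on a different MPI rank and the shifts become communications. All names and sizes are illustrative.

    import numpy as np

    def cannon_matmul(A, B, p):
        """Multiply A and B by emulating Cannon's algorithm on a p x p block grid."""
        n = A.shape[0]
        assert A.shape == B.shape == (n, n) and n % p == 0
        nb = n // p
        # split the matrices into p x p blocks; block (i, j) would live on rank (i, j)
        Ab = [[A[i*nb:(i+1)*nb, j*nb:(j+1)*nb] for j in range(p)] for i in range(p)]
        Bb = [[B[i*nb:(i+1)*nb, j*nb:(j+1)*nb] for j in range(p)] for i in range(p)]
        Cb = [[np.zeros((nb, nb)) for _ in range(p)] for _ in range(p)]
        # initial alignment: shift row i of A left by i, column j of B up by j
        Ab = [[Ab[i][(j + i) % p] for j in range(p)] for i in range(p)]
        Bb = [[Bb[(i + j) % p][j] for j in range(p)] for i in range(p)]
        for _ in range(p):
            # local multiply-accumulate on every "rank"
            for i in range(p):
                for j in range(p):
                    Cb[i][j] = Cb[i][j] + Ab[i][j] @ Bb[i][j]
            # shift the A blocks one step to the left and the B blocks one step up
            Ab = [[Ab[i][(j + 1) % p] for j in range(p)] for i in range(p)]
            Bb = [[Bb[(i + 1) % p][j] for j in range(p)] for i in range(p)]
        return np.block(Cb)

    # quick check against the plain NumPy product
    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 6))
    B = rng.standard_normal((6, 6))
    assert np.allclose(cannon_matmul(A, B, p=3), A @ B)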

The second library, FFTXlib, encapsulates all the FFT-related functions, including the drivers for several different architectures. The library is self-contained and can be built without any dependency on the rest of the Quantum ESPRESSO suite. FFTXlib also provides a mini-app that mimics the FFT cycle of the SCF calculation of the charge density, allowing the parallelization parameters of Quantum ESPRESSO to be tuned. This mini-app has also been used to test the new implementation based on MPI-3 non-blocking collectives.
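
The logic of that cycle can be illustrated with a short serial NumPy sketch: back-transform each band from reciprocal to real space, accumulate the charge density, and (for the Hamiltonian application) multiply by a local potential in real space before transforming back. This is only an illustration with random data on a dense grid; the grid size, number of bands and potential are placeholders, and the real FFTXlib routines are parallel Fortran kernels working on distributed grids.

    import numpy as np

    nr = (16, 16, 16)          # FFT grid (illustrative size)
    nbnd = 4                   # number of occupied bands (illustrative)
    rng = np.random.default_rng(0)

    # stand-in plane-wave coefficients psi(G), already placed on the full 3D grid
    psi_g = rng.standard_normal((nbnd, *nr)) + 1j * rng.standard_normal((nbnd, *nr))
    occ = np.full(nbnd, 2.0)   # band occupations

    # charge density: backward FFT of each band to real space, then sum of |psi(r)|^2
    rho = np.zeros(nr)
    for ib in range(nbnd):
        psi_r = np.fft.ifftn(psi_g[ib])          # G -> r
        rho += occ[ib] * np.abs(psi_r) ** 2

    # the complementary step of the cycle: apply a local potential in real space
    # and transform the product back to reciprocal space (r -> G)
    v_loc = rng.standard_normal(nr)
    vpsi_g = np.fft.fftn(v_loc * np.fft.ifftn(psi_g[0]))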

Download

LAXlib GitHub

FFTXlib GitHub

FFTXlib mini-app

SIRIUS

CSCS is working on the initial implementation of SIRIUS, a domain-specific library for electronic structure calculations that encapsulates the full-potential linearized augmented plane-wave (FP-LAPW) and pseudopotential plane-wave (PP-PW) DFT ground-state “engines”. The library is open source (BSD 2-clause licence) and freely available. It is written in C++ using the MPI+OpenMP+CUDA programming model. To demonstrate the usage of the SIRIUS library, a fork of the Quantum ESPRESSO code was created and modified to work with the SIRIUS API. In the current QE+SIRIUS implementation, the generation of the effective potential and the density mixing are done by QE, while the rest (band diagonalization, density summation, charge augmentation, and symmetrization) is done by SIRIUS.

By closely analysing the MAX flagship codes, one can identify the following common compute-intensive kernels:

  • 3D FFT
  • Inner product of the wave-functions
  • Transformation of the wave-functions
  • Orthogonalization of the wave-functions

These kernels have been the focus of the initial implementation and testing phase within the SIRIUS framework, while recent developments have focused on the refactoring of the SIRIUS library and on the creation of an independent sub-library containing the aforementioned kernels. This sub-library, called the Slab Data Distribution Kit (SDDK), is designed to work with wave-functions distributed in “slabs”, where the G+k vector index is partitioned among MPI ranks while the band index is kept whole on each rank. The sub-library can be compiled and used independently of the main SIRIUS library.
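
The consequences of this layout for the kernels listed above can be seen in a small serial NumPy sketch, in which a list of arrays stands in for the slabs owned by the individual MPI ranks and an explicit sum over partial results plays the role of the MPI reduction. This is an illustration of the data distribution only, not the SDDK API, and all names and sizes are invented: the inner product and the orthogonalization need a global reduction over the distributed G+k index, while the transformation of the wave-functions is purely local.

    import numpy as np

    n_gk, n_bands, n_ranks = 120, 8, 4          # illustrative sizes
    rng = np.random.default_rng(0)
    psi = rng.standard_normal((n_gk, n_bands)) + 1j * rng.standard_normal((n_gk, n_bands))
    slabs = np.array_split(psi, n_ranks, axis=0)     # each "rank" owns a slab of G+k rows

    # 1) inner product of the wave-functions: local partial sums + global reduction
    overlap = sum(s.conj().T @ s for s in slabs)     # the sum mimics an MPI_Allreduce

    # 2) transformation of the wave-functions, psi' = psi @ U: no communication needed
    U = rng.standard_normal((n_bands, n_bands))
    slabs_t = [s @ U for s in slabs]

    # 3) orthogonalization: Cholesky-factorize the overlap, then psi <- psi @ inv(L^H)
    L = np.linalg.cholesky(overlap)
    slabs_o = [np.linalg.solve(L.conj(), s.T).T for s in slabs]

    # the orthogonalized wave-functions now have an identity overlap matrix
    check = sum(s.conj().T @ s for s in slabs_o)
    assert np.allclose(check, np.eye(n_bands))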

Download

SIRIUS GitHub

SIRIUS API

SDDK GitHub