FLEUR (Full-potential Linearised augmented plane wave in EURope) is a code family for calculating ground-state as well as excited-state properties of solids within the context of density functional theory (DFT). A key difference from the other MaX codes, and indeed from most other DFT codes, is that all electrons are treated on the same footing. This makes it possible to calculate the core states as well and to investigate effects in which these states change.
FLEUR is based on the full-potential linearised augmented plane wave method, a well-established scheme often considered to provide the most accurate DFT results and used as a reference for other methods. The FLEUR family consists of several codes and modules. The first is a versatile DFT code for the ground-state properties of multicomponent magnetic one-, two- and three-dimensional solids. Its focus is on non-collinear magnetism, the determination of exchange parameters, and spin-orbit related properties (topological and Chern insulators, Rashba and Dresselhaus effects, magnetic anisotropies, Dzyaloshinskii-Moriya interaction).
The SPEX code implements many-body perturbation theory (MBPT) for the calculation of the electronic excitation properties of solids. It includes different levels of GW approaches to calculate the electronic self-energy, including a relativistic quasiparticle self-consistent GW approach. The experimental KKRnano code, designed for highest parallel scaling, makes it possible to utilize current supercomputers to their full extent and is applicable to dense-packed crystals.
FLEUR is distributed freely under the MIT license and has a growing user community. Currently, about 3,000 users are registered on the FLEUR webpage. The source code is available as a git repository, where an issue tracking system can be used to report code-related problems.
FLEUR has been parallelised on several levels. The most efficient level, with nearly perfect scalability, is an MPI-enabled distribution of independent k-points. This parallelisation is most useful for periodic systems in which k-space has to be sampled very accurately, as is frequently needed to determine transport or topological properties of solids or spin-orbit related quantities, e.g., the magnetic hardness or magnetic anisotropy. It becomes insufficient for large setups, in which the number of required k-points decreases drastically while the memory requirements of each k-point grow. Hence, a second layer of MPI parallelisation, distributing the construction of the eigenvalue problem, the diagonalisation and the evaluation of the charge density, is used in such cases. A third level of hybrid parallelism using OpenMP has been added to facilitate the efficient use of systems with many compute cores per memory node. This hybrid parallel version enables the efficient calculation of setups comprising several thousand atoms.
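Why k-point parallelism alone stops scaling for large setups can be illustrated with a small sketch. It assumes a simple round-robin assignment of k-points to MPI ranks and equal cost per k-point; the function names are hypothetical and this is not FLEUR's actual distribution scheme.

```python
def distribute_kpoints(n_kpoints, n_ranks):
    """Assign k-point indices to ranks round-robin (hypothetical scheme)."""
    return [list(range(rank, n_kpoints, n_ranks)) for rank in range(n_ranks)]


def parallel_efficiency(n_kpoints, n_ranks):
    """Ideal efficiency, assuming every k-point costs the same and
    nothing but the k-point loop is parallelised."""
    busiest = max(len(kpts) for kpts in distribute_kpoints(n_kpoints, n_ranks))
    return n_kpoints / (busiest * n_ranks)


if __name__ == "__main__":
    # Many k-points: every rank is busy. Few k-points: most ranks idle.
    for n_k, n_r in [(512, 64), (4, 64), (1, 64)]:
        print(f"{n_k:4d} k-points on {n_r} ranks: "
              f"efficiency {parallel_efficiency(n_k, n_r):.4f}")
```

With 512 k-points on 64 ranks every rank receives eight k-points, whereas with four k-points only four ranks have any work, which is the situation the second MPI layer addresses.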
Fig. 1 Scaling of FLEUR for a single iteration and only a single k-point for three different example systems: TiO2 with 1078 atoms (red) and 2156 atoms (green) and SrTiO3 with 3750 atoms (blue). The measurements were taken on a cluster with two Intel Skylake processors and 48 cores per node (CLAIX 2018 at RWTH Aachen University).
FLEUR on GPUs
The two most computationally intensive parts of the code, matrix setup and diagonalization, have been ported to GPUs. An interface to the MAGMA library has been implemented and tested for the diagonalization of the matrix, showing a considerable speedup (3x on 1 GPU vs. 1 CPU node). Extending the GPU version to the remaining parts still poses a significant problem, as these parts contain a large variety of algorithms and hence no simple porting strategy can be applied. Since the MAGMA library does not accept distributed matrices, the current version of the code is only applicable to relatively small unit cells. To overcome this restriction and to exploit GPU-based computational resources efficiently, we are currently investigating the possibility of extending the distributed matrix setup and diagonalization to GPUs. Here the lack of proven libraries for distributed matrix diagonalization on the GPU constitutes the most severe bottleneck, and we are looking intensively for such solutions.
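The restriction to small unit cells can be made concrete with a rough memory estimate. The sketch below assumes that the Hamiltonian and overlap matrices are both stored as full double-precision complex matrices on a single GPU, and it ignores workspace, eigenvectors and any exploitation of Hermitian symmetry; the GPU memory sizes are illustrative assumptions, not measured FLEUR figures.

```python
import math

BYTES_PER_COMPLEX_DOUBLE = 16  # two 8-byte floats per matrix element


def dense_matrix_gib(n_basis, n_matrices=2):
    """Memory in GiB for n_matrices full n_basis x n_basis complex matrices."""
    return n_matrices * BYTES_PER_COMPLEX_DOUBLE * n_basis**2 / 2**30


def max_basis_size(gpu_memory_gib, n_matrices=2):
    """Largest basis size whose matrices fit into the given GPU memory
    (assumption: nothing else needs to be stored on the device)."""
    available_elements = int(gpu_memory_gib * 2**30) // (
        n_matrices * BYTES_PER_COMPLEX_DOUBLE
    )
    return math.isqrt(available_elements)


if __name__ == "__main__":
    for mem in (16, 32):  # hypothetical GPU memory sizes in GiB
        print(f"{mem} GiB GPU: at most {max_basis_size(mem)} basis functions")
```

Because the memory grows quadratically with the basis size, doubling the GPU memory raises the upper bound on the basis size only by a factor of about 1.4, which is why a distributed matrix is needed for large cells.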
FLEUR is a full-featured, freely available FLAPW (full-potential linearized augmented plane wave) code based on density functional theory. The FLAPW method is an all-electron approach within density functional theory that is universally applicable to all atoms of the periodic table and to systems with compact as well as open structures. It is widely considered to be the most precise electronic structure method in solid state physics.
Among other things, FLEUR allows one to calculate:
Although FLEUR calculations can be performed for all kinds of materials, it is especially suited for:
The FLEUR code family is a program package for calculating ground-state as well as excited-state properties of solids. It is based on the full-potential linearized augmented-plane-wave (FLAPW) method. The strength of the FLEUR code lies in applications to bulk, semi-infinite, two- and one-dimensional solids, solids of all chemical elements of the periodic table, solids with complex open structures or low symmetry, complex non-collinear magnetism in combination with spin-orbit interaction, external electric fields, and the treatment of spin-dependent transport properties. It is an all-electron method and thus treats core and valence electrons and can deal with hyperfine properties. The inclusion of local orbitals allows a systematic extension of the LAPW basis that enables a precise treatment of semicore and unoccupied states. A large variety of local and semi-local (GGA) exchange and correlation functionals is implemented, including the LDA+U approach.
FLEUR is an open-source code distributed under the MIT Licence. The source code can be downloaded from its homepage or from a GitLab service. FLEUR requires a Fortran and a C compiler and, as a minimum, a BLAS/LAPACK and an XML2 library. The code benefits massively from highly optimized versions of the linear algebra libraries. In addition, the code can use libraries such as MPI, ScaLAPACK, HDF5, LibXC, ELPA, Elemental and MAGMA for parallel calculations, structured IO or advanced functionality.
Two parallelization paradigms are currently implemented in FLEUR: a shared-memory OpenMP parallelism and a distributed-memory MPI parallelism. The most basic and most efficient parallelisation distributes the different k-points over MPI processes. This leads to close to perfect scalability and load balancing. The second and third levels of parallelism consist of an additional MPI distribution of the remaining tasks, especially of the eigenvalue problem, and an OpenMP parallelization that enables efficient use of multi-core nodes. For larger systems these levels of parallelism are more important, as the number of k-points will decrease and the memory requirements for a single k-point will increase.
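The combination of the first two MPI levels can be pictured as splitting the ranks into groups, one group per set of k-points, with the ranks inside a group sharing the eigenvalue problem. The following is a minimal sketch under that assumption; the grouping rule (largest divisor of the rank count that does not exceed the number of k-points) and all function names are hypothetical and do not reproduce FLEUR's internal logic.

```python
def split_ranks(n_ranks, n_kpoints):
    """Choose (number of k-point groups, ranks per group).

    Hypothetical rule: use the largest divisor of n_ranks that does not
    exceed n_kpoints, so that every group owns at least one k-point.
    """
    n_groups = max(
        d for d in range(1, n_ranks + 1)
        if n_ranks % d == 0 and d <= n_kpoints
    )
    return n_groups, n_ranks // n_groups


def layout(n_ranks, n_kpoints):
    """Return (rank, group, rank within group, k-points of the group)."""
    n_groups, per_group = split_ranks(n_ranks, n_kpoints)
    table = []
    for rank in range(n_ranks):
        group = rank // per_group   # would be the colour in a communicator split
        local = rank % per_group    # position inside the k-point group
        kpts = list(range(group, n_kpoints, n_groups))
        table.append((rank, group, local, kpts))
    return table


if __name__ == "__main__":
    print(split_ranks(64, 512))  # many k-points: one rank per group
    print(split_ranks(64, 4))    # few k-points: 16 ranks share each one
    for row in layout(4, 2):
        print(row)
```

In this picture, a calculation with many k-points degenerates to pure k-point parallelism, while a single k-point calculation places all ranks in one group that works on one distributed eigenvalue problem.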
During a typical self-consistency cycle the code performs only limited IO. At the start, the XML input and the initial charge density are read in; at the end, the final charge densities and the log/output file are written. It is also possible to write intermediate charge densities, which may be advisable for larger simulations to allow restarts.