====== Linear solvers ======
===== Introduction =====
Since 20/07/2005, Metafor includes several linear solvers to solve the system at each iteration when implicitly integrating motion equations. Skyline is the traditional solver, but the direct solver [[http://www.pardiso-project.org/|Pardiso]] and iterative sparse solver (ISS), a GMRES implemented in the MKL, can also be used.
By default, as the stiffness matrix is non symmetric the solver used is non symmetric (with symmetric structure). It is however possible to force the stiffness matrix to be symmetric (computing the mean value between upper and lower terms of the matrix) and to use a symmetric solver.
use :
solvermanager = metafor.getSolverManager()
solvermanager.setSymmetric(True) # False by default
===== Skyline solver =====
Default solver. It does not require any specific configuration. However, it is sequential, occupies a lot of memory and is quite slow on big simulations. However, it is very robust, and the code source is available so it can run on every OS. The Skyline is automatically optimized using Sloan Algorithm.
===== Pardiso (DSS) Solver=====
Pardiso solver is a sparse direct solver using CSR format for storing the matrix. Several CPU can be used (SMP parallelism, using OpenMP). It takes less memory and is faster than Skyline, however it has trouble to handle pivots (does not appear that often). In addition, it requires Intel MKL or CXML to run.
Use:
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(DSSolver())
The number of CPUs is set using
Metafor -k n
where ''n'' is the number of CPUs.
===== ISS (Iterative Sparse Solver) =====
ISS solver is an iterative solver (GMRES) available in MKL library. Iterative solvers are not as robust as direct ones. They iterate successively, and iterations can converge quite slowly is the matrix is ill-conditioned. To improve convergence, it is almost necessary to use a preconditioner to decrease the condition number. However, iterative solvers are quite fast if parameterized properly, but the optimal parameters are quite difficult to find. Concerning memory, they use CSR format so do not require too much space. Only the preconditioner is added to nonzero elements.
Use (see for example ''apps.qs.cont2ILU0''):
solver = ISSolver()
solver.setRestart(5) # Restart parameter of the GMRES
solver.setMaxIter(10) # Maximal number of iterations
solver.setTol(1e-1) # Tolerance on the residual
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(solver)
By default, the solver uses a ILU0 preconditioner. It is an incomplete factorization (A~=LU) for which the structure of L and U is the same as A. A more elaborated preconditioner is also available (ILUT). It is also an incomplete factorization, but where the ''nFill'' largest elements are kept for each line of the matrix. It is used with the command:
solver.setPreconditioner(ILUT)
solver.setPrecMaxFill(20) # 20 elements are kept on the lines of L and U
===== MUMPS (MUltifrontal Massively Parallel sparse direct Solver) =====
MUMPS is a sparse direct solver for the solution of large linear algebric systems on distributed memory parallel computers. It implements the multifrontal method, which is a version of Gaussian elimination for large sparse systems of equations, especially those arising from the finite element method. It is written in Fortran 90 with parallelism by MPI and it uses BLAS and ScaLAPACK kernels for dense matrix computations.
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(MUMPSolver())
The input matrix can be supplied to MUMPS in assembled format in coordinate COO (distributed or centralized) or in elemental format. MUMPS can also be used with multiple threads (CPU cores) by using
Metafor -k n
where ''n'' is the number of threads.
===== CUDSS (CUDA Direct Sparse Solver) =====
CUDSS is a GPU-accelerated sparse direct solver using the CSR matrix format. Providing your hardware is equipped with a good Nvidia GPU, leveraging GPU acceleration for solving the finite element system can become beneficial compared to Intel MKL DSS. The CUDSS interface requires CUDA >= 12.0 along with the Nvidia drivers and CUDA Toolkit, and a compatible version of the CUDSS library (i.e. >= 0.6.0.5). Download links are given below, but it is **strongly advised** to use the packages provided by your Linux distro, rather than Nvidia ones, when available.
* Nvidia drivers : https://www.nvidia.com/en-us/drivers
* CUDA Toolkit : https://developer.nvidia.com/cuda/toolkit
* CUDSS : https://developer.download.nvidia.com/compute/cudss/redist/libcudss
solver = CUDSS()
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(solver)
CUDSS is interfaced with a single parameter ''nSteps'' which controls the number of iterative refinement iterations performed on the solution to further improve its accuracy. The default is zero, but using ''nSteps = 2'' can improve accuracy without significantly impacting the computation time.
solver.setNSteps(2) # Default is 0
===== AMGX (Algebraic MultiGrid) =====
AMGX is a GPU-accelerated iterative sparse solver using the CSR matrix format. It requires CUDA > 12.0, CUDA Toolkit and a compatible Nvidia drivers (and hardware). AMGX can provide an accurate solution in a significantly smaller computation time than CUDSS if configured properly, but its overall robustness is weaker than a direct solver.
* AMGX Website : https://developer.nvidia.com/amgx
* AMGX repository : https://github.com/NVIDIA/AMGX
solver = AMGX()
solver.setAlgorithm(algo) # Solver algorithm to use
solver.setPreconditioner(prec) # Preconditioner to use
solver.setMaxIter(1000) # Maximum number of iterations
solver.setAbsTol(1e-12) # Absolute tolerance for the residual
solver.setRelTol(1e-12) # Relative tolerance wrt the initial residual
solver.setRestart(5) # Restart parameter of the GMRES
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(solver)
AMGX offers many options, among which a few have been interfaced in Metafor:
| ''algo'' | ''LinearSolverAlgo''| = ''BICG'' : biconjugate gradient (default) |
| ::: | ::: | = ''GMRES'' : generalized minimal residual |
| ::: | ::: | = ''CG'' : conjugate gradient |
| ''prec'' | ''Preconditioner''| = ''AMG'' : algebraic multigrid (default) |
| ::: | ::: | = ''JACOBI'' : diagonal preconditioner |
| ::: | ::: | = ''DILU'' : diaginal ILU |
| ::: | ::: | = ''GS'' : Gauss-Seidel preconditioner |
| ::: | ::: | = ''NONE'' : no preconditioner |