Metafor

ULiege - Aerospace & Mechanical Engineering

doc:user:integration:general:solvers · Last modified: 2026/04/09 16:31 by lacroix (previous revision: 2014/10/07 16:19 by joris)
===== Introduction =====
  
Since 20/07/2005, Metafor includes several linear solvers to solve the linear system arising at each iteration when the equations of motion are integrated implicitly. Skyline is the traditional solver, but the direct solver [[http://www.pardiso-project.org/|Pardiso]] and an iterative sparse solver (ISS), a GMRES implemented in the MKL, can also be used.

By default, since the stiffness matrix is non-symmetric, the solver used is non-symmetric (with a symmetric structure). It is however possible to force the stiffness matrix to be symmetric (by computing the mean of the upper and lower terms of the matrix) and to use a symmetric solver:

<code python>
solvermanager = metafor.getSolverManager()
solvermanager.setSymmetric(True) # False by default
</code>
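The symmetrisation described above (averaging the upper and lower terms of the stiffness matrix) can be illustrated with a small NumPy sketch; this is a conceptual example, not Metafor code:

<code python>
import numpy as np

def symmetrize(K):
    """Return the symmetric part of K: the mean of K and its transpose."""
    return 0.5 * (K + K.T)

K = np.array([[4.0, 1.0],
              [3.0, 5.0]])
Ks = symmetrize(K)
# The off-diagonal terms become (1.0 + 3.0) / 2 = 2.0 on both sides.
</code>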
  
===== Skyline solver =====
  
Skyline is the default solver and does not require any specific configuration. It is sequential, occupies a lot of memory and is quite slow on big simulations. On the other hand, it is very robust, and its source code is available, so it can run on every OS. The skyline storage is automatically optimized using the Sloan algorithm.
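For illustration, the skyline (variable-band) storage that gives the solver its name can be sketched in plain Python: for each column, only the entries from the first nonzero row down to the diagonal are kept. This is a toy sketch, not the Metafor data structure:

<code python>
def skyline_columns(A):
    """Store each column of a square matrix A (list of lists) from its
    first nonzero row down to the diagonal, as in skyline storage."""
    cols = []
    for j in range(len(A)):
        first = j  # the diagonal entry is always stored
        for i in range(j + 1):
            if A[i][j] != 0:
                first = i
                break
        cols.append([A[i][j] for i in range(first, j + 1)])
    return cols

A = [[4, 0, 1],
     [0, 5, 2],
     [1, 2, 6]]
cols = skyline_columns(A)
# cols == [[4], [5], [1, 2, 6]]: column 2 stores its full "spike".
</code>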
  
===== Pardiso (DSS) Solver =====
  
Pardiso is a sparse direct solver using the CSR format to store the matrix. Several CPUs can be used (SMP parallelism, using OpenMP). It takes less memory and is faster than Skyline; however, it has trouble handling pivots (a situation that does not appear often). In addition, it requires Intel MKL or CXML to run.
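The CSR (compressed sparse row) format mentioned above stores only the nonzero values, their column indices, and one pointer per row. A minimal illustrative sketch (not the Pardiso implementation):

<code python>
def to_csr(A):
    """Convert a dense matrix (list of lists) to CSR arrays."""
    values, col_ind, row_ptr = [], [], [0]
    for row in A:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_ind.append(j)
        row_ptr.append(len(values))  # one past the last entry of this row
    return values, col_ind, row_ptr

A = [[4, 0, 1],
     [0, 5, 0],
     [1, 0, 6]]
values, col_ind, row_ptr = to_csr(A)
# values  == [4, 1, 5, 1, 6]
# col_ind == [0, 2, 1, 0, 2]
# row_ptr == [0, 2, 3, 5]
</code>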
  
Use:
  
<code python>
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(DSSolver())
</code>
  
The number of CPUs is set using
  
<code>
Metafor -k n
</code>
  
where ''n'' is the number of CPUs.
===== Iterative Sparse Solver (ISS) =====
  
<code python>
solver = ISSolver()

solver.setRestart(5)    # Restart parameter of the GMRES
solver.setMaxIter(10)   # Maximal number of iterations
solver.setTol(1e-1)     # Tolerance on the residual
  
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(solver)
</code>
  
By default, the solver uses an ILU0 preconditioner. It is an incomplete factorization (A ~= LU) for which the structure of L and U is the same as that of A. A more elaborate preconditioner is also available (ILUT). It is also an incomplete factorization, but one where only the ''nFill'' largest elements are kept on each line of the matrix. It is used with the commands:

<code python>
solver.setPreconditioner(ILUT)
solver.setPrecMaxFill(20) # 20 elements are kept on the lines of L and U
</code>
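The ILUT dropping rule described above (keep only the ''nFill'' largest entries of each line) can be illustrated with a tiny sketch; this is a toy version of the rule, not the solver's implementation:

<code python>
def keep_largest(row, n_fill):
    """Keep the n_fill entries of largest magnitude in a row; drop the rest."""
    kept = sorted(range(len(row)), key=lambda j: abs(row[j]), reverse=True)[:n_fill]
    return [v if j in kept else 0.0 for j, v in enumerate(row)]

row = [0.1, -3.0, 0.5, 2.0, -0.2]
filled = keep_largest(row, 2)
# filled == [0.0, -3.0, 0.0, 2.0, 0.0]: only the two largest entries survive.
</code>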
===== MUMPS (MUltifrontal Massively Parallel sparse direct Solver) =====

MUMPS is a sparse direct solver for the solution of large linear algebraic systems on distributed-memory parallel computers. It implements the multifrontal method, a version of Gaussian elimination for large sparse systems of equations, especially those arising from the finite element method. It is written in Fortran 90, parallelized with MPI, and uses BLAS and ScaLAPACK kernels for dense matrix computations.
  
<code python>
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(MUMPSolver())
</code>

The input matrix can be supplied to MUMPS in assembled coordinate (COO) format (distributed or centralized) or in elemental format. MUMPS can also use multiple threads (CPU cores) with

<code>
Metafor -k n
</code>

where ''n'' is the number of threads.
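The assembled COO (coordinate) format mentioned above is simply a list of (row, column, value) triplets; duplicate entries, as produced by element-by-element assembly, are summed. A minimal sketch, not the MUMPS interface:

<code python>
def coo_to_dense(n, triplets):
    """Assemble COO triplets (i, j, v) into a dense n x n matrix;
    duplicates are summed, as in finite element assembly."""
    A = [[0.0] * n for _ in range(n)]
    for i, j, v in triplets:
        A[i][j] += v
    return A

# Two elements both contribute to entry (1, 1):
triplets = [(0, 0, 4.0), (0, 1, 1.0), (1, 1, 2.0), (1, 1, 3.0)]
A = coo_to_dense(2, triplets)
# A == [[4.0, 1.0], [0.0, 5.0]]
</code>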
===== CUDSS (CUDA Direct Sparse Solver) =====

CUDSS is a GPU-accelerated sparse direct solver using the CSR matrix format. Provided your hardware is equipped with a good Nvidia GPU, leveraging GPU acceleration for solving the finite element system can become beneficial compared to Intel MKL DSS. The CUDSS interface requires CUDA >= 12.0 along with the Nvidia drivers and CUDA Toolkit, and a compatible version of the CUDSS library (i.e. >= 0.6.0.5). Download links are given below, but it is **strongly advised** to use the packages provided by your Linux distro, rather than Nvidia's, when available.

  * Nvidia drivers : https://www.nvidia.com/en-us/drivers
  * CUDA Toolkit : https://developer.nvidia.com/cuda/toolkit
  * CUDSS : https://developer.download.nvidia.com/compute/cudss/redist/libcudss

<code python>
solver = CUDSS()
solvermanager = metafor.getSolverManager()
solvermanager.setSolver(solver)
</code>

CUDSS is interfaced with a single parameter ''nSteps'', which controls the number of iterative refinement steps performed on the solution to further improve its accuracy. The default is zero, but using ''nSteps = 2'' can improve accuracy without significantly impacting the computation time.

<code python>
solver.setNSteps(2) # Default is 0
</code>
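In principle, the iterative refinement controlled by ''nSteps'' works like the following NumPy sketch (an illustration of the general technique, not the CUDSS internals): after an approximate solve, the residual is computed, a correction is solved for, and the solution is updated.

<code python>
import numpy as np

def refine(A, b, x, n_steps):
    """Improve an approximate solution x of A x = b by iterative refinement:
    solve A d = r for the residual r = b - A x, then correct x."""
    for _ in range(n_steps):
        r = b - A @ x
        d = np.linalg.solve(A, r)  # in practice the factorization is re-used
        x = x + d
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.linalg.solve(A, b) + 1e-6  # deliberately perturbed solution
x = refine(A, b, x0, 2)
# A @ x is now equal to b to machine precision.
</code>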
===== AMGX (Algebraic MultiGrid) =====

AMGX is a GPU-accelerated iterative sparse solver using the CSR matrix format. It requires CUDA > 12.0, the CUDA Toolkit and compatible Nvidia drivers (and hardware). If configured properly, AMGX can provide an accurate solution in a significantly smaller computation time than CUDSS, but its overall robustness is weaker than that of a direct solver.

  * AMGX Website : https://developer.nvidia.com/amgx
  * AMGX repository : https://github.com/NVIDIA/AMGX

<code python>
solver = AMGX()

solver.setAlgorithm(algo)       # Solver algorithm to use
solver.setPreconditioner(prec)  # Preconditioner to use
solver.setMaxIter(1000)         # Maximum number of iterations
solver.setAbsTol(1e-12)         # Absolute tolerance for the residual
solver.setRelTol(1e-12)         # Relative tolerance wrt the initial residual
solver.setRestart(5)            # Restart parameter of the GMRES

solvermanager = metafor.getSolverManager()
solvermanager.setSolver(solver)
</code>

AMGX offers many options, among which a few have been interfaced in Metafor:

| ''algo'' | ''LinearSolverAlgo'' | = ''BICG'' : biconjugate gradient (default) |
| ::: | ::: | = ''GMRES'' : generalized minimal residual |
| ::: | ::: | = ''CG'' : conjugate gradient |
| ''prec'' | ''Preconditioner'' | = ''AMG'' : algebraic multigrid (default) |
| ::: | ::: | = ''JACOBI'' : diagonal preconditioner |
| ::: | ::: | = ''DILU'' : diagonal ILU |
| ::: | ::: | = ''GS'' : Gauss-Seidel preconditioner |
| ::: | ::: | = ''NONE'' : no preconditioner |
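As an illustration of the simplest combination in the table, a ''CG'' solve with a ''JACOBI'' (diagonal) preconditioner can be sketched in NumPy; this is generic textbook code, not the AMGX implementation:

<code python>
import numpy as np

def pcg_jacobi(A, b, tol=1e-12, max_iter=100):
    """Conjugate gradient with a Jacobi (diagonal) preconditioner."""
    M_inv = 1.0 / np.diag(A)          # inverse of the diagonal of A
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv * r                     # preconditioned residual
    p = z.copy()
    for _ in range(max_iter):
        Ap = A @ p
        alpha = (r @ z) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            break
        z_new = M_inv * r_new
        beta = (r_new @ z_new) / (r @ z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])  # symmetric positive definite
b = np.array([1.0, 2.0])
x = pcg_jacobi(A, b)
</code>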