Available solvers in Manopt.jl
Optimisation problems can be classified with respect to several criteria. The following list of the algorithms is a grouped with respect to the “information” available about a optimisation problem
\[\operatorname*{arg\,min}_{p∈\mathbb M} f(p)\]
Within each group short notes on advantages of the individual solvers, and required properties the cost $f$ should have, are provided. In that list a 🏅 is used to indicate state-of-the-art solvers, that usually perform best in their corresponding group and 🫏 for a maybe not so fast, maybe not so state-of-the-art method, that nevertheless gets the job done most reliably.
Derivative free
For derivative free only function evaluations of $f$ are used.
- Nelder-Mead a simplex based variant, that is using $d+1$ points, where $d$ is the dimension of the manifold.
- Particle Swarm 🫏 use the evolution of a set of points, called swarm, to explore the domain of the cost and find a minimizer.
- CMA-ES uses a stochastic evolutionary strategy to perform minimization robust to local minima of the objective.
First order
Gradient
- Gradient Descent uses the gradient from $f$ to determine a descent direction. Here, the direction can also be changed to be Averaged, Momentum-based, based on Nesterovs rule.
- Conjugate Gradient Descent uses information from the previous descent direction to improve the current (gradient-based) one including several such update rules.
- The Quasi-Newton Method 🏅 uses gradient evaluations to approximate the Hessian, which is then used in a Newton-like scheme, where both a limited memory and a full Hessian approximation are available with several different update rules.
- Steihaug-Toint Truncated Conjugate-Gradient Method a solver for a constrained problem defined on a tangent space.
Subgradient
The following methods require the Riemannian subgradient $∂f$ to be available. While the subgradient might be set-valued, the function should provide one of the subgradients.
- The Subgradient Method takes the negative subgradient as a step direction and can be combined with a step size.
- The Convex Bundle Method (CBM) uses a former collection of sub gradients at the previous iterates and iterate candidates to solve a local approximation to
f
in every iteration by solving a quadratic problem in the tangent space. - The Proximal Bundle Method works similar to CBM, but solves a proximal map-based problem in every iteration.
Second order
- Adaptive Regularisation with Cubics 🏅 locally builds a cubic model to determine the next descent direction.
- The Riemannian Trust-Regions Solver builds a quadratic model within a trust region to determine the next descent direction.
Splitting based
For splitting methods, the algorithms are based on splitting the cost into different parts, usually in a sum of two or more summands. This is usually very well tailored for non-smooth objectives.
Smooth
The following methods require that the splitting, for example into several summands, is smooth in the sense that for every summand of the cost, the gradient should still exist everywhere
- Levenberg-Marquardt minimizes the square norm of $f: \mathcal M→ℝ^d$ provided the gradients of the component functions, or in other words the Jacobian of $f$.
- Stochastic Gradient Descent is based on a splitting of $f$ into a sum of several components $f_i$ whose gradients are provided. Steps are performed according to gradients of randomly selected components.
- The Alternating Gradient Descent alternates gradient descent steps on the components of the product manifold. All these components should be smooth as it is required, that the gradient exists, and is (locally) convex.
Nonsmooth
If the gradient does not exist everywhere, that is if the splitting yields summands that are nonsmooth, usually methods based on proximal maps are used.
- The Chambolle-Pock algorithm uses a splitting $f(p) = F(p) + G(Λ(p))$, where $G$ is defined on a manifold $\mathcal N$ and the proximal map of its Fenchel dual is required. Both these functions can be non-smooth.
- The Cyclic Proximal Point 🫏 uses proximal maps of the functions from splitting $f$ into summands $f_i$
- Difference of Convex Algorithm (DCA) uses a splitting of the (non-convex) function $f = g - h$ into a difference of two functions; for each of these it is required to have access to the gradient of $g$ and the subgradient of $h$ to state a sub problem in every iteration to be solved.
- Difference of Convex Proximal Point uses a splitting of the (non-convex) function $f = g - h$ into a difference of two functions; provided the proximal map of $g$ and the subgradient of $h$, the next iterate is computed. Compared to DCA, the corresponding sub problem is here written in a form that yields the proximal map.
- Douglas—Rachford uses a splitting $f(p) = F(x) + G(x)$ and their proximal maps to compute a minimizer of $f$, which can be non-smooth.
- Primal-dual Riemannian semismooth Newton Algorithm extends Chambolle-Pock and requires the differentials of the proximal maps additionally.
- The Proximal Point uses the proximal map of $f$ iteratively.
Constrained
Constrained problems of the form
\[\begin{align*} \operatorname*{arg\,min}_{p∈\mathbb M}& f(p)\\ \text{such that } & g(p) \leq 0\\&h(p) = 0 \end{align*}\]
For these you can use
- The Augmented Lagrangian Method (ALM), where both
g
andgrad_g
as well ash
andgrad_h
are keyword arguments, and one of these pairs is mandatory. - The Exact Penalty Method (EPM) uses a penalty term instead of augmentation, but has the same interface as ALM.
- The Interior Point Newton Method (IPM) rephrases the KKT system of a constrained problem into an Newton iteration being performed in every iteration.
- Frank-Wolfe algorithm, where besides the gradient of $f$ either a closed form solution or a (maybe even automatically generated) sub problem solver for $\operatorname*{arg\,min}_{q ∈ C} ⟨\operatorname{grad} f(p_k), \log_{p_k}q⟩$ is required, where $p_k$ is a fixed point on the manifold (changed in every iteration).
On the tangent space
- Conjugate Residual a solver for a linear system $\mathcal A[X] + b = 0$ on a tangent space.
- Steihaug-Toint Truncated Conjugate-Gradient Method a solver for a constrained problem defined on a tangent space.
Alphabetical list List of algorithms
Note that the solvers (their AbstractManoptSolverState
, to be precise) can also be decorated to enhance your algorithm by general additional properties, see debug output and recording values. This is done using the debug=
and record=
keywords in the function calls. Similarly, a cache=
keyword is available in any of the function calls, that wraps the AbstractManoptProblem
in a cache for certain parts of the objective.
Technical details
The main function a solver calls is
Manopt.solve!
— Methodsolve!(p::AbstractManoptProblem, s::AbstractManoptSolverState)
run the solver implemented for the AbstractManoptProblem
p
and the AbstractManoptSolverState
s
employing initialize_solver!
, step_solver!
, as well as the stop_solver!
of the solver.
which is a framework that you in general should not change or redefine. It uses the following methods, which also need to be implemented on your own algorithm, if you want to provide one.
Manopt.initialize_solver!
— Functioninitialize_solver!(ams::AbstractManoptProblem, amp::AbstractManoptSolverState)
Initialize the solver to the optimization AbstractManoptProblem
amp
by initializing the necessary values in the AbstractManoptSolverState
amp
.
initialize_solver!(amp::AbstractManoptProblem, dss::DebugSolverState)
Extend the initialization of the solver by a hook to run the DebugAction
that was added to the :Start
entry of the debug lists. All others are triggered (with iteration number 0
) to trigger possible resets
initialize_solver!(ams::AbstractManoptProblem, rss::RecordSolverState)
Extend the initialization of the solver by a hook to run records that were added to the :Start
entry.
Manopt.step_solver!
— Functionstep_solver!(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, k)
Do one iteration step (the i
th) for an AbstractManoptProblem
p
by modifying the values in the AbstractManoptSolverState
ams
.
step_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, k)
Extend the i
th step of the solver by a hook to run debug prints, that were added to the :BeforeIteration
and :Iteration
entries of the debug lists.
step_solver!(amp::AbstractManoptProblem, rss::RecordSolverState, k)
Extend the i
th step of the solver by a hook to run records, that were added to the :Iteration
entry.
Manopt.get_solver_result
— Functionget_solver_result(ams::AbstractManoptSolverState)
get_solver_result(tos::Tuple{AbstractManifoldObjective,AbstractManoptSolverState})
get_solver_result(o::AbstractManifoldObjective, s::AbstractManoptSolverState)
Return the final result after all iterations that is stored within the AbstractManoptSolverState
ams
, which was modified during the iterations.
For the case the objective is passed as well, but default, the objective is ignored, and the solver result for the state is called.
Manopt.get_solver_return
— Functionget_solver_return(s::AbstractManoptSolverState)
get_solver_return(o::AbstractManifoldObjective, s::AbstractManoptSolverState)
determine the result value of a call to a solver. By default this returns the same as get_solver_result
.
get_solver_return(s::ReturnSolverState)
get_solver_return(o::AbstractManifoldObjective, s::ReturnSolverState)
return the internally stored state of the ReturnSolverState
instead of the minimizer. This means that when the state are decorated like this, the user still has to call get_solver_result
on the internal state separately.
get_solver_return(o::ReturnManifoldObjective, s::AbstractManoptSolverState)
return both the objective and the state as a tuple.
Manopt.stop_solver!
— Methodstop_solver!(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, k)
depending on the current AbstractManoptProblem
amp
, the current state of the solver stored in AbstractManoptSolverState
ams
and the current iterate i
this function determines whether to stop the solver, which by default means to call the internal StoppingCriterion
. ams.stop
API for solvers
this is a short overview of the different types of high-level functions are usually available for a solver. Assume the solver is called new_solver
and requires a cost f
and some first order information df
as well as a starting point p
on M
. f
and df
form the objective together called obj
.
Then there are basically two different variants to call
The easy to access call
new_solver(M, f, df, p=rand(M); kwargs...)
new_solver!(M, f, df, p; kwargs...)
Where the start point should be optional. Keyword arguments include the type of evaluation, decorators like debug=
or record=
as well as algorithm specific ones. If you provide an immutable point p
or the rand(M)
point is immutable, like on the Circle()
this method should turn the point into a mutable one as well.
The third variant works in place of p
, so it is mandatory.
This first interface would set up the objective and pass all keywords on the objective based call.
Objective based calls to solvers
new_solver(M, obj, p=rand(M); kwargs...)
new_solver!(M, obj, p; kwargs...)
Here the objective would be created beforehand for example to compare different solvers on the same objective, and for the first variant the start point is optional. Keyword arguments include decorators like debug=
or record=
as well as algorithm specific ones.
This variant would generate the problem
and the state
and verify validity of all provided keyword arguments that affect the state. Then it would call the iterate process.
Manual calls
If you generate the corresponding problem
and state
as the previous step does, you can also use the third (lowest level) and just call
solve!(problem, state)
Closed-form subsolvers
If a subsolver solution is available in closed form, ClosedFormSubSolverState
is used to indicate that.
Manopt.ClosedFormSubSolverState
— TypeClosedFormSubSolverState{E<:AbstractEvaluationType} <: AbstractManoptSolverState
Subsolver state indicating that a closed-form solution is available with AbstractEvaluationType
E
.
Constructor
ClosedFormSubSolverState(; evaluation=AllocatingEvaluation())