Conjugate gradient descent
Manopt.conjugate_gradient_descent — Function
conjugate_gradient_descent(M, f, grad_f, p=rand(M))
conjugate_gradient_descent!(M, f, grad_f, p)
conjugate_gradient_descent(M, gradient_objective, p)
conjugate_gradient_descent!(M, gradient_objective, p; kwargs...)
Perform a conjugate gradient based descent
\[p_{k+1} = \operatorname{retr}_{p_k} \bigl( s_kδ_k \bigr),\]
where $\operatorname{retr}$ denotes a retraction on the manifold `M` and one can employ different rules to update the descent direction $δ_k$ based on the last direction $δ_{k-1}$ and both gradients $\operatorname{grad}f(p_k)$ and $\operatorname{grad} f(p_{k-1})$. The `Stepsize` $s_k$ may be determined by a `Linesearch`.
Alternatively to f and grad_f you can provide the AbstractManifoldFirstOrderObjective gradient_objective directly.
Available update rules are `SteepestDescentCoefficient`, which yields a `gradient_descent`, `ConjugateDescentCoefficient` (the default), `DaiYuanCoefficient`, `FletcherReevesCoefficient`, `HagerZhangCoefficient`, `HestenesStiefelCoefficient`, `LiuStoreyCoefficient`, and `PolakRibiereCoefficient`. These can all be combined with the `ConjugateGradientBealeRestart` rule.
They all compute $β_k$ such that this algorithm updates the search direction as
\[δ_k = -\operatorname{grad}f(p_k) + β_k δ_{k-1}\]
Input
- `M::AbstractManifold`: a Riemannian manifold $\mathcal M$
- `f`: a cost function $f: \mathcal M → ℝ$ implemented as `(M, p) -> v`
- `grad_f`: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of `f` as a function `(M, p) -> X` or a function `(M, X, p) -> X` computing `X` in-place
- `p`: a point on the manifold $\mathcal M$
Keyword arguments
- `coefficient::DirectionUpdateRule=ConjugateDescentCoefficient()`: rule to compute the descent direction update coefficient $β_k$, as a functor, where the resulting function maps `(amp, cgs, k) -> β` with `amp` an `AbstractManoptProblem`, `cgs` the `ConjugateGradientDescentState`, and `k` the current iterate.
- `restart_condition::AbstractRestartCondition=NeverRestart()`: rule when the algorithm should restart, that is, use the negative gradient instead of the computed direction, as a functor, where the resulting function maps `(amp, cgs, k) -> corr::Bool` with `amp` an `AbstractManoptProblem`, `cgs` the `ConjugateGradientDescentState`, and `k` the current iterate.
- `differential=nothing`: specify a specific function to evaluate the differential. By default, $Df(p)[X] = ⟨\operatorname{grad}f(p),X⟩$ is used.
- `evaluation=AllocatingEvaluation()`: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating their result (`AllocatingEvaluation`) or whether they modify their input argument to return the result therein (`InplaceEvaluation`). Since usually the first argument is the manifold, the modified argument is the second.
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stepsize=ArmijoLinesearch()`: a functor inheriting from `Stepsize` to determine a step size
- `stopping_criterion=StopAfterIteration(500)|StopWhenGradientNormLess(1e-8)`: a functor indicating that the stopping criterion is fulfilled
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
If you provide the ManifoldFirstOrderObjective directly, the evaluation= keyword is ignored. The decorations are still applied to the objective.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
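For example, a minimal sketch of a call (assuming `Manifolds.jl` is loaded; the toy Riemannian-mean cost and the point set are illustrative and not part of this documentation) could look like:

```julia
using Manopt, Manifolds, LinearAlgebra, Random

Random.seed!(42)
M = Sphere(2)
# toy cost: mean squared Riemannian distance to a set of sample points (scaled by 1/2)
pts = [normalize(randn(3)) for _ in 1:10]
f(M, p) = sum(distance(M, p, q)^2 for q in pts) / (2 * length(pts))
grad_f(M, p) = -sum(log(M, p, q) for q in pts) / length(pts)

p0 = normalize(randn(3))
p_star = conjugate_gradient_descent(M, f, grad_f, p0;
    coefficient=FletcherReevesCoefficient(),
    stopping_criterion=StopAfterIteration(200) | StopWhenGradientNormLess(1e-9),
)
```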
Manopt.conjugate_gradient_descent! — Function
conjugate_gradient_descent(M, f, grad_f, p=rand(M))
conjugate_gradient_descent!(M, f, grad_f, p)
conjugate_gradient_descent(M, gradient_objective, p)
conjugate_gradient_descent!(M, gradient_objective, p; kwargs...)
Perform a conjugate gradient based descent
\[p_{k+1} = \operatorname{retr}_{p_k} \bigl( s_kδ_k \bigr),\]
where $\operatorname{retr}$ denotes a retraction on the manifold `M` and one can employ different rules to update the descent direction $δ_k$ based on the last direction $δ_{k-1}$ and both gradients $\operatorname{grad}f(p_k)$ and $\operatorname{grad} f(p_{k-1})$. The `Stepsize` $s_k$ may be determined by a `Linesearch`.
Alternatively to f and grad_f you can provide the AbstractManifoldFirstOrderObjective gradient_objective directly.
Available update rules are `SteepestDescentCoefficient`, which yields a `gradient_descent`, `ConjugateDescentCoefficient` (the default), `DaiYuanCoefficient`, `FletcherReevesCoefficient`, `HagerZhangCoefficient`, `HestenesStiefelCoefficient`, `LiuStoreyCoefficient`, and `PolakRibiereCoefficient`. These can all be combined with the `ConjugateGradientBealeRestart` rule.
They all compute $β_k$ such that this algorithm updates the search direction as
\[δ_k = -\operatorname{grad}f(p_k) + β_k δ_{k-1}\]
Input
- `M::AbstractManifold`: a Riemannian manifold $\mathcal M$
- `f`: a cost function $f: \mathcal M → ℝ$ implemented as `(M, p) -> v`
- `grad_f`: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of `f` as a function `(M, p) -> X` or a function `(M, X, p) -> X` computing `X` in-place
- `p`: a point on the manifold $\mathcal M$
Keyword arguments
- `coefficient::DirectionUpdateRule=ConjugateDescentCoefficient()`: rule to compute the descent direction update coefficient $β_k$, as a functor, where the resulting function maps `(amp, cgs, k) -> β` with `amp` an `AbstractManoptProblem`, `cgs` the `ConjugateGradientDescentState`, and `k` the current iterate.
- `restart_condition::AbstractRestartCondition=NeverRestart()`: rule when the algorithm should restart, that is, use the negative gradient instead of the computed direction, as a functor, where the resulting function maps `(amp, cgs, k) -> corr::Bool` with `amp` an `AbstractManoptProblem`, `cgs` the `ConjugateGradientDescentState`, and `k` the current iterate.
- `differential=nothing`: specify a specific function to evaluate the differential. By default, $Df(p)[X] = ⟨\operatorname{grad}f(p),X⟩$ is used.
- `evaluation=AllocatingEvaluation()`: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating their result (`AllocatingEvaluation`) or whether they modify their input argument to return the result therein (`InplaceEvaluation`). Since usually the first argument is the manifold, the modified argument is the second.
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stepsize=ArmijoLinesearch()`: a functor inheriting from `Stepsize` to determine a step size
- `stopping_criterion=StopAfterIteration(500)|StopWhenGradientNormLess(1e-8)`: a functor indicating that the stopping criterion is fulfilled
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
If you provide the ManifoldFirstOrderObjective directly, the evaluation= keyword is ignored. The decorations are still applied to the objective.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
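Correspondingly, a sketch of the in-place variant with an in-place gradient (reusing the hypothetical `M`, `pts`, `f`, and `p0` from the sketch above) might be:

```julia
# in-place gradient: write the result into X and return it
function grad_f!(M, X, p)
    copyto!(X, -sum(log(M, p, q) for q in pts) / length(pts))
    return X
end

p1 = copy(p0)
conjugate_gradient_descent!(M, f, grad_f!, p1;
    evaluation=InplaceEvaluation(),   # signal that grad_f! works in place
)
# p1 now holds the resulting point
```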
State
Manopt.ConjugateGradientDescentState — Type
ConjugateGradientDescentState <: AbstractGradientSolverState
Specify options for a conjugate gradient descent algorithm that solves a `DefaultManoptProblem`.
Fields
- `p::P`: a point on the manifold $\mathcal M$ storing the current iterate
- `X::T`: a tangent vector at the point $p$ on the manifold $\mathcal M$
- `δ`: the current descent direction, also a tangent vector
- `β`: the current update coefficient, see `coefficient`
- `coefficient`: function to determine the new `β`
- `restart_condition`: an `AbstractRestartCondition` to determine how to handle non-descent directions
- `stepsize::Stepsize`: a functor inheriting from `Stepsize` to determine a step size
- `stop::StoppingCriterion`: a functor indicating that the stopping criterion is fulfilled
- `retraction_method::AbstractRetractionMethod`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `vector_transport_method::AbstractVectorTransportMethodP`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
ConjugateGradientDescentState(M::AbstractManifold; kwargs...)
where the last five fields can be set by their names as keywords. The field `X` can be set to a tangent vector using the keyword `initial_gradient`, which defaults to `zero_vector(M, p)`, and `δ` is initialized to a copy of this vector.
Keyword arguments
The following fields from above are keyword arguments:
- `initial_gradient=zero_vector(M, p)`: a tangent vector at the point $p$ on the manifold $\mathcal M$
- `p=rand(M)`: a point on the manifold $\mathcal M$ to specify the initial value
- `coefficient=ConjugateDescentCoefficient()`: specify a CG coefficient, see also the `ManifoldDefaultsFactory`
- `restart_condition=NeverRestart()`: specify a restart condition. It defaults to never restart.
- `stepsize=default_stepsize(M, ConjugateGradientDescentState; retraction_method=retraction_method)`: a functor inheriting from `Stepsize` to determine a step size
- `stopping_criterion=StopAfterIteration(500)|StopWhenGradientNormLess(1e-8)`: a functor indicating that the stopping criterion is fulfilled
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
See also
conjugate_gradient_descent, DefaultManoptProblem, ArmijoLinesearch
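To inspect the full solver run instead of only the minimizer, a small sketch (again reusing the hypothetical setup from above) is:

```julia
# return the (possibly decorated) solver state instead of just the point
st = conjugate_gradient_descent(M, f, grad_f, p0; return_state=true)
q = get_solver_result(st)   # extract the resulting point from the returned state
```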
Available coefficients
The update rules act as DirectionUpdateRule, which internally always first evaluate the gradient itself.
Manopt.ConjugateDescentCoefficient — Function
ConjugateDescentCoefficient()
ConjugateDescentCoefficient(M::AbstractManifold)
Compute the (classical) conjugate gradient coefficient based on [Fle87] adapted to manifolds.
Denote the last iterate and gradient by $p_k,X_k$, the current iterate and gradient by $p_{k+1}, X_{k+1}$, respectively, as well as the last update direction by $δ_k$.
Then the coefficient reads
\[β_k = \frac{\mathrm{D}_{}f(p_{k+1})[X_{k+1}]}{\mathrm{D}_{}f(p_k)[-δ_k]} = \frac{\lVert X_{k+1} \rVert_{p_{k+1}}^2}{⟨-δ_k,X_k⟩_{p_k}}\]
The second form is the one usually stated, while the first avoids using the metric inner product. The first form is implemented here, but falls back to calling `inner` if no dedicated differential is available.
This function generates a `ManifoldDefaultsFactory` for `ConjugateDescentCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
Manopt.ConjugateGradientBealeRestart — Function
ConjugateGradientBealeRestart(direction_update::Union{DirectionUpdateRule,ManifoldDefaultsFactory}; kwargs...)
ConjugateGradientBealeRestart(M::AbstractManifold, direction_update::Union{DirectionUpdateRule,ManifoldDefaultsFactory}; kwargs...)
Compute a conjugate gradient coefficient with a potential restart when two directions are nearly orthogonal. See [HZ06, page 12] (in the preprint; page 46 in journal page numbers). This method is named after E. Beale from his proceedings paper in 1972 [Bea72]. It acts as a decorator to any existing `DirectionUpdateRule` `direction_update`.
Denote the last iterate and gradient by $p_k,X_k$, the current iterate and gradient by $p_{k+1}, X_{k+1}$, respectively, as well as the last update direction by $δ_k$.
Then a restart is performed, hence $β_k = 0$ returned if
\[ \frac{⟨X_{k+1}, \mathcal T_{p_{k+1}←p_k}X_k⟩}{\lVert X_k \rVert_{p_k}} > ε,\]
where $ε$ is the threshold, which is set by default to 0.2, see [Pow77]
Input
- `direction_update`: a `DirectionUpdateRule` or a corresponding `ManifoldDefaultsFactory` to produce such a rule.
Keyword arguments
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
- `threshold=0.2`
This function generates a `ManifoldDefaultsFactory` for `ConjugateGradientBealeRestartRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
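As a usage sketch (names from the hypothetical setup above), the restart decorator wraps another coefficient and is passed via the `coefficient=` keyword:

```julia
# wrap a Polak-Ribière coefficient with a Beale restart check
coeff = ConjugateGradientBealeRestart(PolakRibiereCoefficient(); threshold=0.2)
p_star = conjugate_gradient_descent(M, f, grad_f, p0; coefficient=coeff)
```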
Manopt.DaiYuanCoefficient — Function
DaiYuanCoefficient(; kwargs...)
DaiYuanCoefficient(M::AbstractManifold; kwargs...)
Computes an update coefficient for the conjugate_gradient_descent algorithm based on [DY99] adapted to Riemannian manifolds.
Denote the last iterate and gradient by $p_k,X_k$, the current iterate and gradient by $p_{k+1}, X_{k+1}$, respectively, as well as the last update direction by $δ_k$.
Let $ν_k = X_{k+1} - \mathcal T_{p_{k+1}←p_k}X_k$, where $\mathcal T_{⋅←⋅}$ denotes a vector transport.
Then the coefficient reads
\[β_k = \frac{\mathrm{D}f(p_{k+1})[X_{k+1}]}{⟨δ_k,ν_k⟩_{p_{k+1}}} = \frac{\lVert X_{k+1} \rVert_{p_{k+1}}^2}{⟨\mathcal T_{p_{k+1}←p_k}δ_k, ν_k⟩_{p_{k+1}}}\]
The second form is the one usually stated, while the first avoids using the metric inner product. The first form is implemented here, but falls back to calling `inner` if no dedicated differential is available.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
This function generates a `ManifoldDefaultsFactory` for `DaiYuanCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
Manopt.FletcherReevesCoefficient — Function
FletcherReevesCoefficient()
FletcherReevesCoefficient(M::AbstractManifold)
Computes an update coefficient for the conjugate_gradient_descent algorithm based on [FR64] adapted to manifolds.
Denote the last iterate and gradient by $p_k,X_k$, the current iterate and gradient by $p_{k+1}, X_{k+1}$, respectively, as well as the last update direction by $δ_k$.
Then the coefficient reads
\[β_k = \frac{\mathrm{D}_{}f(p_{k+1})[X_{k+1}]}{\mathrm{D}_{}f(p_k)[X_k]} = \frac{\lVert X_{k+1} \rVert_{p_{k+1}}^2}{\lVert X_k \rVert_{p_k}^2}\]
The second form is the one usually stated, while the first avoids using the metric inner product. The first form is implemented here, but falls back to calling `inner` if no dedicated differential is available.
This function generates a `ManifoldDefaultsFactory` for `FletcherReevesCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
Manopt.HagerZhangCoefficient — Function
HagerZhangCoefficient(; kwargs...)
HagerZhangCoefficient(M::AbstractManifold; kwargs...)
Computes an update coefficient for the conjugate_gradient_descent algorithm based on [HZ05] adapted to manifolds.
Denote the last iterate and gradient by $p_k,X_k$, the current iterate and gradient by $p_{k+1}, X_{k+1}$, respectively, as well as the last update direction by $δ_k$.
Let $ν_k = X_{k+1} - \mathcal T_{p_{k+1}←p_k}X_k$, where $\mathcal T_{⋅←⋅}$ denotes a vector transport.
Then the coefficient reads
\[β_k = \Bigl⟨ν_k - \frac{2\lVert ν_k \rVert_{p_{k+1}}^2}{⟨\mathcal T_{p_{k+1}←p_k}δ_k, ν_k⟩_{p_{k+1}}} \mathcal T_{p_{k+1}←p_k}δ_k, \frac{X_{k+1}}{⟨\mathcal T_{p_{k+1}←p_k}δ_k, ν_k⟩_{p_{k+1}}} \Bigr⟩_{p_{k+1}}.\]
This method includes a numerical stability modification proposed by those authors.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
This function generates a `ManifoldDefaultsFactory` for `HagerZhangCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
Manopt.HestenesStiefelCoefficient — Function
HestenesStiefelCoefficient(; kwargs...)
HestenesStiefelCoefficient(M::AbstractManifold; kwargs...)
Computes an update coefficient for the conjugate_gradient_descent algorithm based on [HS52] adapted to manifolds.
Denote the last iterate and gradient by $p_k,X_k$, the current iterate and gradient by $p_{k+1}, X_{k+1}$, respectively, as well as the last update direction by $δ_k$.
Let $ν_k = X_{k+1} - \mathcal T_{p_{k+1}←p_k}X_k$, where $\mathcal T_{⋅←⋅}$ denotes a vector transport.
Then the coefficient reads
\[\begin{aligned} β_k &= \frac{\mathrm{D}_{}f(p_{k+1})[ν_k]}{\mathrm{D}_{}f(p_{k+1})[\mathcal T_{p_{k+1}←p_k}δ_k] - \mathrm{D}_{}f(p_k)[δ_k]} \\&= \frac{⟨X_{k+1},ν_k⟩_{p_{k+1}}}{⟨\mathcal T_{p_{k+1}←p_k}δ_k,X_{k+1}⟩_{p_{k+1}} - ⟨δ_k,X_k⟩_{p_{k}}} \\&= \frac{⟨X_{k+1},ν_k⟩_{p_{k+1}}}{⟨\mathcal T_{p_{k+1}←p_k}δ_k,ν_k⟩_{p_{k+1}}}, \end{aligned}\]
The third form is the one usually stated, while the first avoids using the metric inner product. The first form is implemented here, but falls back to calling `inner` if no dedicated differential is available.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
This function generates a `ManifoldDefaultsFactory` for `HestenesStiefelCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
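For example, one could fix the vector transport explicitly. A sketch, assuming the hypothetical setup from above and that `ParallelTransport` from `ManifoldsBase.jl` is implemented on the chosen manifold:

```julia
# use a Hestenes-Stiefel coefficient with an explicit vector transport
coeff = HestenesStiefelCoefficient(; vector_transport_method=ParallelTransport())
p_star = conjugate_gradient_descent(M, f, grad_f, p0; coefficient=coeff)
```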
Manopt.HybridCoefficient — Function
HybridCoefficient(coefficients::AbstractArray{Union{DirectionUpdateRule,ManifoldDefaultsFactory}}; kwargs...)
HybridCoefficient(M::AbstractManifold, coefficients::AbstractArray{Union{DirectionUpdateRule,ManifoldDefaultsFactory}}; kwargs...)
Computes a hybrid update coefficient for the conjugate_gradient_descent.
Given coefficients $β_i$ for $i = 1,...,m$, a lower bound coefficient $β_0$, and a scalar factor $σ$ for the lower bound, this coefficient computes
\[β_k = \max\bigl\{σ β_0, \min\{β_1, …, β_m\}\bigr\}\]
This includes the HS-DY and FR-PRP hybrid parameters introduced in [SI20] and [SI21]
Input
- `args...`: CG coefficients of type `DirectionUpdateRule` or a corresponding `ManifoldDefaultsFactory` to produce such a rule, of which the minimum is taken in the hybrid rule
Keyword arguments
- `lower_bound=SteepestDescentCoefficient()`: a lower bound `DirectionUpdateRule` or a corresponding `ManifoldDefaultsFactory` for the resulting value of β
- `lower_bound_scale=1.0`: a scalar to multiply the lower bound coefficient by.
Examples
The FR-PRP parameter reads
\[β_k^{\mathrm{FR-PRP}} = \max\bigl\{0, \min(β_k^{FR}, β_k^{PRP})\bigr\}\]
and can be implemented using
HybridCoefficient(FletcherReevesCoefficient(),PolakRibiereCoefficient())
The HS-DY parameter with parameter 0<σ<1 reads
\[β_k^{\mathrm{HS-DY}} = \max\bigl(-σ β_k^{DY}, \min(β_k^{HS}, β_k^{DY})\bigr)\]
and can be implemented using HybridCoefficient(HestenesStiefelCoefficient(), DaiYuanCoefficient(); lower_bound=DaiYuanCoefficient(), lower_bound_scale=-σ)
This function generates a `ManifoldDefaultsFactory` for `HybridCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
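Putting it together, a sketch of the HS-DY hybrid passed to the solver (the value 0.1 for σ and the remaining names are illustrative, reusing the hypothetical setup from above):

```julia
σ = 0.1
hybrid = HybridCoefficient(HestenesStiefelCoefficient(), DaiYuanCoefficient();
    lower_bound=DaiYuanCoefficient(), lower_bound_scale=-σ)
p_star = conjugate_gradient_descent(M, f, grad_f, p0; coefficient=hybrid)
```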
Manopt.LiuStoreyCoefficient — Function
LiuStoreyCoefficient(; kwargs...)
LiuStoreyCoefficient(M::AbstractManifold; kwargs...)
Computes an update coefficient for the conjugate_gradient_descent algorithm based on [LS91] adapted to manifolds.
Denote the last iterate and gradient by $p_k,X_k$, the current iterate and gradient by $p_{k+1}, X_{k+1}$, respectively, as well as the last update direction by $δ_k$.
Let $ν_k = X_{k+1} - \mathcal T_{p_{k+1}←p_k}X_k$, where $\mathcal T_{⋅←⋅}$ denotes a vector transport.
Then the coefficient reads
\[β_k = - \frac{\mathrm{D}_{}f(p_{k+1})[ν_k]}{\mathrm{D}_{}f(p_k)[δ_k]} = - \frac{⟨X_{k+1},ν_k⟩_{p_{k+1}}}{⟨δ_k,X_k⟩_{p_k}}.\]
The second form is the one usually stated, while the first avoids using the metric inner product. The first form is implemented here, but falls back to calling `inner` if no dedicated differential is available.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
This function generates a `ManifoldDefaultsFactory` for `LiuStoreyCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
Manopt.PolakRibiereCoefficient — Function
PolakRibiereCoefficient(; kwargs...)
PolakRibiereCoefficient(M::AbstractManifold; kwargs...)
Computes an update coefficient for the conjugate_gradient_descent algorithm based on [PR69] adapted to Riemannian manifolds.
Denote the last iterate and gradient by $p_k,X_k$, the current iterate and gradient by $p_{k+1}, X_{k+1}$, respectively, as well as the last update direction by $δ_k$.
Let $ν_k = X_{k+1} - \mathcal T_{p_{k+1}←p_k}X_k$, where $\mathcal T_{⋅←⋅}$ denotes a vector transport.
Then the coefficient reads
\[β_k = \frac{\mathrm{D}_{}f(p_{k+1})[ν_k]}{\mathrm{D}_{}f(p_k)[X_k]} = \frac{⟨X_{k+1},ν_k⟩_{p_{k+1}}}{\lVert X_k \rVert_{{p_k}}^2}.\]
The second form is the one usually stated, while the first avoids using the metric inner product. The first form is implemented here, but falls back to calling `inner` if no dedicated differential is available.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
This function generates a `ManifoldDefaultsFactory` for `PolakRibiereCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
Manopt.SteepestDescentCoefficient — Function
SteepestDescentCoefficient()
SteepestDescentCoefficient(M::AbstractManifold)
Computes an update coefficient for the conjugate_gradient_descent algorithm so that it falls back to a gradient_descent method, that is
\[β_k = 0\]
This function generates a `ManifoldDefaultsFactory` for `SteepestDescentCoefficientRule`. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding `AbstractManoptSolverState`.
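As a sketch, passing this coefficient reduces the solver to plain gradient descent steps (hypothetical names from the earlier setup):

```julia
# with β_k = 0 the search direction is always the negative gradient
p_gd = conjugate_gradient_descent(M, f, grad_f, p0;
    coefficient=SteepestDescentCoefficient())
```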
Restart rules
The update rules might produce directions that are not descent directions, or only approximately so. For these cases, the following restart rules can be specified.
Manopt.AbstractRestartCondition — Type
AbstractRestartCondition
A general struct that indicates when to restart. It is used within the ConjugateGradientDescentState.
It is implemented to work as a functor `(problem, state, iteration) -> true|false`; what is done in the restart case (`true`) is decided by the individual solver.
Manopt.NeverRestart — Type
NeverRestart <: AbstractRestartCondition
A restart strategy that indicates to never restart.
Manopt.RestartOnNonDescent — Type
RestartOnNonDescent <: AbstractRestartCondition
A restart strategy that restarts whenever the search direction δ is not a descent direction, that is, when
\[ ⟨\operatorname{grad}f(p), δ⟩ > 0,\]
at the current iterate $p$.
Manopt.RestartOnNonSufficientDescent — Type
RestartOnNonSufficientDescent <: AbstractRestartCondition
Fields
κ: the sufficient decrease factor
A restart strategy that indicates to restart whenever the search direction δ is not a sufficient descent direction, that is, whenever the sufficient descent condition
\[ ⟨\operatorname{grad}f(p), δ⟩ ≤ -κ \lVert \operatorname{grad}f(p) \rVert_{p}^2\]
is violated at the current iterate $p$.
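A sketch of selecting a restart condition via the `restart_condition=` keyword, reusing the hypothetical setup from above and assuming `RestartOnNonDescent` provides a no-argument constructor:

```julia
p_star = conjugate_gradient_descent(M, f, grad_f, p0;
    coefficient=HestenesStiefelCoefficient(),
    restart_condition=RestartOnNonDescent(),  # fall back to -grad f on non-descent directions
)
```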
Internal rules for coefficients
Manopt.ConjugateGradientBealeRestartRule — Type
ConjugateGradientBealeRestartRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute the conjugate gradient update coefficient based on a restart idea of [Bea72], following [HZ06, page 12] adapted to manifolds.
Fields
- `direction_update::DirectionUpdateRule`: the actual rule that is restarted
- `threshold::Real`: a threshold for the restart check.
- `vector_transport_method::AbstractVectorTransportMethodP`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
ConjugateGradientBealeRestartRule(
direction_update::Union{DirectionUpdateRule,ManifoldDefaultsFactory};
kwargs...
)
ConjugateGradientBealeRestartRule(
M::AbstractManifold=DefaultManifold(),
direction_update::Union{DirectionUpdateRule,ManifoldDefaultsFactory};
kwargs...
)
Construct the Beale restart coefficient update rule adapted to manifolds.
Input
- `M::AbstractManifold`: a Riemannian manifold $\mathcal M$. If this is not provided, the `DefaultManifold()` from `ManifoldsBase.jl` is used.
- `direction_update`: a `DirectionUpdateRule` or a corresponding `ManifoldDefaultsFactory` to produce such a rule.
Keyword arguments
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
- `threshold=0.2`
See also
Manopt.DaiYuanCoefficientRule — Type
DaiYuanCoefficientRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute the conjugate gradient update coefficient based on [DY99] adapted to manifolds
Fields
vector_transport_method::AbstractVectorTransportMethodP: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
DaiYuanCoefficientRule(M::AbstractManifold; kwargs...)
Construct the Dai-Yuan coefficient update rule.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
See also
Manopt.FletcherReevesCoefficientRule — Type
FletcherReevesCoefficientRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute the conjugate gradient update coefficient based on [FR64] adapted to manifolds
Constructor
FletcherReevesCoefficientRule()
Construct the Fletcher-Reeves coefficient update rule.
See also
Manopt.HagerZhangCoefficientRule — Type
HagerZhangCoefficientRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute the conjugate gradient update coefficient based on [HZ05] adapted to manifolds
Fields
vector_transport_method::AbstractVectorTransportMethodP: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
HagerZhangCoefficientRule(M::AbstractManifold; kwargs...)
Construct the Hager-Zhang coefficient update rule based on [HZ05] adapted to manifolds.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
See also
Manopt.HestenesStiefelCoefficientRule — Type
HestenesStiefelCoefficientRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute the conjugate gradient update coefficient based on [HS52] adapted to manifolds
Fields
vector_transport_method::AbstractVectorTransportMethodP: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
HestenesStiefelCoefficientRule(M::AbstractManifold; kwargs...)
Construct the Hestenes-Stiefel coefficient update rule based on [HS52] adapted to manifolds.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
See also
Manopt.HybridCoefficientRule — Type
HybridCoefficientRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute hybrid conjugate gradient update coefficients
Fields
- `coefficients::NTuple{DirectionUpdateRuleStorage, N}`: `NTuple` containing storage wrappers of CG coefficients of which the minimum is taken
- `lower_bound::DirectionUpdateRuleStorage`: storage wrapper of the lower bound CG coefficient
- `lower_bound_scale::Real`: scalar the lower bound is multiplied with
Constructor
HybridCoefficientRule(
M::AbstractManifold, coefficients::Union{DirectionUpdateRule,ManifoldDefaultsFactory}...;
lower_bound::Union{DirectionUpdateRule,ManifoldDefaultsFactory}=SteepestDescentCoefficient(),
lower_bound_scale::Real=1.0
)
Construct the hybrid coefficient update rule.
See also
Manopt.LiuStoreyCoefficientRule — Type
LiuStoreyCoefficientRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute the conjugate gradient update coefficient based on [LS91] adapted to manifolds
Fields
vector_transport_method::AbstractVectorTransportMethodP: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
LiuStoreyCoefficientRule(M::AbstractManifold; kwargs...)
Construct the Liu-Storey coefficient update rule based on [LS91] adapted to manifolds.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
See also
Manopt.PolakRibiereCoefficientRule — Type
PolakRibiereCoefficientRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute the conjugate gradient update coefficient based on [PR69] adapted to manifolds
Fields
vector_transport_method::AbstractVectorTransportMethodP: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
PolakRibiereCoefficientRule(M::AbstractManifold; kwargs...)
Construct the Polak-Ribière coefficient update rule.
Keyword arguments
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
See also
Manopt.SteepestDescentCoefficientRule — Type
SteepestDescentCoefficientRule <: DirectionUpdateRule
A functor `(problem, state, k) -> β_k` to compute the conjugate gradient update coefficient to obtain the steepest descent direction, that is $β_k=0$.
Constructor
SteepestDescentCoefficientRule()
Construct the steepest descent coefficient update rule.
See also
Technical details
The conjugate_gradient_descent solver requires the following functions of a manifold to be available
- A `retract!(M, q, p, X)`; it is recommended to set the `default_retraction_method` to a favourite retraction. If this default is set, a `retraction_method=` does not have to be specified.
- A `vector_transport_to!(M, Y, p, X, q)`; it is recommended to set the `default_vector_transport_method` to a favourite vector transport. If this default is set, a `vector_transport_method=` does not have to be specified.
- By default this solver uses `ArmijoLinesearch`, which requires `max_stepsize(M)` to be set and an implementation of `inner(M, p, X)`.
- By default the stopping criterion uses the `norm` as well, to stop when the norm of the gradient is small; if you implemented `inner`, the norm is provided already.
- By default the tangent vector storing the gradient is initialized calling `zero_vector(M, p)`.
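For instance, on a manifold with a non-default favourite retraction these choices can be passed explicitly. A self-contained sketch on the Stiefel manifold (the Brockett-type cost and all names are illustrative and not part of this documentation):

```julia
using Manopt, Manifolds, LinearAlgebra, Random

Random.seed!(1)
A = Symmetric(randn(5, 5))
M = Stiefel(5, 2)
f(M, p) = -tr(p' * A * p)
grad_f(M, p) = project(M, p, -2 * A * p)   # project the Euclidean gradient onto T_p M

p0 = rand(M)
p_star = conjugate_gradient_descent(M, f, grad_f, p0;
    retraction_method=QRRetraction(),
    vector_transport_method=ProjectionTransport(),
)
```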
Literature
- [Bea72]
- E. M. Beale. A derivation of conjugate gradients. In: Numerical methods for nonlinear optimization, edited by F. A. Lootsma (Academic Press, London, 1972); pp. 39–43.
- [DY99]
- Y. H. Dai and Y. Yuan. A Nonlinear Conjugate Gradient Method with a Strong Global Convergence Property. SIAM Journal on Optimization 10, 177–182 (1999).
- [Fle87]
- R. Fletcher. Practical Methods of Optimization. 2 Edition, A Wiley-Interscience Publication (John Wiley & Sons Ltd., 1987).
- [FR64]
- R. Fletcher and C. M. Reeves. Function minimization by conjugate gradients. The Computer Journal 7, 149–154 (1964).
- [HZ06]
- W. W. Hager and H. Zhang. A survey of nonlinear conjugate gradient methods. Pacific Journal of Optimization 2, 35–58 (2006).
- [HZ05]
- W. W. Hager and H. Zhang. A New Conjugate Gradient Method with Guaranteed Descent and an Efficient Line Search. SIAM Journal on Optimization 16, 170–192 (2005).
- [HS52]
- M. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards 49, 409 (1952).
- [LS91]
- Y. Liu and C. Storey. Efficient generalized conjugate gradient algorithms, part 1: Theory. Journal of Optimization Theory and Applications 69, 129–137 (1991).
- [PR69]
- E. Polak and G. Ribière. Note sur la convergence de méthodes de directions conjuguées. Revue française d’informatique et de recherche opérationnelle 3, 35–43 (1969).
- [Pow77]
- M. J. Powell. Restart procedures for the conjugate gradient method. Mathematical Programming 12, 241–254 (1977).
- [SI20]
- H. Sakai and H. Iiduka. Hybrid Riemannian conjugate gradient methods with global convergence properties. Computational Optimization and Applications 77, 811–830 (2020).
- [SI21]
- H. Sakai and H. Iiduka. Sufficient Descent Riemannian Conjugate Gradient Methods. Journal of Optimization Theory and Applications 190, 130–150 (2021).