Steihaug-Toint truncated conjugate gradient method
Solve the constrained optimization problem on the tangent space
\[\begin{align*} \operatorname*{arg\,min}_{Y ∈ T_p\mathcal{M}}&\ m_p(Y) = f(p) + ⟨\operatorname{grad}f(p), Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ \text{such that}& \ \lVert Y \rVert_p ≤ Δ \end{align*}\]
on the tangent space $T_p\mathcal M$ of a Riemannian manifold $\mathcal M$ by using the Steihaug-Toint truncated conjugate-gradient (tCG) method, see [ABG06], Algorithm 2, and [CGT00]. Here $\mathcal H_p$ is either the Hessian $\operatorname{Hess} f(p)$ or a linear symmetric operator on the tangent space approximating the Hessian.
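As a quick illustration, the subproblem can be solved for a Rayleigh-quotient cost on the sphere. This is a hedged sketch assuming Manopt.jl and Manifolds.jl are installed; the cost, gradient, and Hessian below are example choices, not part of this page.

```julia
# Sketch: solve the trust-region subproblem for f(p) = p' * A * p on S².
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = Diagonal([3.0, 2.0, 1.0])
f(M, p) = p' * A * p
# Riemannian gradient: projection of the Euclidean gradient onto T_pS²
grad_f(M, p) = 2 .* (A * p .- (p' * A * p) .* p)
# Riemannian Hessian on the sphere (projected Euclidean Hessian plus curvature term)
Hess_f(M, p, X) = 2 .* (A * X .- (p' * A * X) .* p .- (p' * A * p) .* X)

p = [1.0, 1.0, 1.0] / sqrt(3)
Y = truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p;
    trust_region_radius=1.0)
# Y is a tangent vector at p that stays within the trust region
```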
Interface
Manopt.truncated_conjugate_gradient_descent
— Function
truncated_conjugate_gradient_descent(M, f, grad_f, p; kwargs...)
truncated_conjugate_gradient_descent(M, f, grad_f, p, X; kwargs...)
truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f; kwargs...)
truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p; kwargs...)
truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p, X; kwargs...)
truncated_conjugate_gradient_descent(M, mho::ManifoldHessianObjective, p, X; kwargs...)
truncated_conjugate_gradient_descent(M, trmo::TrustRegionModelObjective, p, X; kwargs...)
solve the trust-region subproblem
\[\begin{align*} \operatorname*{arg\,min}_{Y ∈ T_p\mathcal{M}}&\ m_p(Y) = f(p) + ⟨\operatorname{grad}f(p), Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ \text{such that}& \ \lVert Y \rVert_p ≤ Δ \end{align*}\]
on a manifold M by using the Steihaug-Toint truncated conjugate-gradient (tCG) method. For a description of the algorithm and theorems offering convergence guarantees, see [ABG06] and [CGT00].
Input

See the signatures above: you can leave out the Hessian, the initial tangent vector, both the point and the vector, or all three.

M – a manifold $\mathcal M$
f – a cost function $f: \mathcal M → ℝ$ to minimize
grad_f – the gradient $\operatorname{grad}f: \mathcal M → T\mathcal M$ of f
Hess_f – (optional, cf. ApproxHessianFiniteDifference) the Hessian $\operatorname{Hess}f: T_p\mathcal M → T_p\mathcal M$, $X ↦ \operatorname{Hess}f(p)[X] = ∇_X\operatorname{grad}f(p)$
p – a point on the manifold $p ∈ \mathcal M$
X – an initial tangent vector $X ∈ T_p\mathcal M$

Instead of the three functions, you can either provide a ManifoldHessianObjective mho, which is then used to build the trust region model, or a TrustRegionModelObjective trmo directly.
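The objective-based call might look as follows. This is a sketch assuming Manopt.jl and Manifolds.jl; the concrete cost, gradient, and Hessian are illustrative examples.

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = Diagonal([3.0, 2.0, 1.0])
f(M, p) = p' * A * p
grad_f(M, p) = 2 .* (A * p .- (p' * A * p) .* p)
Hess_f(M, p, X) = 2 .* (A * X .- (p' * A * X) .* p .- (p' * A * p) .* X)

# Bundle the three functions into one objective and pass it to the solver
mho = ManifoldHessianObjective(f, grad_f, Hess_f)
p = [1.0, 1.0, 1.0] / sqrt(3)
X = zero_vector(M, p)
Y = truncated_conjugate_gradient_descent(M, mho, p, X)
```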
Optional

evaluation – (AllocatingEvaluation) specify whether the gradient and Hessian work by allocation (default) or in place (InplaceEvaluation)
preconditioner – a preconditioner for the Hessian H
θ – (1.0) 1+θ is the superlinear convergence target rate
κ – (0.1) the linear convergence target rate
randomize – set to true if the trust-region solve is initialized to a random tangent vector. This disables preconditioning.
trust_region_radius – (injectivity_radius(M)/4) a trust-region radius
project! – (copyto!) for numerical stability it is possible to project onto the tangent space after every iteration. By default this only copies instead.
stopping_criterion – (StopAfterIteration(manifold_dimension(M)) | StopWhenResidualIsReducedByFactorOrPower(; κ=κ, θ=θ) | StopWhenCurvatureIsNegative() | StopWhenTrustRegionIsExceeded() | StopWhenModelIncreased()) a functor inheriting from StoppingCriterion indicating when to stop,

and the ones that are passed to decorate_state! for decorators.
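Putting several of these keywords together might look like the following sketch; the values are illustrative, not recommended defaults, and the cost functions are example choices.

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = Diagonal([3.0, 2.0, 1.0])
f(M, p) = p' * A * p
grad_f(M, p) = 2 .* (A * p .- (p' * A * p) .* p)
Hess_f(M, p, X) = 2 .* (A * X .- (p' * A * X) .* p .- (p' * A * p) .* X)
p = [1.0, 1.0, 1.0] / sqrt(3)

Y = truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p;
    θ=1.0,                   # target superlinear rate 1+θ
    κ=0.01,                  # tighter linear convergence target
    trust_region_radius=0.5,
    stopping_criterion=StopAfterIteration(manifold_dimension(M)) |
        StopWhenCurvatureIsNegative() | StopWhenTrustRegionIsExceeded())
```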
Output
the obtained (approximate) minimizer $Y^*$, see get_solver_return for details
Manopt.truncated_conjugate_gradient_descent!
— Function
truncated_conjugate_gradient_descent!(M, f, grad_f, Hess_f, p, X; kwargs...)
truncated_conjugate_gradient_descent!(M, f, grad_f, p, X; kwargs...)
solve the trust-region subproblem in place of X (and p).
Input

M – a manifold $\mathcal M$
f – a cost function $f: \mathcal M → ℝ$ to minimize
grad_f – the gradient $\operatorname{grad}f: \mathcal M → T\mathcal M$ of f
Hess_f – the Hessian $\operatorname{Hess}f(p): T_p\mathcal M → T_p\mathcal M$, $X ↦ \operatorname{Hess}f(p)[X]$
p – a point on the manifold $p ∈ \mathcal M$
X – an update tangent vector $X ∈ T_p\mathcal M$
For more details and all optional arguments, see truncated_conjugate_gradient_descent.
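The in-place variant overwrites X with the result, which avoids allocations in repeated calls. A sketch, with illustrative cost functions as above:

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = Diagonal([3.0, 2.0, 1.0])
f(M, p) = p' * A * p
grad_f(M, p) = 2 .* (A * p .- (p' * A * p) .* p)
Hess_f(M, p, X) = 2 .* (A * X .- (p' * A * X) .* p .- (p' * A * p) .* X)

p = [1.0, 1.0, 1.0] / sqrt(3)
X = zero_vector(M, p)
# X is overwritten with the (approximate) minimizer of the model
truncated_conjugate_gradient_descent!(M, f, grad_f, Hess_f, p, X;
    trust_region_radius=1.0)
```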
State
Manopt.TruncatedConjugateGradientState
— Type
TruncatedConjugateGradientState <: AbstractHessianSolverState
describes the Steihaug-Toint truncated conjugate-gradient method, with
Fields

A default value is given in brackets if a parameter can be left out in initialization.

p – (rand(M)) a point, where we use the tangent space to solve the trust-region subproblem
Y – (zero_vector(M,p)) current iterate, whose type is also used for the other, internal, tangent vector fields
stop – a StoppingCriterion
X – the gradient $\operatorname{grad}f(p)$
δ – the conjugate gradient search direction
θ – (1.0) 1+θ is the superlinear convergence target rate
κ – (0.1) the linear convergence target rate
trust_region_radius – (injectivity_radius(M)/4) the trust-region radius
residual – the gradient of the model $m(Y)$
randomize – (false)
project! – (copyto!) for numerical stability it is possible to project onto the tangent space after every iteration. By default this only copies instead.

Internal (temporary) Fields

Hδ, HY – temporary results of the Hessian applied to δ and Y, respectively
δHδ, YPδ, δPδ, YPY – temporary inner products with Hδ and preconditioned inner products
z – the preconditioned residual
z_r – inner product of the residual and z
Constructor
TruncatedConjugateGradientState(M, p=rand(M), Y=zero_vector(M,p); kwargs...)
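Constructing the state directly might look like the following sketch; the keyword names are taken from the fields listed above and are assumptions about the constructor's accepted keywords.

```julia
using Manopt, Manifolds

M = Sphere(2)
p = rand(M)
# Build a state for solving the subproblem in T_pM with radius 1.0
tcgs = TruncatedConjugateGradientState(M, p, zero_vector(M, p);
    trust_region_radius=1.0, randomize=false)
```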
Stopping criteria
Manopt.StopWhenResidualIsReducedByFactorOrPower
— Type
StopWhenResidualIsReducedByFactorOrPower <: StoppingCriterion
A functor for testing if the norm of residual at the current iterate is reduced either by a power of 1+θ or by a factor κ compared to the norm of the initial residual, i.e. $\Vert r_k \Vert_x \leqq \Vert r_0 \Vert_{x} \ \min \left( \kappa, \Vert r_0 \Vert_{x}^{\theta} \right)$.
Fields
κ – the reduction factor
θ – part of the reduction power
reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.
Constructor
StopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0)
Initialize the StopWhenResidualIsReducedByFactorOrPower functor to indicate to stop after the norm of the current residual is lesser than either the norm of the initial residual to the power of 1+θ or the norm of the initial residual times κ.
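A small sketch of constructing this criterion and combining it with an iteration cap, as in the solver's default stopping criterion; the concrete values are illustrative.

```julia
using Manopt

# Stop once the residual norm drops below ‖r₀‖ · min(κ, ‖r₀‖^θ)
sc = StopWhenResidualIsReducedByFactorOrPower(; κ=0.01, θ=2.0)
# Criteria can be combined with | into a single stopping criterion
stop = StopAfterIteration(100) | sc
```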
Manopt.StopWhenTrustRegionIsExceeded
— Type
StopWhenTrustRegionIsExceeded <: StoppingCriterion
A functor for testing if the norm of the next iterate in the Steihaug-Toint tCG method is larger than the trust-region radius, i.e. $\Vert Y_{k}^{*} \Vert_x ≥ \text{trust\_region\_radius}$, and to terminate the algorithm when the trust region has been left.
Fields
reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.
Constructor
StopWhenTrustRegionIsExceeded()
initialize the StopWhenTrustRegionIsExceeded functor to indicate to stop after the norm of the next iterate is greater than the trust-region radius.
Manopt.StopWhenCurvatureIsNegative
— Type
StopWhenCurvatureIsNegative <: StoppingCriterion
A functor for testing if the curvature of the model is negative, i.e. $\langle \delta_k, \operatorname{Hess} f(p)[\delta_k]\rangle_p \leq 0$. In this case, the model is not strictly convex, and the stepsize as computed does not give a reduction of the model.
Fields
reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.
Constructor
StopWhenCurvatureIsNegative()
Manopt.StopWhenModelIncreased
— Type
StopWhenModelIncreased <: StoppingCriterion
A functor for testing if the model value increased.
Fields
reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.
Constructor
StopWhenModelIncreased()
Manopt.update_stopping_criterion!
— Method
update_stopping_criterion!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualPower, v)
Update the residual power θ to v.
Manopt.update_stopping_criterion!
— Method
update_stopping_criterion!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualFactor, v)
Update the residual factor κ to v.
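Both parameters can be adjusted after construction, for example to tighten the targets between solver runs. A sketch; the field names θ and κ follow the Fields section of the criterion above.

```julia
using Manopt

sc = StopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0)
# Tighten both convergence targets in place
update_stopping_criterion!(sc, :ResidualPower, 0.5)
update_stopping_criterion!(sc, :ResidualFactor, 0.05)
```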
Trust region model
Manopt.TrustRegionModelObjective
— Type
TrustRegionModelObjective{O<:AbstractManifoldHessianObjective} <: AbstractManifoldSubObjective{O}
A trust region model of the form
\[ m(X) = f(p) + ⟨\operatorname{grad} f(p), X⟩_p + \frac{1}{2} ⟨\operatorname{Hess} f(p)[X], X⟩_p\]
Fields
objective – an AbstractManifoldHessianObjective providing $f$, its gradient and Hessian
Constructors
TrustRegionModelObjective(objective)
with either an AbstractManifoldHessianObjective objective or a decorator containing such an objective
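Wrapping a Hessian objective into the model and passing it to the solver might look like this sketch, with illustrative cost functions:

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = Diagonal([3.0, 2.0, 1.0])
f(M, p) = p' * A * p
grad_f(M, p) = 2 .* (A * p .- (p' * A * p) .* p)
Hess_f(M, p, X) = 2 .* (A * X .- (p' * A * X) .* p .- (p' * A * p) .* X)

mho = ManifoldHessianObjective(f, grad_f, Hess_f)
# Build the trust-region model m(X) from the Hessian objective
trmo = TrustRegionModelObjective(mho)
p = [1.0, 1.0, 1.0] / sqrt(3)
Y = truncated_conjugate_gradient_descent(M, trmo, p, zero_vector(M, p))
```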
Technical details

The trust_regions solver requires the following functions of a manifold to be available:

- if you do not provide a trust_region_radius=, then injectivity_radius on the manifold M is required.
- the norm as well, to stop when the norm of the gradient is small; but if you implemented inner, the norm is provided already.
- a zero_vector!(M, X, p).
- a copyto!(M, q, p) and copy(M, p) for points.
Literature
- [ABG06] P.-A. Absil, C. Baker and K. Gallivan. Trust-Region Methods on Riemannian Manifolds. Foundations of Computational Mathematics 7, 303–330 (2006).
- [CGT00] A. R. Conn, N. I. Gould and P. L. Toint. Trust Region Methods (Society for Industrial and Applied Mathematics, 2000).