The aim is to solve the trust-region subproblem

$$\operatorname*{arg\,min}_{η ∈ T_{x}\mathcal{M}} m_{x}(η) = F(x) + ⟨\operatorname{grad}F(x), η⟩_{x} + \frac{1}{2} ⟨\operatorname{Hess}F(x)[η], η⟩_{x}$$

$$\text{s.t.}\quad ⟨η, η⟩_{x} \leq Δ^2$$

on a manifold by using the Steihaug-Toint truncated conjugate-gradient method. All terms involving the trust-region radius use an inner product with respect to the preconditioner: the iterates grow in length with respect to this preconditioned norm, which guarantees that once an iterate has left the trust region it does not re-enter it.

## Initialization

Initialize $η_0 = η$ if using the randomized approach, and $η_0$ the zero tangent vector otherwise. Set $r_0 = \operatorname{grad}F(x)$, $z_0 = \operatorname{P}(r_0)$, $δ_0 = -z_0$ and $k=0$.

## Iteration

Repeat until a convergence criterion is reached

1. Set $κ = ⟨δ_k, \operatorname{Hess}F(x)[δ_k]⟩_x$, $α =\frac{⟨r_k, z_k⟩_x}{κ}$ and $⟨η_k, η_k⟩_{x}^* = ⟨η_k, \operatorname{P}(η_k)⟩_x + 2α ⟨η_k, \operatorname{P}(δ_k)⟩_{x} + {α}^2 ⟨δ_k, \operatorname{P}(δ_k)⟩_{x}$.
2. If $κ ≤ 0$ or $⟨η_k, η_k⟩_x^* ≥ Δ^2$, return $η_{k+1} = η_k + τ δ_k$, with $τ$ as described in the remarks below, and stop.
3. Set $η_{k}^* = η_k + α δ_k$. If $⟨\operatorname{grad}F(x), η_k⟩_{x} + \frac{1}{2} ⟨\operatorname{Hess}F(x)[η_k], η_k⟩_{x} ≤ ⟨\operatorname{grad}F(x), η_k^*⟩_{x} + \frac{1}{2} ⟨\operatorname{Hess}F(x)[η_k^*], η_k^*⟩_{x}$ set $η_{k+1} = η_k$, else set $η_{k+1} = η_{k}^*$.
4. Set $r_{k+1} = r_k + α \operatorname{Hess}F(x)[δ_k]$, $z_{k+1} = \operatorname{P}(r_{k+1})$, $β = \frac{⟨r_{k+1}, z_{k+1}⟩_{x}}{⟨r_k, z_k⟩_{x}}$ and $δ_{k+1} = -z_{k+1} + β δ_k$.
5. Set $k=k+1$.

## Result

The result is given by the last computed $η_k$.
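The iteration above can be sketched in code. The following is a minimal, self-contained illustration specialized to the Euclidean case ($\mathcal M = ℝ^n$, identity preconditioner). All names (`tcg`, `hess_mv`, `axpy`) are ours and not part of Manopt's API, and the model-decrease check of step 3 is omitted for brevity.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return y + alpha * x, elementwise."""
    return [yi + alpha * xi for xi, yi in zip(x, y)]

def tcg(grad, hess_mv, radius, max_iter=100, tol=1e-10):
    """Minimize <grad, eta> + 0.5 <hess_mv(eta), eta> s.t. ||eta|| <= radius."""
    n = len(grad)
    eta = [0.0] * n              # eta_0 = 0 (non-randomized start)
    r = list(grad)               # r_0 = grad F(x)
    z = list(r)                  # z_0 = P(r_0); P = identity here
    delta = [-zi for zi in z]    # delta_0 = -z_0
    for _ in range(max_iter):
        if math.sqrt(dot(r, r)) < tol:
            return eta
        kappa = dot(delta, hess_mv(delta))
        alpha = dot(r, z) / kappa if kappa > 0 else 0.0
        # squared length of the candidate eta + alpha * delta (step 1)
        new_sq = dot(eta, eta) + 2 * alpha * dot(eta, delta) \
            + alpha**2 * dot(delta, delta)
        if kappa <= 0 or new_sq >= radius**2:
            # step 2: move to the boundary, tau is the positive root of
            # ||eta + tau * delta|| = radius
            ed, dd, ee = dot(eta, delta), dot(delta, delta), dot(eta, eta)
            tau = (-ed + math.sqrt(ed**2 + dd * (radius**2 - ee))) / dd
            return axpy(tau, delta, eta)
        eta = axpy(alpha, delta, eta)            # step 3 (decrease check omitted)
        r_new = axpy(alpha, hess_mv(delta), r)   # step 4
        z_new = list(r_new)                      # preconditioner application
        beta = dot(r_new, z_new) / dot(r, z)
        r, z = r_new, z_new
        delta = [-zi + beta * di for zi, di in zip(z_new, delta)]
    return eta
```

For a quadratic with identity Hessian and a large radius this returns the unconstrained Newton step $-\operatorname{grad}F(x)$; with a small radius it stops on the trust-region boundary.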

## Remarks

Here $\operatorname{P}(⋅)$ denotes the symmetric, positive definite preconditioner. It is required if a randomized approach is used, i.e. a random tangent vector $η_0$ as the initial vector; the idea behind this randomization is to avoid saddle points. Preconditioning is simply a rescaling of the variables and thus a redefinition of the shape of the trust region. Ideally, $\operatorname{P}(⋅)$ is a cheap, positive definite approximation of the inverse of the Hessian of $F$ at $x$. By default, the preconditioner is the identity.
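As a toy illustration (names ours, not Manopt's): for a Hessian that is approximately diagonal, a cheap preconditioner divides each residual component by the corresponding diagonal entry, which equalizes the convergence rates of the coordinates.

```python
def make_diagonal_preconditioner(hess_diag):
    """P(r) approximates Hess^{-1} r for a (near-)diagonal Hessian."""
    return lambda r: [ri / di for ri, di in zip(r, hess_diag)]

# Hessian diagonal [4.0, 0.25]: unpreconditioned CG treats the two
# coordinates very differently; P rescales both to comparable speed.
P = make_diagonal_preconditioner([4.0, 0.25])
z = P([8.0, 1.0])  # -> [2.0, 4.0]
```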

Regarding step 2: obtain $τ$ as the positive root of $\left\lVert η_k + τ δ_k \right\rVert_{\operatorname{P}, x} = Δ$, which after expanding the norm becomes

$$τ = \frac{-⟨η_k, \operatorname{P}(δ_k)⟩_{x} + \sqrt{⟨η_k, \operatorname{P}(δ_k)⟩_{x}^{2} + ⟨δ_k, \operatorname{P}(δ_k)⟩_{x} \bigl( Δ^2 - ⟨η_k, \operatorname{P}(η_k)⟩_{x}\bigr)}}{⟨δ_k, \operatorname{P}(δ_k)⟩_{x}}.$$
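Written out for the Euclidean case (a sketch with our own names, not Manopt's; `P` is the preconditioner applied as a function), the formula becomes:

```python
import math

def boundary_tau(eta, delta, P, radius):
    """Positive root tau of ||eta + tau * delta||_P = radius."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    ed = dot(eta, P(delta))      # <eta, P(delta)>
    dd = dot(delta, P(delta))    # <delta, P(delta)>
    ee = dot(eta, P(eta))        # <eta, P(eta)>
    return (-ed + math.sqrt(ed**2 + dd * (radius**2 - ee))) / dd
```

With the identity preconditioner, `boundary_tau([0.6, 0.0], [1.0, 0.0], lambda v: v, 1.0)` gives `0.4`, and indeed $(0.6, 0) + 0.4 \cdot (1, 0)$ has norm exactly $Δ = 1$.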

It can occur that $⟨δ_k, \operatorname{Hess}F(x)[δ_k]⟩_{x} = κ ≤ 0$ at iteration $k$. In this case, the model is not strictly convex, and the stepsize $α = \frac{⟨r_k, z_k⟩_{x}}{κ}$ computed in step 1 does not give a reduction in the model function $m_x(⋅)$. Indeed, $m_x(⋅)$ is unbounded from below along the line $η_k + α δ_k$. If the aim is to minimize the model within the trust region, it makes far more sense to reduce $m_x(⋅)$ along $η_k + α δ_k$ as much as possible while staying within the trust region, and this means moving to the trust-region boundary along this line. Thus, when $κ ≤ 0$ at iteration $k$, we replace $α = \frac{⟨r_k, z_k⟩_{x}}{κ}$ with the $τ$ described above. The other possibility is that $η_{k+1}$ would lie outside the trust region at iteration $k$, i.e. $⟨η_k, η_k⟩_{x}^{*} ≥ Δ^2$, where $⟨η_k, η_k⟩_{x}^{*}$ is the squared preconditioned norm of $η_{k+1}$. In particular, when $\operatorname{Hess}F(x)[⋅]$ is positive definite and $η_{k+1}$ lies outside the trust region, the solution of the trust-region subproblem must lie on the trust-region boundary. Thus, there is no reason to continue with the conjugate-gradient iteration as it stands, since subsequent iterates would move further outside the trust-region boundary. A sensible strategy, just as in the case considered above, is to move to the trust-region boundary by finding $τ$.

## Interface

Manopt.truncated_conjugate_gradient_descent — Function
truncated_conjugate_gradient_descent(M, F, gradF, x, η, HessF, trust_region_radius)

solve the trust-region subproblem

$$\operatorname*{arg\,min}_{η ∈ T_x\mathcal M} m_x(η) \quad\text{where}\quad m_x(η) = F(x) + ⟨\operatorname{grad}F(x),η⟩_x + \frac{1}{2}⟨\operatorname{Hess}F(x)[η],η⟩_x,$$

$$\text{such that}\quad ⟨η,η⟩_x ≤ Δ^2$$

with the truncated_conjugate_gradient_descent. For a description of the algorithm and theorems offering convergence guarantees, see the reference.

Input

• M – a manifold $\mathcal M$
• F – a cost function $F: \mathcal M → ℝ$ to minimize
• gradF – the gradient $\operatorname{grad}F: \mathcal M → T\mathcal M$ of F
• x – a point on the manifold $x ∈ \mathcal M$
• η – an update tangential vector $η ∈ T_x\mathcal M$
• HessF – the Hessian $\operatorname{Hess}F(x): T_x\mathcal M → T_x\mathcal M$, $X ↦ \operatorname{Hess}F(x)[X] = ∇_X\operatorname{grad}F(x)$

Optional

• evaluation – (AllocatingEvaluation) specify whether the gradient and Hessian work by allocation (default) or in place (MutatingEvaluation)
• preconditioner – a preconditioner for the hessian H
• θ – (1.0) $1+θ$ is the superlinear convergence target rate. The algorithm terminates early if the residual norm is reduced by a power of $1+θ$.
• κ – (0.1) the linear convergence target rate: the algorithm terminates early if the residual norm is reduced by a factor of $κ$.
• randomize – set to true if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.
• trust_region_radius – (injectivity_radius(M)/4) a trust-region radius
• project_vector! – (copyto!) specify a projection operation for tangent vectors for numerical stability: a function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.
• stopping_criterion – (StopWhenAny, StopAfterIteration, StopIfResidualIsReducedByFactor, StopIfResidualIsReducedByPower, StopWhenCurvatureIsNegative, StopWhenTrustRegionIsExceeded) a functor inheriting from StoppingCriterion indicating when to stop, where by default the maximal number of iterations is set to the dimension of the manifold, the power factor is θ and the reduction factor is κ.
• return_options – (false) if activated, the extended result, i.e. the complete Options, is returned. This can be used to access recorded values. If set to false (default), just the optimal value x_opt is returned.

and the ones that are passed to decorate_options for decorators.

Output

• η – an approximate solution of the trust-region subproblem in $T_{x}\mathcal M$.

OR

• options - the options returned by the solver (see return_options)

trust_regions

source
Manopt.truncated_conjugate_gradient_descent! — Function
truncated_conjugate_gradient_descent!(M, F, gradF, x, η, HessF, trust_region_radius; kwargs...)

solve the trust-region subproblem in place of η.

Input

• M – a manifold $\mathcal M$
• F – a cost function $F: \mathcal M → ℝ$ to minimize
• gradF – the gradient $\operatorname{grad}F: \mathcal M → T\mathcal M$ of F
• HessF – the Hessian $\operatorname{Hess}F(x): T_x\mathcal M → T_x\mathcal M$, $X ↦ \operatorname{Hess}F(x)[X] = ∇_X\operatorname{grad}F(x)$
• x – a point on the manifold $x ∈ \mathcal M$
• η – an update tangential vector $η ∈ T_x\mathcal M$

For more details and all optional arguments, see truncated_conjugate_gradient_descent.

source

## Options

Manopt.TruncatedConjugateGradientOptions — Type
TruncatedConjugateGradientOptions <: AbstractHessianOptions

describe the Steihaug-Toint truncated conjugate-gradient method, with the following fields.

Fields

A default value is given in brackets if a parameter can be left out in initialization.

• x : a point, where the trust-region subproblem needs to be solved
• η : a tangent vector (called update vector), which solves the trust-region subproblem after successful calculation by the algorithm
• stop : a StoppingCriterion.
• gradient : the gradient at the current iterate
• δ : search direction
• trust_region_radius : (injectivity_radius(M)/4) the trust-region radius
• residual : the current residual $r_k$
• randomize : indicates if the trust-region solve and so the algorithm is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.
• project_vector! : (copyto!) specify a projection operation for tangent vectors for numerical stability: a function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.

Constructor

TruncatedConjugateGradientOptions(M, p, x, η;
randomize=false,
θ=1.0,
κ=0.1,
project_vector! = copyto!,
)

and a slightly more involved default stopping_criterion.

source

Manopt.StopIfResidualIsReducedByPower — Type
StopIfResidualIsReducedByPower <: StoppingCriterion

A functor for testing if the norm of the residual at the current iterate is reduced by a power of $1+\theta$ compared to the norm of the initial residual, i.e. $\Vert r_k \Vert_x \leq \Vert r_0 \Vert_{x}^{1+\theta}$. In this case the algorithm has reached superlinear convergence.

Fields

Constructor

StopIfResidualIsReducedByPower(iRN, θ)

initialize the StopIfResidualIsReducedByPower functor to indicate to stop after the norm of the current residual is less than the norm of the initial residual iRN to the power of $1+θ$.

source
Manopt.StopIfResidualIsReducedByFactor — Type
StopIfResidualIsReducedByFactor <: StoppingCriterion

A functor for testing if the norm of the residual at the current iterate is reduced by a factor compared to the norm of the initial residual, i.e. $\Vert r_k \Vert_x \leq κ \Vert r_0 \Vert_x$. In this case the algorithm has reached linear convergence.

Fields

Constructor

StopIfResidualIsReducedByFactor(iRN, κ)

initialize the StopIfResidualIsReducedByFactor functor to indicate to stop after the norm of the current residual is less than the norm of the initial residual iRN times κ.

source
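The two residual-based criteria above reduce to simple predicates on the residual norms. A sketch with illustrative names (plain functions, not the Manopt types):

```python
def reduced_by_factor(r_norm, r0_norm, kappa=0.1):
    """Linear convergence target: ||r_k|| <= kappa * ||r_0||."""
    return r_norm <= kappa * r0_norm

def reduced_by_power(r_norm, r0_norm, theta=1.0):
    """Superlinear convergence target: ||r_k|| <= ||r_0||^(1 + theta)."""
    return r_norm <= r0_norm ** (1 + theta)
```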
Manopt.StopWhenTrustRegionIsExceeded — Type
StopWhenTrustRegionIsExceeded <: StoppingCriterion

A functor for testing if the norm of the next iterate in the Steihaug-Toint tCG method is larger than the trust-region radius, i.e. $\Vert η_{k}^{*} \Vert_x ≥ Δ$, and to terminate the algorithm when the trust region has been left.

Fields

• reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.
• storage – stores the necessary parameters η, δ, residual to check the criterion.

Constructor

StopWhenTrustRegionIsExceeded([a])

initialize the StopWhenTrustRegionIsExceeded functor to indicate to stop after the norm of the next iterate is greater than the trust-region radius using the StoreOptionsAction a, which is initialized to store :η, :δ, :residual by default.

source
Manopt.StopWhenCurvatureIsNegative — Type
StopWhenCurvatureIsNegative <: StoppingCriterion

A functor for testing if the curvature of the model is negative, i.e. $\langle δ_k, \operatorname{Hess}F(x)[δ_k]\rangle_x \leq 0$. In this case, the model is not strictly convex, and the stepsize as computed does not give a reduction of the model.

Fields

• reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.

Constructor

StopWhenCurvatureIsNegative()

source
Manopt.StopWhenModelIncreased — Type
StopWhenModelIncreased <: StoppingCriterion

A functor for testing if the model value increased.

Fields

• reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.

Constructor

StopWhenModelIncreased()