# The Riemannian trust regions solver

Minimize a function

$$$\operatorname*{\arg\,min}_{p ∈ \mathcal{M}}\ f(p)$$$

by using the Riemannian trust-regions solver following [ABG06] a model is build by lifting the objective at the $k$th iterate $p_k$ by locally mapping the cost function $f$ to the tangent space as $f_k: T_{p_k}\mathcal M → ℝ$ as $f_k(X) = f(\operatorname{retr}_{p_k}(X))$. The trust region subproblem is then defined as

$$$\operatorname*{arg\,min}_{X ∈ T_{p_k}\mathcal M}\ m_k(X),$$$

where

\begin{align*} m_k&: T_{p_K}\mathcal M → ℝ,\\ m_k(X) &= f(p_k) + ⟨\operatorname{grad} f(p_k), X⟩_{p_k} + \frac{1}{2}\langle \mathcal H_k(X),X⟩_{p_k}\\ \text{such that}&\ \lVert X \rVert_{p_k} ≤ Δ_k. \end{align*}

Here $Δ_k$ is a trust region radius, that is adapted every iteration, and $\mathcal H_k$ is some symmetric linear operator that approximates the Hessian $\operatorname{Hess} f$ of $f$.

## Interface

Manopt.trust_regionsFunction
trust_regions(M, f, grad_f, hess_f, p=rand(M))
trust_regions(M, f, grad_f, p=rand(M))

run the Riemannian trust-regions solver for optimization on manifolds to minimize f, see on [ABG06, CGT00].

For the case that no Hessian is provided, the Hessian is computed using finite differences, see ApproxHessianFiniteDifference. For solving the inner trust-region subproblem of finding an update-vector, by default the truncated_conjugate_gradient_descent is used.

Input

• M: a manifold $\mathcal M$
• f: a cost function $f : \mathcal M → ℝ$ to minimize
• grad_f: the gradient $\operatorname{grad}F : \mathcal M → T \mathcal M$ of $F$
• Hess_f: (optional), the Hessian $\operatorname{Hess}F(x): T_x\mathcal M → T_x\mathcal M$, $X ↦ \operatorname{Hess}F(x)[X] = ∇_ξ\operatorname{grad}f(x)$
• p: (rand(M)) an initial value $x ∈ \mathcal M$

Keyword arguments

• acceptance_rate: Accept/reject threshold: if ρ (the performance ratio for the iterate) is at least the acceptance rate ρ', the candidate is accepted. This value should be between $0$ and $\frac{1}{4}$
• augmentation_threshold: (0.75) trust-region augmentation threshold: if ρ is larger than this threshold, a solution is on the trust region boundary and negative curvature, and the radius is extended (augmented)
• augmentation_factor: (2.0) trust-region augmentation factor
• evaluation: (AllocatingEvaluation) specify whether the gradient and Hessian work by allocation (default) or InplaceEvaluation in place
• κ: (0.1) the linear convergence target rate of the tCG method truncated_conjugate_gradient_descent, and is used in a stopping criterion therein
• max_trust_region_radius: the maximum trust-region radius
• preconditioner: a preconditioner (a symmetric, positive definite operator that should approximate the inverse of the Hessian)
• project!; (copyto!) specify a projection operation for tangent vectors within the subsolver for numerical stability. The required form is (M, Y, p, X) -> ... working in place of Y.
• randomize; set to true if the trust-region solve is to be initiated with a random tangent vector and no preconditioner is used.
• ρ_regularization: (1e3) regularize the performance evaluation $ρ$ to avoid numerical inaccuracies.
• reduction_factor: (0.25) trust-region reduction factor
• reduction_threshold: (0.1) trust-region reduction threshold: if ρ is below this threshold, the trust region radius is reduced by reduction_factor.
• retraction (default_retraction_method(M, typeof(p))) a retraction to use
• stopping_criterion: (StopAfterIteration(1000) |StopWhenGradientNormLess(1e-6)) a functor inheriting from StoppingCriterion indicating when to stop.
• sub_kwargs: keyword arguments passed to the sub state and used to decorate the sub options
• sub_stopping_criterion: a stopping criterion for the sub solver, uses the same standard as TCG.
• sub_problem: (DefaultManoptProblem(M,ConstrainedManifoldObjective(subcost, subgrad; evaluation=evaluation))) problem for the subsolver
• sub_state: (QuasiNewtonState) using QuasiNewtonLimitedMemoryDirectionUpdate with InverseBFGS and sub_stopping_criterion as a stopping criterion. See also sub_kwargs.
• θ: (1.0) 1+θ is the superlinear convergence target rate of the tCG-method truncated_conjugate_gradient_descent, and is used in a stopping criterion therein
• trust_region_radius: the initial trust-region radius

For the case that no Hessian is provided, the Hessian is computed using finite difference, see ApproxHessianFiniteDifference.

Output

the obtained (approximate) minimizer $p^*$, see get_solver_return for details

truncated_conjugate_gradient_descent

source
Manopt.trust_regions!Function
trust_regions!(M, f, grad_f, Hess_f, p; kwargs...)
trust_regions!(M, f, grad_f, p; kwargs...)

evaluate the Riemannian trust-regions solver in place of p.

Input

• M: a manifold $\mathcal M$
• f: a cost function $f: \mathcal M → ℝ$ to minimize
• grad_f: the gradient $\operatorname{grad}f: \mathcal M → T \mathcal M$ of $F$
• Hess_f: (optional) the Hessian $\operatorname{Hess} f$
• p: an initial value $p ∈ \mathcal M$

For the case that no Hessian is provided, the Hessian is computed using finite difference, see ApproxHessianFiniteDifference.

for more details and all options, see trust_regions

source

## State

Manopt.TrustRegionsStateType
TrustRegionsState <: AbstractHessianSolverState

Store the state of the trust-regions solver.

Fields

All the following fields (besides p) can be set by specifying them as keywords.

• acceptance_rate: (0.1) a lower bound of the performance ratio for the iterate that decides if the iteration is accepted or not.
• max_trust_region_radius: (sqrt(manifold_dimension(M))) the maximum trust-region radius
• p: (rand(M) if a manifold is provided) the current iterate
• project!: (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. per default, no projection is performed, set it to project! to activate projection.
• stop: (StopAfterIteration(1000) |StopWhenGradientNormLess(1e-6))
• randomize: (false) indicates if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner is used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.
• ρ_regularization: (10000.0) regularize the model fitness $ρ$ to avoid division by zero
• sub_problem: an AbstractManoptProblem problem or a function (M, p, X) -> q or (M, q, p, X) for the a closed form solution of the sub problem
• sub_state: (TruncatedConjugateGradientState(M, p, X))
• σ: (0.0 or 1e-6 depending on randomize) Gaussian standard deviation when creating the random initial tangent vector
• trust_region_radius: (max_trust_region_radius / 8) the (initial) trust-region radius
• X: (zero_vector(M,p)) the current gradient grad_f(p) Use this default to specify the type of tangent vector to allocate also for the internal (tangent vector) fields.

Internal fields

• HX, HY, HZ: interim storage (to avoid allocation) of \operatorname{Hess} f(p)[\cdot] of X, Y, Z
• Y: the solution (tangent vector) of the subsolver
• Z: the Cauchy point (only used if random is activated)

Constructors

All the following constructors have the fields as keyword arguments with the defaults given in brackets. If no initial point p is provided, p=rand(M) is used

TrustRegionsState(M, mho; kwargs...)
TrustRegionsState(M, p, mho; kwargs...)

A trust region state, where the sub problem is set to a DefaultManoptProblem on the tangent space using the TrustRegionModelObjective to be solved with truncated_conjugate_gradient_descent! or in other words the sub state is set to TruncatedConjugateGradientState.

TrustRegionsState(M, sub_problem, sub_state; kwargs...)
TrustRegionsState(M, p, sub_problem, sub_state; kwargs...)

A trust region state, where the sub problem is solved using a AbstractManoptProblem sub_problem and an AbstractManoptSolverState sub_state.

TrustRegionsState(M, f::Function; evaluation=AllocatingEvaluation, kwargs...)
TrustRegionsState(M, p, f; evaluation=AllocatingEvaluation, kwargs...)

A trust region state, where the sub problem is solved in closed form by a function f(M, p, Y, Δ), where p is the current iterate, Y the initial tangent vector at p and Δ the current trust region radius.

source

## Approximation of the Hessian

Several different methods to approximate the Hessian are available.

Manopt.ApproxHessianFiniteDifferenceType
ApproxHessianFiniteDifference{E, P, T, G, RTR,, VTR, R <: Real} <: AbstractApproxHessian

A functor to approximate the Hessian by a finite difference of gradient evaluation.

Given a point p and a direction X and the gradient $\operatorname{grad}F: \mathcal M → T\mathcal M$ of a function $F$ the Hessian is approximated as follows: let $c$ be a stepsize, $X∈ T_p\mathcal M$ a tangent vector and $q = \operatorname{retr}_p(\frac{c}{\lVert X \rVert_p}X)$ be a step in direction $X$ of length $c$ following a retraction Then the Hessian is approximated by the finite difference of the gradients, where $\mathcal T_{\cdot\gets\cdot}$ is a vector transport.

$$$\operatorname{Hess}F(p)[X] ≈ \frac{\lVert X \rVert_p}{c}\Bigl( \mathcal T_{p\gets q}\bigr(\operatorname{grad}F(q)\bigl) - \operatorname{grad}F(p) \Bigl)$$$

Fields

• gradient!!: the gradient function (either allocating or mutating, see evaluation parameter)
• step_length: a step length for the finite difference
• retraction_method: a retraction to use
• vector_transport_method: a vector transport to use

Internal temporary fields

• grad_tmp: a temporary storage for the gradient at the current p
• grad_dir_tmp: a temporary storage for the gradient at the current p_dir
• p_dir::P: a temporary storage to the forward direction (or the $q$ in the formula)

Constructor

ApproximateFiniteDifference(M, p, grad_f; kwargs...)

Keyword arguments

• evaluation: (AllocatingEvaluation) whether the gradient is given as an allocation function or an in-place (InplaceEvaluation).
• steplength: ($2^{-14}$) step length $c$ to approximate the gradient evaluations
• retraction_method: (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use in the approximation.
• vector_transport_method: (default_vector_transport_method(M, typeof(p))) a vector transport to use
source
Manopt.ApproxHessianSymmetricRankOneType
ApproxHessianSymmetricRankOne{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian

A functor to approximate the Hessian by the symmetric rank one update.

Fields

• gradient!! the gradient function (either allocating or mutating, see evaluation parameter).
• ν a small real number to ensure that the denominator in the update does not become too small and thus the method does not break down.
• vector_transport_method a vector transport to use.

Internal temporary fields

• p_tmp a temporary storage the current point p.
• grad_tmp a temporary storage for the gradient at the current p.
• matrix a temporary storage for the matrix representation of the approximating operator.
• basis a temporary storage for an orthonormal basis at the current p.

Constructor

ApproxHessianSymmetricRankOne(M, p, gradF; kwargs...)

Keyword arguments

• initial_operator (Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))) the matrix representation of the initial approximating operator.
• basis (DefaultOrthonormalBasis()) an orthonormal basis in the tangent space of the initial iterate p.
• nu (-1)
• evaluation (AllocatingEvaluation) whether the gradient is given as an allocation function or an in-place (InplaceEvaluation).
• vector_transport_method (ParallelTransport()) vector transport $\mathcal T_{\cdot\gets\cdot}$ to use.
source
Manopt.ApproxHessianBFGSType
ApproxHessianBFGS{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian

A functor to approximate the Hessian by the BFGS update.

Fields

• gradient!! the gradient function (either allocating or mutating, see evaluation parameter).
• scale
• vector_transport_method a vector transport to use.

Internal temporary fields

• p_tmp a temporary storage the current point p.
• grad_tmp a temporary storage for the gradient at the current p.
• matrix a temporary storage for the matrix representation of the approximating operator.
• basis a temporary storage for an orthonormal basis at the current p.

Constructor

ApproxHessianBFGS(M, p, gradF; kwargs...)

Keyword arguments

• initial_operator (Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))) the matrix representation of the initial approximating operator.
• basis (DefaultOrthonormalBasis()) an orthonormal basis in the tangent space of the initial iterate p.
• nu (-1)
• evaluation (AllocatingEvaluation) whether the gradient is given as an allocation function or an in-place (InplaceEvaluation).
• vector_transport_method (ParallelTransport()) vector transport $\mathcal T_{\cdot\gets\cdot}$ to use.
source

as well as their (non-exported) common supertype

## Technical details

The trust_regions` solver requires the following functions of a manifold to be available

## Literature

[ABG06]
P.-A. Absil, C. Baker and K. Gallivan. Trust-Region Methods on Riemannian Manifolds. Foundations of Computational Mathematics 7, 303–330 (2006).
[CGT00]
A. R. Conn, N. I. Gould and P. L. Toint. Trust Region Methods (Society for Industrial and Applied Mathematics, 2000).