# The Riemannian trust regions solver

Minimize a function

\[\operatorname*{\arg\,min}_{p ∈ \mathcal{M}}\ f(p)\]

by using the Riemannian trust-regions solver following [ABG06] a model is build by lifting the objective at the $k$th iterate $p_k$ by locally mapping the cost function $f$ to the tangent space as $f_k: T_{p_k}\mathcal M → ℝ$ as $f_k(X) = f(\operatorname{retr}_{p_k}(X))$. The trust region subproblem is then defined as

\[\operatorname*{arg\,min}_{X ∈ T_{p_k}\mathcal M}\ m_k(X),\]

where

\[\begin{align*} m_k&: T_{p_K}\mathcal M → ℝ,\\ m_k(X) &= f(p_k) + ⟨\operatorname{grad} f(p_k), X⟩_{p_k} + \frac{1}{2}\langle \mathcal H_k(X),X⟩_{p_k}\\ \text{such that}&\ \lVert X \rVert_{p_k} ≤ Δ_k. \end{align*}\]

Here $Δ_k$ is a trust region radius, that is adapted every iteration, and $\mathcal H_k$ is some symmetric linear operator that approximates the Hessian $\operatorname{Hess} f$ of $f$.

## Interface

`Manopt.trust_regions`

— Function```
trust_regions(M, f, grad_f, Hess_f, p=rand(M); kwargs...)
trust_regions(M, f, grad_f, p=rand(M); kwargs...)
trust_regions!(M, f, grad_f, Hess_f, p; kwargs...)
trust_regions!(M, f, grad_f, p; kwargs...)
```

run the Riemannian trust-regions solver for optimization on manifolds to minimize `f`

, see on [ABG06, CGT00].

For the case that no Hessian is provided, the Hessian is computed using finite differences, see `ApproxHessianFiniteDifference`

. For solving the inner trust-region subproblem of finding an update-vector, by default the `truncated_conjugate_gradient_descent`

is used.

**Input**

`M::`

`AbstractManifold`

`: a Riemannian manifold $\mathcal M$`

`f`

: a cost function $f: \mathcal M→ ℝ$ implemented as`(M, p) -> v`

`grad_f`

: the (Riemannian) gradient $\operatorname{grad}f$: \mathcal M → T_{p}\mathcal M of f as a function`(M, p) -> X`

or a function`(M, X, p) -> X`

computing`X`

in-place`Hess_f`

: the (Riemannian) Hessian $\operatorname{Hess}f$: T*{p}\mathcal M → T*{p}\mathcal M of f as a function`(M, p, X) -> Y`

or a function`(M, Y, p, X) -> Y`

computing`Y`

in-place`p`

: a point on the manifold $\mathcal M$

**Keyword arguments**

`acceptance_rate`

: accept/reject threshold: if ρ (the performance ratio for the iterate) is at least the acceptance rate ρ', the candidate is accepted. This value should be between $0$ and $rac{1}{4}$`augmentation_threshold=0.75`

: trust-region augmentation threshold: if ρ is larger than this threshold, a solution is on the trust region boundary and negative curvature, and the radius is extended (augmented)`augmentation_factor=2.0`

: trust-region augmentation factor`evaluation=`

`AllocatingEvaluation`

`()`

: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (`AllocatingEvaluation`

) or whether they modify their input argument to return the result therein (`InplaceEvaluation`

). Since usually the first argument is the manifold, the modified argument is the second.`κ=0.1`

: the linear convergence target rate of the tCG method`truncated_conjugate_gradient_descent`

, and is used in a stopping criterion therein`max_trust_region_radius`

: the maximum trust-region radius`preconditioner`

: a preconditioner for the Hessian H. This is either an allocating function`(M, p, X) -> Y`

or an in-place function`(M, Y, p, X) -> Y`

, see`evaluation`

, and by default set to the identity.`project!=copyto!`

: for numerical stability it is possible to project onto the tangent space after every iteration. the function has to work inplace of`Y`

, that is`(M, Y, p, X) -> Y`

, where`X`

and`Y`

can be the same memory.`randomize=false`

: indicate whether`X`

is initialised to a random vector or not. This disables preconditioning.`ρ_regularization=1e3`

: regularize the performance evaluation $ρ$ to avoid numerical inaccuracies.`reduction_factor=0.25`

: trust-region reduction factor`reduction_threshold=0.1`

: trust-region reduction threshold: if ρ is below this threshold, the trust region radius is reduced by`reduction_factor`

.`retraction_method=`

`default_retraction_method`

`(M, typeof(p))`

: a retraction $\operatorname{retr}$ to use, see the section on retractions`stopping_criterion=`

`StopAfterIteration`

`(1000)`

`|`

`StopWhenGradientNormLess`

`(1e-6)`

: a functor indicating that the stopping criterion is fulfilled`sub_kwargs=`

`(;)`

: a named tuple of keyword arguments that are passed to`decorate_objective!`

of the sub solvers objective, the`decorate_state!`

of the subsovlers state, and the sub state constructor itself.`sub_stopping_criterion=`

( see`truncated_conjugate_gradient_descent`

): a functor indicating that the stopping criterion is fulfilled`sub_problem=`

`DefaultManoptProblem`

`(M,`

`ConstrainedManifoldObjective`

`(subcost, subgrad; evaluation=evaluation))`

: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.`sub_state=`

`QuasiNewtonState`

: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. where`QuasiNewtonLimitedMemoryDirectionUpdate`

with`InverseBFGS`

is used`θ=1.0`

: the superlinear convergence target rate of $1+θ$ of the tCG-method`truncated_conjugate_gradient_descent`

, and is used in a stopping criterion therein`trust_region_radius=`

`injectivity_radius`

`(M) / 4`

: the initial trust-region radius

For the case that no Hessian is provided, the Hessian is computed using finite difference, see `ApproxHessianFiniteDifference`

.

All other keyword arguments are passed to `decorate_state!`

for state decorators or `decorate_objective!`

for objective decorators, respectively.

**Output**

The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see `get_solver_return`

for details, especially the `return_state=`

keyword.

**See also**

`Manopt.trust_regions!`

— Function```
trust_regions(M, f, grad_f, Hess_f, p=rand(M); kwargs...)
trust_regions(M, f, grad_f, p=rand(M); kwargs...)
trust_regions!(M, f, grad_f, Hess_f, p; kwargs...)
trust_regions!(M, f, grad_f, p; kwargs...)
```

run the Riemannian trust-regions solver for optimization on manifolds to minimize `f`

, see on [ABG06, CGT00].

For the case that no Hessian is provided, the Hessian is computed using finite differences, see `ApproxHessianFiniteDifference`

. For solving the inner trust-region subproblem of finding an update-vector, by default the `truncated_conjugate_gradient_descent`

is used.

**Input**

`M::`

`AbstractManifold`

`: a Riemannian manifold $\mathcal M$`

`f`

: a cost function $f: \mathcal M→ ℝ$ implemented as`(M, p) -> v`

`grad_f`

: the (Riemannian) gradient $\operatorname{grad}f$: \mathcal M → T_{p}\mathcal M of f as a function`(M, p) -> X`

or a function`(M, X, p) -> X`

computing`X`

in-place`Hess_f`

: the (Riemannian) Hessian $\operatorname{Hess}f$: T*{p}\mathcal M → T*{p}\mathcal M of f as a function`(M, p, X) -> Y`

or a function`(M, Y, p, X) -> Y`

computing`Y`

in-place`p`

: a point on the manifold $\mathcal M$

**Keyword arguments**

`acceptance_rate`

: accept/reject threshold: if ρ (the performance ratio for the iterate) is at least the acceptance rate ρ', the candidate is accepted. This value should be between $0$ and $rac{1}{4}$`augmentation_threshold=0.75`

: trust-region augmentation threshold: if ρ is larger than this threshold, a solution is on the trust region boundary and negative curvature, and the radius is extended (augmented)`augmentation_factor=2.0`

: trust-region augmentation factor`evaluation=`

`AllocatingEvaluation`

`()`

: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (`AllocatingEvaluation`

) or whether they modify their input argument to return the result therein (`InplaceEvaluation`

). Since usually the first argument is the manifold, the modified argument is the second.`κ=0.1`

: the linear convergence target rate of the tCG method`truncated_conjugate_gradient_descent`

, and is used in a stopping criterion therein`max_trust_region_radius`

: the maximum trust-region radius`preconditioner`

: a preconditioner for the Hessian H. This is either an allocating function`(M, p, X) -> Y`

or an in-place function`(M, Y, p, X) -> Y`

, see`evaluation`

, and by default set to the identity.`project!=copyto!`

: for numerical stability it is possible to project onto the tangent space after every iteration. the function has to work inplace of`Y`

, that is`(M, Y, p, X) -> Y`

, where`X`

and`Y`

can be the same memory.`randomize=false`

: indicate whether`X`

is initialised to a random vector or not. This disables preconditioning.`ρ_regularization=1e3`

: regularize the performance evaluation $ρ$ to avoid numerical inaccuracies.`reduction_factor=0.25`

: trust-region reduction factor`reduction_threshold=0.1`

: trust-region reduction threshold: if ρ is below this threshold, the trust region radius is reduced by`reduction_factor`

.`retraction_method=`

`default_retraction_method`

`(M, typeof(p))`

: a retraction $\operatorname{retr}$ to use, see the section on retractions`stopping_criterion=`

`StopAfterIteration`

`(1000)`

`|`

`StopWhenGradientNormLess`

`(1e-6)`

: a functor indicating that the stopping criterion is fulfilled`sub_kwargs=`

`(;)`

: a named tuple of keyword arguments that are passed to`decorate_objective!`

of the sub solvers objective, the`decorate_state!`

of the subsovlers state, and the sub state constructor itself.`sub_stopping_criterion=`

( see`truncated_conjugate_gradient_descent`

): a functor indicating that the stopping criterion is fulfilled`sub_problem=`

`DefaultManoptProblem`

`(M,`

`ConstrainedManifoldObjective`

`(subcost, subgrad; evaluation=evaluation))`

: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.`sub_state=`

`QuasiNewtonState`

: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. where`QuasiNewtonLimitedMemoryDirectionUpdate`

with`InverseBFGS`

is used`θ=1.0`

: the superlinear convergence target rate of $1+θ$ of the tCG-method`truncated_conjugate_gradient_descent`

, and is used in a stopping criterion therein`trust_region_radius=`

`injectivity_radius`

`(M) / 4`

: the initial trust-region radius

For the case that no Hessian is provided, the Hessian is computed using finite difference, see `ApproxHessianFiniteDifference`

.

All other keyword arguments are passed to `decorate_state!`

for state decorators or `decorate_objective!`

for objective decorators, respectively.

**Output**

The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see `get_solver_return`

for details, especially the `return_state=`

keyword.

**See also**

## State

`Manopt.TrustRegionsState`

— Type`TrustRegionsState <: AbstractHessianSolverState`

Store the state of the trust-regions solver.

**Fields**

`acceptance_rate`

: a lower bound of the performance ratio for the iterate that decides if the iteration is accepted or not.`HX`

,`HY`

,`HZ`

: interim storage (to avoid allocation) of ``\operatorname{Hess} f(p)[⋅]`

of`X`

,`Y`

,`Z`

`max_trust_region_radius`

: the maximum trust-region radius`p::P`

: a point on the manifold $\mathcal M$storing the current iterate`project!`

: for numerical stability it is possible to project onto the tangent space after every iteration. the function has to work inplace of`Y`

, that is`(M, Y, p, X) -> Y`

, where`X`

and`Y`

can be the same memory.`stop::StoppingCriterion`

: a functor indicating that the stopping criterion is fulfilled`randomize`

: indicate whether`X`

is initialised to a random vector or not`ρ_regularization`

: regularize the model fitness $ρ$ to avoid division by zero`sub_problem::Union{AbstractManoptProblem, F}`

: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.`sub_state::Union{AbstractManoptProblem, F}`

: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.`σ`

: Gaussian standard deviation when creating the random initial tangent vector This field has no effect, when`randomize`

is false.`trust_region_radius`

: the trust-region radius`X::T`

: a tangent vector at the point $p$ on the manifold $\mathcal M$`Y`

: the solution (tangent vector) of the subsolver`Z`

: the Cauchy point (only used if random is activated)

**Constructors**

```
TrustRegionsState(M, mho::AbstractManifoldHessianObjective; kwargs...)
TrustRegionsState(M, sub_problem, sub_state; kwargs...)
TrustRegionsState(M, sub_problem; evaluation=AllocatingEvaluation(), kwargs...)
```

create a trust region state.

- given a
`AbstractManifoldHessianObjective`

`mho`

, the default sub solver, a`TruncatedConjugateGradientState`

with`mho`

used to define the problem on a tangent space is created - given a
`sub_problem`

and an`evaluation=`

keyword, the sub problem solver is assumed to be the closed form solution, where`evaluation`

determines how to call the sub function.

**Input**

`M::`

`AbstractManifold`

`: a Riemannian manifold $\mathcal M$`

`sub_problem`

: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.`sub_state`

: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.

**Keyword arguments**

`acceptance_rate=0.1`

`max_trust_region_radius=sqrt(manifold_dimension(M))`

`p=`

`rand`

`(M)`

: a point on the manifold $\mathcal M$to specify the initial value`project!=copyto!`

`stopping_criterion=`

`StopAfterIteration`

`(1000)`

`|`

`StopWhenGradientNormLess`

`(1e-6)`

: a functor indicating that the stopping criterion is fulfilled`randomize=false`

`ρ_regularization=10000.0`

`θ=1.0`

`trust_region_radius=max_trust_region_radius / 8`

`X=`

`zero_vector`

`(M, p)`

: a tangent vector at the point $p$ on the manifold $\mathcal M$to specify the representation of a tangent vector

**See also**

## Approximation of the Hessian

Several different methods to approximate the Hessian are available.

`Manopt.ApproxHessianFiniteDifference`

— Type`ApproxHessianFiniteDifference{E, P, T, G, RTR, VTR, R <: Real} <: AbstractApproxHessian`

A functor to approximate the Hessian by a finite difference of gradient evaluation.

Given a point `p`

and a direction `X`

and the gradient $\operatorname{grad} f(p)$ of a function $f$ the Hessian is approximated as follows: let $c$ be a stepsize, $X ∈ T_{p}\mathcal M$ a tangent vector and $q = \operatorname{retr}_p(\frac{c}{\lVert X \rVert_p}X)$ be a step in direction $X$ of length $c$ following a retraction Then the Hessian is approximated by the finite difference of the gradients, where $\mathcal T_{⋅←⋅}$ is a vector transport.

\[\operatorname{Hess}f(p)[X] ≈ \frac{\lVert X \rVert_p}{c}\Bigl( \mathcal T_{p\gets q}\bigr(\operatorname{grad}f(q)\bigl) - \operatorname{grad}f(p) \Bigl)\]

**Fields**

`gradient!!`

: the gradient function (either allocating or mutating, see`evaluation`

parameter)`step_length`

: a step length for the finite difference`retraction_method=`

`default_retraction_method`

`(M, typeof(p))`

: a retraction $\operatorname{retr}$ to use, see the section on retractions`vector_transport_method=`

`default_vector_transport_method`

`(M, typeof(p))`

: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

**Internal temporary fields**

`grad_tmp`

: a temporary storage for the gradient at the current`p`

`grad_dir_tmp`

: a temporary storage for the gradient at the current`p_dir`

`p_dir::P`

: a temporary storage to the forward direction (or the $q$ in the formula)

**Constructor**

`ApproximateFiniteDifference(M, p, grad_f; kwargs...)`

**Keyword arguments**

`evaluation=`

`AllocatingEvaluation`

`()`

: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (`AllocatingEvaluation`

) or whether they modify their input argument to return the result therein (`InplaceEvaluation`

). Since usually the first argument is the manifold, the modified argument is the second.`steplength=`

2^{-14}$: step length$c`` to approximate the gradient evaluations`retraction_method=`

`default_retraction_method`

`(M, typeof(p))`

: a retraction $\operatorname{retr}$ to use, see the section on retractions`vector_transport_method=`

`default_vector_transport_method`

`(M, typeof(p))`

: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

`Manopt.ApproxHessianSymmetricRankOne`

— Type`ApproxHessianSymmetricRankOne{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian`

A functor to approximate the Hessian by the symmetric rank one update.

**Fields**

`gradient!!`

: the gradient function (either allocating or mutating, see`evaluation`

parameter).`ν`

: a small real number to ensure that the denominator in the update does not become too small and thus the method does not break down.`vector_transport_method=`

`default_vector_transport_method`

`(M, typeof(p))`

: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports.

**Internal temporary fields**

`p_tmp`

: a temporary storage the current point`p`

.`grad_tmp`

: a temporary storage for the gradient at the current`p`

.`matrix`

: a temporary storage for the matrix representation of the approximating operator.`basis`

: a temporary storage for an orthonormal basis at the current`p`

.

**Constructor**

`ApproxHessianSymmetricRankOne(M, p, gradF; kwargs...)`

**Keyword arguments**

`initial_operator`

(`Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))`

) the matrix representation of the initial approximating operator.`basis`

(`DefaultOrthonormalBasis()`

) an orthonormal basis in the tangent space of the initial iterate p.`nu`

(`-1`

)`evaluation=`

`AllocatingEvaluation`

`()`

: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (`AllocatingEvaluation`

) or whether they modify their input argument to return the result therein (`InplaceEvaluation`

). Since usually the first argument is the manifold, the modified argument is the second.`vector_transport_method=`

`default_vector_transport_method`

`(M, typeof(p))`

: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

`Manopt.ApproxHessianBFGS`

— Type`ApproxHessianBFGS{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian`

A functor to approximate the Hessian by the BFGS update.

**Fields**

`gradient!!`

the gradient function (either allocating or mutating, see`evaluation`

parameter).`scale`

`vector_transport_method::AbstractVectorTransportMethodP`

: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

**Internal temporary fields**

`p_tmp`

a temporary storage the current point`p`

.`grad_tmp`

a temporary storage for the gradient at the current`p`

.`matrix`

a temporary storage for the matrix representation of the approximating operator.`basis`

a temporary storage for an orthonormal basis at the current`p`

.

**Constructor**

`ApproxHessianBFGS(M, p, gradF; kwargs...)`

**Keyword arguments**

`initial_operator`

(`Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))`

) the matrix representation of the initial approximating operator.`basis`

(`DefaultOrthonormalBasis()`

) an orthonormal basis in the tangent space of the initial iterate p.`nu`

(`-1`

)`evaluation=`

`AllocatingEvaluation`

`()`

: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (`AllocatingEvaluation`

) or whether they modify their input argument to return the result therein (`InplaceEvaluation`

). Since usually the first argument is the manifold, the modified argument is the second.`vector_transport_method=`

`default_vector_transport_method`

`(M, typeof(p))`

: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

as well as their (non-exported) common supertype

`Manopt.AbstractApproxHessian`

— Type`AbstractApproxHessian <: Function`

An abstract supertype for approximate Hessian functions, declares them also to be functions.

## Technical details

The `trust_regions`

solver requires the following functions of a manifold to be available

- A
`retract!`

`(M, q, p, X)`

; it is recommended to set the`default_retraction_method`

to a favourite retraction. If this default is set, a`retraction_method=`

does not have to be specified. - By default the stopping criterion uses the
`norm`

as well, to stop when the norm of the gradient is small, but if you implemented`inner`

, the norm is provided already. - if you do not provide an initial
`max_trust_region_radius`

, a`manifold_dimension`

is required. - A `copyto!
`(M, q, p)`

and`copy`

`(M,p)`

for points. - By default the tangent vectors are initialized calling
`zero_vector`

`(M,p)`

.

## Literature

- [ABG06]
- P.-A. Absil, C. Baker and K. Gallivan.
*Trust-Region Methods on Riemannian Manifolds*. Foundations of Computational Mathematics**7**, 303–330 (2006). - [CGT00]
- A. R. Conn, N. I. Gould and P. L. Toint.
*Trust Region Methods*(Society for Industrial and Applied Mathematics, 2000).