Manopt.adaptive_regularization_with_cubicsFunction
adaptive_regularization_with_cubics(M, f, grad_f, Hess_f, p=rand(M); kwargs...)
adaptive_regularization_with_cubics(M, mho, p=rand(M); kwargs...)

Solve an optimization problem on the manifold M by iteratively minimizing

$$m_k(X) = f(p_k) + ⟨X, \operatorname{grad} f(p_k)⟩ + \frac{1}{2}⟨X, \operatorname{Hess} f(p_k)[X]⟩ + \frac{σ_k}{3}\lVert X \rVert^3$$

on the tangent space at the current iterate $p_k$, where $X ∈ T_{p_k}\mathcal M$ and $σ_k > 0$ is a regularization parameter.

Let $X_k$ denote the minimizer of the model $m_k$ and use the model improvement

$$ρ_k = \frac{f(p_k) - f(\operatorname{retr}_{p_k}(X_k))}{m_k(0) - m_k(X_k) + \frac{σ_k}{3}\lVert X_k\rVert^3}.$$

With two thresholds $η_2 ≥ η_1 > 0$ set $p_{k+1} = \operatorname{retr}_{p_k}(X_k)$ if $ρ_k ≥ η_1$ and reject the candidate otherwise, that is, set $p_{k+1} = p_k$.

Further update the regularization parameter using factors $0 < γ_1 < 1 < γ_2$

$$σ_{k+1} = \begin{cases} \max\{σ_{\min}, γ_1σ_k\} & \text{ if } ρ_k \geq η_2 &\text{ (the model was very successful)},\\ σ_k & \text{ if } ρ_k ∈ [η_1, η_2)&\text{ (the model was successful)},\\ γ_2σ_k & \text{ if } ρ_k < η_1&\text{ (the model was unsuccessful)}. \end{cases}$$

For more details see [ABBC20].
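Leaving the manifold-specific parts aside, the acceptance test and regularization update above can be sketched in plain Julia; the function names here are illustrative, not part of the Manopt.jl API:

```julia
# Sketch of the step acceptance and σ update; defaults match the solver's
# keyword defaults listed below (illustrative helper, not the Manopt.jl API).
function update_sigma(σ, ρ; η1=0.1, η2=0.9, γ1=0.1, γ2=2.0, σmin=1e-10)
    ρ ≥ η2 && return max(σmin, γ1 * σ)  # very successful: shrink σ
    ρ ≥ η1 && return σ                  # successful: keep σ
    return γ2 * σ                       # unsuccessful: increase σ
end

accept_step(ρ, η1=0.1) = ρ ≥ η1  # accept the candidate iterate?
```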

Input

• M: a manifold $\mathcal M$
• f: a cost function $f: \mathcal M → ℝ$ to minimize
• grad_f: the gradient $\operatorname{grad} f: \mathcal M → T\mathcal M$ of $f$
• Hess_f: (optional) the Hessian $\operatorname{Hess} f(p)[X]$ of $f$
• p: an initial value $p ∈ \mathcal M$

If no Hessian is provided, it is approximated by finite differences, see ApproxHessianFiniteDifference.

The cost f and its gradient and Hessian might also be provided as a ManifoldHessianObjective.
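As a hedged usage sketch: the Rayleigh-quotient cost, gradient, and Hessian below are our own example (not part of the solver), chosen so that a minimizer is a dominant eigenvector of a symmetric matrix:

```julia
using Manopt, Manifolds, LinearAlgebra

n = 8
A = Symmetric(randn(n, n))
M = Sphere(n - 1)

# minimize f(p) = -pᵀAp over the unit sphere
f(M, p) = -p' * A * p
# Riemannian gradient: projection of the Euclidean gradient onto T_p M
grad_f(M, p) = project(M, p, -2 .* (A * p))
# Riemannian Hessian on the sphere (example-specific derivation)
Hess_f(M, p, X) = project(M, p, -2 .* (A * X)) .+ 2 * (p' * A * p) .* X

q = adaptive_regularization_with_cubics(M, f, grad_f, Hess_f, rand(M))
```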

Keyword arguments

the default values are given in brackets

• σ: (100.0 / sqrt(manifold_dimension(M))) initial regularization parameter
• σmin: (1e-10) minimal regularization value $σ_{\min}$
• η1: (0.1) lower model success threshold
• η2: (0.9) upper model success threshold
• γ1: (0.1) regularization reduction factor (applied in the very successful case)
• γ2: (2.0) regularization increment factor (applied in the unsuccessful case)
• evaluation: (AllocatingEvaluation) specify whether the gradient works by allocation (default), that is of the form grad_f(M, p), or as an InplaceEvaluation, that is of the form grad_f!(M, X, p), and analogously for the Hessian.
• retraction_method: (default_retraction_method(M, typeof(p))) a retraction to use
• initial_tangent_vector: (zero_vector(M, p)) initialize any tangent vector data,
• maxIterLanczos: (200) a shortcut to set the stopping criterion in the sub solver,
• ρ_regularization: (1e3) a regularization to avoid dividing by zero for small values of cost and model
• stopping_criterion: (StopAfterIteration(40) | StopWhenGradientNormLess(1e-9) | StopWhenAllLanczosVectorsUsed(maxIterLanczos))
• sub_state: LanczosState(M, copy(M, p); maxIterLanczos=maxIterLanczos, σ=σ) a state for the subproblem or an [AbstractEvaluationType](@ref) if the problem is a function.
• sub_objective: a shortcut to modify the objective of the subproblem used within the sub_problem
• sub_problem: DefaultManoptProblem(M, sub_objective) the problem (or a function) for the sub problem

All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively. If you provide the ManifoldGradientObjective directly, these decorations can still be specified.

By default the debug= keyword is set to DebugIfEntry(:ρ_denominator, >(0); message="Denominator nonpositive", type=:error) to raise an error when, due to rounding errors, the denominator in the computation of ρ becomes nonpositive.

source
Manopt.adaptive_regularization_with_cubics!Function
adaptive_regularization_with_cubics!(M, f, grad_f, Hess_f, p; kwargs...)
adaptive_regularization_with_cubics!(M, mho, p; kwargs...)

Evaluate the Riemannian adaptive regularization with cubics solver in place of p.

Input

• M: a manifold $\mathcal M$
• f: a cost function $f: \mathcal M → ℝ$ to minimize
• grad_f: the gradient $\operatorname{grad} f: \mathcal M → T\mathcal M$ of $f$
• Hess_f: (optional) the Hessian $\operatorname{Hess} f(p)[X]$ of $f$
• p: an initial value $p ∈ \mathcal M$

If no Hessian is provided, it is approximated by finite differences, see ApproxHessianFiniteDifference.

The cost f and its gradient and Hessian might also be provided as a ManifoldHessianObjective.

For more details and all options, see adaptive_regularization_with_cubics.
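A hedged sketch of the in-place call, reusing the same kind of Rayleigh-quotient example (our own choice of cost, gradient, and Hessian, not part of the solver):

```julia
using Manopt, Manifolds, LinearAlgebra

A = Diagonal([4.0, 3.0, 2.0, 1.0])
M = Sphere(3)
f(M, p) = -p' * A * p
grad_f(M, p) = project(M, p, -2 .* (A * p))
Hess_f(M, p, X) = project(M, p, -2 .* (A * X)) .+ 2 * (p' * A * p) .* X

p = rand(M)
# the `!` variant overwrites p with the resulting point
adaptive_regularization_with_cubics!(M, f, grad_f, Hess_f, p)
```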

source

## State

Manopt.AdaptiveRegularizationStateType
AdaptiveRegularizationState{P,T} <: AbstractHessianSolverState

A state for the adaptive_regularization_with_cubics solver.

Fields

A default value is given in brackets if a parameter can be left out in initialization.

• η1, η2: (0.1, 0.9) bounds for evaluating the regularization parameter
• γ1, γ2: (0.1, 2.0) shrinking and expansion factors for regularization parameter σ
• p: (rand(M)) the current iterate
• X: (zero_vector(M,p)) the current gradient $\operatorname{grad}f(p)$
• s: (zero_vector(M,p)) the tangent vector step resulting from minimizing the model problem in the tangent space $\mathcal T_{p} \mathcal M$
• σ: the current cubic regularization parameter
• σmin: (1e-7) lower bound for the cubic regularization parameter
• ρ_regularization: (1e3) regularization parameter for computing ρ.

When approaching convergence ρ may be difficult to compute with numerator and denominator approaching zero. Regularizing the ratio lets ρ go to 1 near convergence.

Furthermore, the following internal fields are defined

• q: (copy(M,p)) a point for the candidates to evaluate model and ρ
• H: (copy(M, p, X)) the current Hessian, $\operatorname{Hess} f(p)[⋅]$
• S: (copy(M, p, X)) the current solution from the subsolver
• ρ: the current regularized ratio of actual improvement and model improvement.
• ρ_denominator: (one(ρ)) a value to store the denominator from the computation of ρ to allow for a warning or error when this value is non-positive.
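One way such a regularization of ρ can work (an illustrative sketch only, not the exact Manopt.jl formula) is to shift numerator and denominator by the same small offset:

```julia
# illustrative only: shift both parts of the ratio by a small
# machine-precision offset scaled by ρ_regularization, so that
# ρ → 1 when numerator and denominator both approach zero
function regularized_rho(actual_decrease, model_decrease; ρ_regularization=1e3)
    ε = ρ_regularization * eps(Float64)
    return (actual_decrease + ε) / (model_decrease + ε)
end
```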

Constructor

AdaptiveRegularizationState(M, p=rand(M); X=zero_vector(M, p), kwargs...)

Construct the solver state with all fields stated as keyword arguments.

source

## Sub solvers

There are several ways to approach the subsolver. The default is the first one.

## Lanczos iteration

Manopt.LanczosStateType
LanczosState{P,T,SC,B,I,R,TM,V,Y} <: AbstractManoptSolverState

Solve the adaptive regularized subproblem with a Lanczos iteration

Fields

• stop: the stopping criterion
• σ: the current regularization parameter
• X: the iterate
• Lanczos_vectors: the obtained Lanczos vectors
• tridig_matrix: the tridiagonal coefficient matrix T
• coefficients: the coefficients $y_1,…,y_k$ that determine the solution
• Hp: a temporary vector containing the evaluation of the Hessian
• Hp_residual: a temporary vector containing the residual to the Hessian
• S: the current obtained / approximated solution
source
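In Euclidean terms, the Lanczos_vectors and tridig_matrix fields correspond to the classical Lanczos tridiagonalization, sketched here in plain Julia (our own helper, not the Manopt.jl implementation):

```julia
using LinearAlgebra

# Build k orthonormal Lanczos vectors Q and a tridiagonal T with Qᵀ H Q ≈ T;
# the cubic model can then be minimized over the small Krylov subspace.
function lanczos(H, b, k)
    n = length(b)
    Q = zeros(n, k)
    α = zeros(k)
    β = zeros(k - 1)
    Q[:, 1] = b / norm(b)
    for j in 1:k
        w = H * Q[:, j]                      # Hessian(-vector) evaluation, cf. Hp
        α[j] = dot(Q[:, j], w)
        w -= α[j] * Q[:, j]                  # orthogonalize against current vector
        j > 1 && (w -= β[j - 1] * Q[:, j - 1])  # ... and the previous one
        if j < k
            β[j] = norm(w)
            Q[:, j + 1] = w / β[j]
        end
    end
    return Q, SymTridiagonal(α, β)
end
```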

There is a generic objective, that implements the sub problem

Manopt.AdaptiveRagularizationWithCubicsModelObjectiveType
AdaptiveRagularizationWithCubicsModelObjective

A model for the adaptive regularization with Cubics

$$m(X) = f(p) + ⟨\operatorname{grad} f(p), X ⟩_p + \frac{1}{2} ⟨\operatorname{Hess} f(p)[X], X⟩_p + \frac{σ}{3} \lVert X \rVert^3,$$

cf. Eq. (33) in [ABBC20]

Fields

Constructors

AdaptiveRagularizationWithCubicsModelObjective(mho, σ=1.0)

with either an AbstractManifoldHessianObjective objective or a decorator containing such an objective.

source

Since the sub problem is given on the tangent space, you have to provide

arc_obj = AdaptiveRagularizationWithCubicsModelObjective(mho, σ)
sub_problem = DefaultManoptProblem(TangentSpaceAt(M,p), arc_obj)

where mho is the Hessian objective of f to solve. Then use this for the sub_problem keyword and use your favourite gradient based solver for the sub_state keyword, for example a ConjugateGradientDescentState

Manopt.StopWhenAllLanczosVectorsUsedType
StopWhenAllLanczosVectorsUsed <: StoppingCriterion

When an inner iteration has used up all Lanczos vectors, this stopping criterion serves as a fallback / security stopping criterion to avoid accessing a non-existing field in the array allocated for the vectors.

Note that this stopping criterion is (for now) only implemented for the case that an AdaptiveRegularizationState uses a LanczosState subsolver.

Fields

• maxLanczosVectors: maximal number of Lanczos vectors
• at_iteration indicates at which iteration (including i=0) the stopping criterion was fulfilled and is -1 while it is not fulfilled.

Constructor

StopWhenAllLanczosVectorsUsed(maxLanczosVectors::Int)
source
Manopt.StopWhenFirstOrderProgressType
StopWhenFirstOrderProgress <: StoppingCriterion

A stopping criterion related to the Riemannian adaptive regularization with cubics (ARC) solver indicating that the model function at the current (outer) iterate,

$$m(X) = f(p) + ⟨\operatorname{grad} f(p), X ⟩_p + \frac{1}{2} ⟨\operatorname{Hess} f(p)[X], X⟩_p + \frac{σ}{3} \lVert X \rVert^3,$$

defined on the tangent space $T_{p}\mathcal M$ fulfills at the current iterate $X_k$ that

$$m(X_k) \leq m(0) \quad\text{ and }\quad \lVert \operatorname{grad} m(X_k) \rVert ≤ θ \lVert X_k \rVert^2$$

Fields

• θ: the factor $θ$ in the second condition
• at_iteration: indicates at which iteration (including i=0) the stopping criterion was fulfilled and is -1 while it is not fulfilled.
source

## Technical details

The adaptive_regularization_with_cubics solver requires the following functions of a manifold to be available

Furthermore, within the Lanczos subsolver, generating a random tangent vector (at p) using rand!(M, X; vector_at=p) in place of X is required.

## Literature

[ABBC20]
N. Agarwal, N. Boumal, B. Bullins and C. Cartis. Adaptive regularization with cubics on manifolds. Mathematical Programming (2020).