Manopt.gradient_descent – Function
gradient_descent(M, F, gradF, x)

Perform a gradient descent

$$x_{k+1} = \operatorname{retr}_{x_k}\bigl( -s_k\operatorname{grad}f(x_k) \bigr)$$

with different choices of $s_k$ available (see stepsize option below).

Input

• M – a manifold $\mathcal M$
• F – a cost function $F: \mathcal M→ℝ$ to minimize
• gradF – the gradient $\operatorname{grad}F: \mathcal M → T\mathcal M$ of F
• x – an initial value $x ∈ \mathcal M$

Optional

... and the ones that are passed to decorate_options for decorators.

Output

• x_opt – the resulting (approximately critical) point of gradient_descent

OR

• options – the options returned by the solver (see return_options)
Manopt.gradient_descent! – Function
gradient_descent!(M, F, gradF, x)

Perform a gradient descent

$$x_{k+1} = \operatorname{retr}_{x_k}\bigl( -s_k\operatorname{grad}f(x_k) \bigr)$$

in place of x, with different choices of the step size $s_k$ available.

Input

• M – a manifold $\mathcal M$
• F – a cost function $F:\mathcal M→ℝ$ to minimize
• gradF – the gradient $\operatorname{grad}F:\mathcal M→ T\mathcal M$ of F
• x – an initial value $x ∈ \mathcal M$

For more options, especially the Stepsizes available for $s_k$, see gradient_descent.


## Options

Manopt.GradientDescentOptions – Type
GradientDescentOptions{P,T} <: AbstractGradientOptions

Describes a gradient-based descent algorithm.

Fields

A default value is given in brackets if a parameter can be left out in initialization.

Constructor

GradientDescentOptions(M, x; initial_vector=zero_vector(M, x), kwargs...)

Generate gradient descent options, where initial_vector can be used to set the type of the tangent vector that stores the gradient. All following fields are keyword arguments.


## Direction Update Rules

One field of the options is the direction, a DirectionUpdateRule. The default, IdentityUpdateRule, just evaluates the gradient; it can be enhanced, for example, by the following rules.

Manopt.MomentumGradient – Type
MomentumGradient <: DirectionUpdateRule

Append a momentum to a gradient processor, where the last direction and last iterate are stored. The new direction is composed as $η_i = m η_{i-1}' - s d_i$, where $s d_i$ is the current (inner) direction scaled by the step size $s$, and $η_{i-1}'$ is the last direction, vector transported to the current tangent space and multiplied by the momentum $m$.

Fields

• gradient – (zero_vector(M,x0)) the last gradient/direction update added as momentum
• last_iterate – remember the last iterate for parallel transporting the last direction
• momentum – (0.2) factor for momentum
• direction – internal DirectionUpdateRule to determine the directions to add the momentum to
• vector_transport_method – vector transport method to use

Constructors

MomentumGradient(
    x0;
    vector_transport_method=ParallelTransport(),
)

Add momentum to a gradient problem, where by default just a gradient evaluation is used. Equivalently, you can also use a Manifold M instead of the GradientProblem p.

MomentumGradient(
    x0,
    s::DirectionUpdateRule=IdentityUpdateRule();
    vector_transport_method=ParallelTransport(),
)

Add momentum to a stochastic gradient problem, where by default just a stochastic gradient evaluation is used.

Manopt.AverageGradient – Type
AverageGradient <: DirectionUpdateRule

Add an average of gradients to a gradient processor. A set of previous directions (from the inner processor) and the last iterate are stored; the average is taken after vector transporting them to the current iterate's tangent space.

Fields

• gradients – (fill(zero_vector(M,x0),n)) the last n gradient/direction updates
• last_iterate – last iterate (needed to transport the gradients)
• direction – internal DirectionUpdateRule to determine the directions to apply the averaging to
• vector_transport_method – vector transport method to use

Constructors

AverageGradient(
    x0,
    n::Int=10,
    s::DirectionUpdateRule=IdentityUpdateRule();
    last_iterate=deepcopy(x0),
    vector_transport_method=ParallelTransport(),
)

Add an average to a gradient problem; n determines the number of directions to average, and gradients can be prefilled with some history. Equivalently, you can also use a Manifold M instead of the GradientProblem p.

AverageGradient(
    x0,
    n::Int=10,
    s::DirectionUpdateRule=IdentityUpdateRule();
    last_iterate=deepcopy(x0),
    vector_transport_method=ParallelTransport(),
)

Add an average to a stochastic gradient problem; n determines the number of directions to average, and gradients can be prefilled with some history.

Manopt.Nesterov – Type
Nesterov <: DirectionUpdateRule

Fields

• γ
• μ – the strong convexity coefficient
• v – (=$v_k$, $v_0=x_0$) an interim point used to compute the next gradient evaluation point $y_k$
• shrinkage – (= i -> 0.8) a function to compute the shrinkage $β_k$ per iterate

Let's assume $f$ is $L$-Lipschitz and $μ$-strongly convex. Given

• a step size $h_k<\frac{1}{L}$ (from the GradientDescentOptions),
• a shrinkage parameter $β_k$,
• a current iterate $x_k$,
• as well as the interim values $γ_k$ and $v_k$ from the previous iterate,

this rule computes a Nesterov-type update using the following steps, see [ZhangSra2018]:

1. Compute the positive root $α_k∈(0,1)$ of $α_k^2 = h_k\bigl((1-α_k)γ_k+α_kμ\bigr)$.
2. Set $\bar γ_{k+1} = (1-α_k)γ_k + α_kμ$.
3. $y_k = \operatorname{retr}_{x_k}\Bigl(\frac{α_kγ_k}{γ_k + α_kμ}\operatorname{retr}^{-1}_{x_k}v_k \Bigr)$
4. $x_{k+1} = \operatorname{retr}_{y_k}(-h_k \operatorname{grad}f(y_k))$
5. $v_{k+1} = \operatorname{retr}_{y_k}\Bigl(\frac{(1-α_k)γ_k}{\bar γ_{k+1}}\operatorname{retr}_{y_k}^{-1}(v_k) - \frac{α_k}{\bar γ_{k+1}}\operatorname{grad}f(y_k) \Bigr)$
6. $γ_{k+1} = \frac{1}{1+β_k}\bar γ_{k+1}$

Then the direction from $x_k$ to $x_{k+1}$, i.e. $d = \operatorname{retr}^{-1}_{x_k}x_{k+1}$, is returned.

Constructor

Nesterov(x0::P, γ=0.001, μ=0.9, shrinkage = k -> 0.8;
inverse_retraction_method=LogarithmicInverseRetraction())

Initialize the Nesterov acceleration, where x0 initializes v.


## Debug Actions

Manopt.DebugGradient – Type
DebugGradient <: DebugAction

debug for the gradient evaluated at the current iterate

Constructors

DebugGradient(; long=false, prefix= , format= "$prefix%s", io=stdout)

display the short (false) or long (true) default text for the gradient, or set the prefix manually. Alternatively, the complete format can be set.

Manopt.DebugGradientNorm – Type

DebugGradientNorm <: DebugAction

debug for the norm of the gradient evaluated at the current iterate.

Constructors

DebugGradientNorm([long=false, p=print])

display the short (false) or long (true) default text for the gradient norm.

DebugGradientNorm(prefix[, p=print])

display a prefix in front of the gradient norm.

Manopt.DebugStepsize – Type

DebugStepsize <: DebugAction

debug for the current step size.

Constructors

DebugStepsize(; long=false, prefix="step size:", format="$prefix%s", io=stdout)

display a prefix in front of the step size.


## Record Actions

• ZhangSra2018

H. Zhang, S. Sra: Towards Riemannian Accelerated Gradient Methods, preprint, 2018. arXiv:1806.02812