Stepsize and line search
Most iterative algorithms determine a direction along which the algorithm shall proceed and determine a step size to find the next iterate. How advanced the step size computation can be implemented depends (among others) on the properties the corresponding problem provides.
Within Manopt.jl
, the step size determination is implemented as a functor
which is a subtype of Stepsize
based on
Manopt.Stepsize
— TypeStepsize
An abstract type for the functors representing step sizes. These are callable structures. The naming scheme is TypeOfStepSize
, for example ConstantStepsize
.
Every Stepsize has to provide a constructor and its function has to have the interface (p,o,i)
where a AbstractManoptProblem
as well as AbstractManoptSolverState
and the current number of iterations are the arguments and returns a number, namely the stepsize to use.
For most it is adviable to employ a ManifoldDefaultsFactory
. Then the function creating the factory should either be called TypeOf
or if that is confusing or too generic, TypeOfLength
See also
Usually, a constructor should take the manifold M
as its first argument, for consistency, to allow general step size functors to be set up based on default values that might depend on the manifold currently under consideration.
Currently, the following step sizes are available
Manopt.AdaptiveWNGradient
— FunctionAdaptiveWNGradient(; kwargs...)
AdaptiveWNGradient(M::AbstractManifold; kwargs...)
A stepsize based on the adaptive gradient method introduced by [GS23].
Given a positive threshold $\hat{c} ∈ ℕ$, an minimal bound $b_{\text{min}} > 0$, an initial $b_0 ≥ b_{\text{min}}$, and a gradient reduction factor threshold $α ∈ [0,1)$.
Set $c_0=0$ and use $ω_0 = \lVert \operatorname{grad} f(p_0) \rVert_{p_0}$.
For the first iterate use the initial step size $s_0 = \frac{1}{b_0}$.
Then, given the last gradient $X_{k-1} = \operatorname{grad} f(x_{k-1})$, and a previous $ω_{k-1}$, the values $(b_k, ω_k, c_k)$ are computed using $X_k = \operatorname{grad} f(p_k)$ and the following cases
If $\lVert X_k \rVert_{p_k} ≤ αω_{k-1}$, then let $\hat{b}_{k-1} ∈ [b_{\text{min}},b_{k-1}]$ and set
\[(b_k, ω_k, c_k) = \begin{cases} \bigl(\hat{b}_{k-1}, \lVert X_k \rVert_{p_k}, 0 \bigr) & \text{ if } c_{k-1}+1 = \hat{c}\\\\ \bigl( b_{k-1} + \frac{\lVert X_k \rVert_{p_k}^2}{b_{k-1}}, ω_{k-1}, c_{k-1}+1 \Bigr) & \text{ if } c_{k-1}+1<\hat{c}\end{cases}\]
If $\lVert X_k \rVert_{p_k} > αω_{k-1}$, the set
\[(b_k, ω_k, c_k) = \Bigl( b_{k-1} + \frac{\lVert X_k \rVert_{p_k}^2}{b_{k-1}}, ω_{k-1}, 0 \Bigr)\]
and return the step size $s_k = \frac{1}{b_k}$.
Note that for $α=0$ this is the Riemannian variant of WNGRad
.
Keyword arguments
adaptive=true
: switches thegradient_reduction
α
(if
true) to
0`.alternate_bound = (bk, hat_c) -> min(gradient_bound == 0 ? 1.0 : gradient_bound, max(minimal_bound, bk / (3 * hat_c))
: how to determine $\hat{k}_k$ as a function of(bmin, bk, hat_c) -> hat_bk
count_threshold=4
: anInteger
for $\hat{c}$gradient_reduction::R=adaptive ? 0.9 : 0.0
: the gradient reduction factor threshold $α ∈ [0,1)$gradient_bound=norm(M, p, X)
: the bound $b_k$.minimal_bound=1e-4
: the value $b_{\text{min}}$p=
rand
(M)
: a point on the manifold $\mathcal M$only used to define thegradient_bound
X=
zero_vector
(M, p)
: a tangent vector at the point $p$ on the manifold $\mathcal M$only used to define thegradient_bound
Manopt.ArmijoLinesearch
— FunctionArmijoLinesearch(; kwargs...)
ArmijoLinesearch(M::AbstractManifold; kwargs...)
Specify a step size that performs an Armijo line search. Given a Function $f:\mathcal M→ℝ$ and its Riemannian Gradient $\operatorname{grad}f: \mathcal M→T\mathcal M$, the curent point $p∈\mathcal M$ and a search direction $X∈T_{p}\mathcal M$.
Then the step size $s$ is found by reducing the initial step size $s$ until
\[f(\operatorname{retr}_p(sX)) ≤ f(p) - τs ⟨ X, \operatorname{grad}f(p) ⟩_p\]
is fulfilled. for a sufficient decrease value $τ ∈ (0,1)$.
To be a bit more optimistic, if $s$ already fulfils this, a first search is done, increasing the given $s$ until for a first time this step does not hold.
Overall, we look for step size, that provides enough decrease, see [Bou23, p. 58] for more information.
Keyword arguments
additional_decrease_condition=(M, p) -> true
: specify an additional criterion that has to be met to accept a step size in the decreasing loopadditional_increase_condition::IF=(M, p) -> true
: specify an additional criterion that has to be met to accept a step size in the (initial) increase loopcandidate_point=allocate_result(M, rand)
: speciy a point to be used as memory for the candidate points.contraction_factor=0.95
: how to update $s$ in the decrease stepinitial_stepsize=1.0
`: specify an initial step sizeinitial_guess=
armijo_initial_guess
: Compute the initial step size of a line search based on this function. The funtion required is(p,s,k,l) -> α
and computes the initial step size $α$ based on aAbstractManoptProblem
p
,AbstractManoptSolverState
s
, the current iteratek
and a last step sizel
.retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstop_when_stepsize_less=0.0
: a safeguard, stop when the decreasing step is below this (nonnegative) bound.stop_when_stepsize_exceeds=max_stepsize(M)
: a safeguard to not choose a too long step size when initially increasingstop_increasing_at_step=100
: stop the initial increasing loop after this amount of steps. Set to0
to never increase in the beginningstop_decreasing_at_step=1000
: maximal number of Armijo decreases / tests to performsufficient_decrease=0.1
: the sufficient decrease parameter $τ$
For the stop safe guards you can pass :Messages
to a debug=
to see @info
messages when these happen.
This function generates a ManifoldDefaultsFactory
for ArmijoLinesearchStepsize
. For default values, that depend on the manifold, this factory postpones the construction until the manifold from for example a corresponding AbstractManoptSolverState
is available.
Manopt.ConstantLength
— FunctionConstantLength(s; kwargs...)
ConstantLength(M::AbstractManifold, s; kwargs...)
Specify a Stepsize
that is constant.
Input
M
(optional)
s=min( injectivity_radius(M)/2, 1.0)
: the length to use.
Keyword argument
type::Symbol=relative
specify the type of constant step size.:relative
– scale the gradient tangent vector $X$ to $s*X$:absolute
– scale the gradient to an absolute step length $s$, that is $\frac{s}{\lVert X \rVert_{}}X$
This function generates a ManifoldDefaultsFactory
for ConstantStepsize
. For default values, that depend on the manifold, this factory postpones the construction until the manifold from for example a corresponding AbstractManoptSolverState
is available.
Manopt.DecreasingLength
— FunctionDegreasingLength(; kwargs...)
DecreasingLength(M::AbstractManifold; kwargs...)
Specify a [Stepsize
] that is decreasing as ``s_k = \frac{(l - ak)f^i}{(k+s)^e} with the following
Keyword arguments
exponent=1.0
: the exponent $e$ in the denominatorfactor=1.0
: the factor $f$ in the nominatorlength=min(injectivity_radius(M)/2, 1.0)
: the initial step size $l$.subtrahend=0.0
: a value $a$ that is subtracted every iterationshift=0.0
: shift the denominator iterator $k$ by $s$.type::Symbol=relative
specify the type of constant step size.:relative
– scale the gradient tangent vector $X$ to $s_k*X$:absolute
– scale the gradient to an absolute step length $s_k$, that is $\frac{s_k}{\lVert X \rVert_{}}X$
This function generates a ManifoldDefaultsFactory
for DecreasingStepsize
. For default values, that depend on the manifold, this factory postpones the construction until the manifold from for example a corresponding AbstractManoptSolverState
is available.
Manopt.NonmonotoneLinesearch
— FunctionNonmonotoneLinesearch(; kwargs...)
NonmonotoneLinesearch(M::AbstractManifold; kwargs...)
A functor representing a nonmonotone line search using the Barzilai-Borwein step size [IP17].
This method first computes
(x -> p, F-> f)
\[y_{k} = \operatorname{grad}f(p_{k}) - \mathcal T_{p_k←p_{k-1}}\operatorname{grad}f(p_{k-1})\]
and
\[s_{k} = - α_{k-1} ⋅ \mathcal T_{p_k←p_{k-1}}\operatorname{grad}f(p_{k-1}),\]
where $α_{k-1}$ is the step size computed in the last iteration and $\mathcal T_{⋅←⋅}$ is a vector transport. Then the Barzilai—Borwein step size is
\[α_k^{\text{BB}} = \begin{cases} \min(α_{\text{max}}, \max(α_{\text{min}}, τ_{k})), & \text{if} ⟨s_{k}, y_{k}⟩_{p_k} > 0,\\\\ α_{\text{max}}, & \text{else,}\end{cases}\]
where
\[τ_{k} = \frac{⟨s_{k}, s_{k}⟩_{p_k}}{⟨s_{k}, y_{k}⟩_{p_k}},\]
if the direct strategy is chosen, or
\[τ_{k} = \frac{⟨s_{k}, y_{k}⟩_{p_k}}{⟨y_{k}, y_{k}⟩_{p_k}},\]
in case of the inverse strategy or an alternation between the two in cases for the alternating strategy. Then find the smallest $h = 0, 1, 2, …$ such that
\[f(\operatorname{retr}_{p_k}(- σ^h α_k^{\text{BB}} \operatorname{grad}f(p_k))) ≤ \max_{1 ≤ j ≤ \max(k+1,m)} f(p_{k+1-j}) - γ σ^h α_k^{\text{BB}} ⟨\operatorname{grad}F(p_k), \operatorname{grad}F(p_k)⟩_{p_k},\]
where $σ ∈ (0,1)$ is a step length reduction factor , $m$ is the number of iterations after which the function value has to be lower than the current one and $γ ∈ (0,1)$ is the sufficient decrease parameter. Finally the step size is computed as
\[α_k = σ^h α_k^{\text{BB}}.\]
Keyword arguments
p=
rand
(M)
: a point on the manifold $\mathcal M$to store an interim resultp=allocate_result(M, rand)
: to store an interim resultinitial_stepsize=1.0
: the step size to start the search withmemory_size=10
: number of iterations after which the cost value needs to be lower than the current onebb_min_stepsize=1e-3
: lower bound for the Barzilai-Borwein step size greater than zerobb_max_stepsize=1e3
: upper bound for the Barzilai-Borwein step size greater than min_stepsizeretraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstrategy=direct
: defines if the new step size is computed using the:direct
,:indirect
or:alternating
strategystorage=
StoreStateAction
(M; store_fields=[:Iterate, :Gradient])
: increase efficiency by using aStoreStateAction
for:Iterate
and:Gradient
.stepsize_reduction=0.5
: step size reduction factor contained in the interval $(0,1)$sufficient_decrease=1e-4
: sufficient decrease parameter contained in the interval $(0,1)$stop_when_stepsize_less=0.0
: smallest stepsize when to stop (the last one before is taken)stop_when_stepsize_exceeds=
max_stepsize
(M, p)
): largest stepsize when to stop to avoid leaving the injectivity radiusstop_increasing_at_step=100
: last step to increase the stepsize (phase 1),stop_decreasing_at_step=1000
: last step size to decrease the stepsize (phase 2),
Manopt.Polyak
— FunctionPolyak(; kwargs...)
Polyak(M::AbstractManifold; kwargs...)
Compute a step size according to a method propsed by Polyak, cf. the Dynamic step size discussed in Section 3.2 of [Ber15]. This has been generalised here to both the Riemannian case and to approximate the minimum cost value.
Let $f_{\text{best}$ be the best cost value seen until now during some iterative optimisation algorithm and let $γ_k$ be a sequence of numbers that is square summable, but not summable.
Then the step size computed here reads
\[s_k = \frac{f(p^{(k)}) - f_{\text{best} + γ_k}{\lVert ∂f(p^{(k)})} \rVert_{}},\]
where $∂f$ denotes a nonzero-subgradient of $f$ at the current iterate $p^{(k)}$.
Constructor
Polyak(; γ = k -> 1/k, initial_cost_estimate=0.0)
initialize the Polyak stepsize to a certain sequence and an initial estimate of $f_{ ext{best}}$.
This function generates a ManifoldDefaultsFactory
for PolyakStepsize
. For default values, that depend on the manifold, this factory postpones the construction until the manifold from for example a corresponding AbstractManoptSolverState
is available.
Manopt.WolfePowellLinesearch
— FunctionWolfePowellLinesearch(; kwargs...)
WolfePowellLinesearch(M::AbstractManifold; kwargs...)
Perform a lineseach to fulfull both the Armijo-Goldstein conditions
\[f\bigl( \operatorname{retr}_{p}(αX) \bigr) ≤ f(p) + c_1 α_k ⟨\operatorname{grad} f(p), X⟩_{p}\]
as well as the Wolfe conditions
\[\frac{\mathrm{d}}{\mathrm{d}t} f\bigl(\operatorname{retr}_{p}(tX)\bigr) \Big\vert_{t=α} ≥ c_2 \frac{\mathrm{d}}{\mathrm{d}t} f\bigl(\operatorname{retr}_{p}(tX)\bigr)\Big\vert_{t=0}.\]
for some given sufficient decrease coefficient $c_1$ and some sufficient curvature condition coefficient$c_2$.
This is adopted from [NW06, Section 3.1]
Keyword arguments
sufficient_decrease=10^(-4)
sufficient_curvature=0.999
p::P
: a point on the manifold $\mathcal M$as temporary storage for candidatesX::T
: a tangent vector at the point $p$ on the manifold $\mathcal M$as type of memory allocated for the candidates direction and tangentmax_stepsize=
max_stepsize
(M, p)
: largest stepsize allowed here.retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstop_when_stepsize_less=0.0
: smallest stepsize when to stop (the last one before is taken)vector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Manopt.WolfePowellBinaryLinesearch
— FunctionWolfePowellBinaryLinesearch(; kwargs...)
WolfePowellBinaryLinesearch(M::AbstractManifold; kwargs...)
Perform a lineseach to fulfull both the Armijo-Goldstein conditions for some given sufficient decrease coefficient $c_1$ and some sufficient curvature condition coefficient$c_2$. Compared to WolfePowellLinesearch
which tries a simpler method, this linesearch performs the following algorithm
With
\[A(t) = f(p_+) ≤ c_1 t ⟨\operatorname{grad}f(p), X⟩_{x} \quad\text{ and }\quad W(t) = ⟨\operatorname{grad}f(x_+), \mathcal T_{p_+←p}X⟩_{p_+} ≥ c_2 ⟨X, \operatorname{grad}f(x)⟩_x,\]
where $p_+ =\operatorname{retr}_p(tX)$ is the current trial point, and $\mathcal T_{⋅←⋅}$ denotes a vector transport. Then the following Algorithm is performed similar to Algorithm 7 from [Hua14]
- set $α=0$, $β=∞$ and $t=1$.
- While either $A(t)$ does not hold or $W(t)$ does not hold do steps 3-5.
- If $A(t)$ fails, set $β=t$.
- If $A(t)$ holds but $W(t)$ fails, set $α=t$.
- If $β<∞$ set $t=\frac{α+β}{2}$, otherwise set $t=2α$.
Keyword arguments
sufficient_decrease=10^(-4)
sufficient_curvature=0.999
max_stepsize=
max_stepsize
(M, p)
: largest stepsize allowed here.retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstop_when_stepsize_less=0.0
: smallest stepsize when to stop (the last one before is taken)vector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Some step sizes use max_stepsize
function as a rough upper estimate for the trust region size. It is by default equal to injectivity radius of the exponential map but in some cases a different value is used. For the FixedRankMatrices
manifold an estimate from Manopt is used. Tangent bundle with the Sasaki metric has 0 injectivity radius, so the maximum stepsize of the underlying manifold is used instead. Hyperrectangle
also has 0 injectivity radius and an estimate based on maximum of dimensions along each index is used instead. For manifolds with corners, however, a line search capable of handling break points along the projected search direction should be used, and such algorithms do not call max_stepsize
.
Internally these step size functions create a ManifoldDefaultsFactory
. Internally these use
Manopt.armijo_initial_guess
— Methodarmijo_initial_guess(mp::AbstractManoptProblem, s::AbstractManoptSolverState, k, l)
Input
mp
: theAbstractManoptProblem
we are aiminig to minimizes
: theAbstractManoptSolverState
for the current solverk
: the current iterationl
: the last step size computed in the previous iteration.
Return an initial guess for the ArmijoLinesearchStepsize
.
The default provided is based on the max_stepsize
(M)
, which we denote by $m$. Let further $X$ be the current descent direction with norm $n=\lVert X \rVert_{p}$ its length. Then this (default) initial guess returns
- $l$ if $m$ is not finite
- $\min(l, \frac{m}{n})$ otherwise
This ensures that the initial guess does not yield to large (initial) steps.
Manopt.default_stepsize
— Methoddefault_stepsize(M::AbstractManifold, ams::AbstractManoptSolverState)
Returns the default Stepsize
functor used when running the solver specified by the AbstractManoptSolverState
ams
running with an objective on the AbstractManifold M
.
Manopt.get_last_stepsize
— Methodget_last_stepsize(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, vars...)
return the last computed stepsize stored within AbstractManoptSolverState
ams
when solving the AbstractManoptProblem
amp
.
This method takes into account that ams
might be decorated. In case this returns NaN
, a concrete call to the stored stepsize is performed. For this, usually, the first of the vars...
should be the current iterate.
Manopt.get_last_stepsize
— Methodget_last_stepsize(::Stepsize, vars...)
return the last computed stepsize from within the stepsize. If no last step size is stored, this returns NaN
.
Manopt.get_stepsize
— Methodget_stepsize(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, vars...)
return the stepsize stored within AbstractManoptSolverState
ams
when solving the AbstractManoptProblem
amp
. This method also works for decorated options and the Stepsize
function within the options, by default stored in ams.stepsize
.
Manopt.linesearch_backtrack!
— Method(s, msg) = linesearch_backtrack!(M, q, F, p, X, s, decrease, contract η = -X, f0 = f(p))
Perform a line search backtrack in-place of q
. For all details and options, see linesearch_backtrack
Manopt.linesearch_backtrack
— Method(s, msg) = linesearch_backtrack(M, F, p, X, s, decrease, contract η = -X, f0 = f(p); kwargs...)
(s, msg) = linesearch_backtrack!(M, q, F, p, X, s, decrease, contract η = -X, f0 = f(p); kwargs...)
perform a line search
- on manifold
M
- for the cost function
f
, - at the current point
p
- with current gradient provided in
X
- an initial stepsize
s
- a sufficient
decrease
- a
contract
ion factor $σ$ - a search direction $η = -X$
- an offset, $f_0 = F(x)$
Keyword arguments
retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstop_when_stepsize_less=0.0
: to avoid numerical underflowstop_when_stepsize_exceeds=
max_stepsize
(M, p) / norm(M, p, η)
) to avoid leaving the injectivity radius on a manifoldstop_increasing_at_step=100
: stop the initial increase of step size after these many stepsstop_decreasing_at_step=
1000`: stop the decreasing search after these many stepsadditional_increase_condition=(M,p) -> true
: impose an additional condition for an increased step size to be acceptedadditional_decrease_condition=(M,p) -> true
: impose an additional condition for an decreased step size to be accepted
These keywords are used as safeguards, where only the max stepsize is a very manifold specific one.
Return value
A stepsize s
and a message msg
(in case any of the 4 criteria hit)
Manopt.max_stepsize
— Methodmax_stepsize(M::AbstractManifold, p)
max_stepsize(M::AbstractManifold)
Get the maximum stepsize (at point p
) on manifold M
. It should be used to limit the distance an algorithm is trying to move in a single step.
By default, this returns injectivity_radius
(M)
, if this exists. If this is not available on the the method returns Inf
.
Manopt.AdaptiveWNGradientStepsize
— TypeAdaptiveWNGradientStepsize{I<:Integer,R<:Real,F<:Function} <: Stepsize
A functor problem, state, k, X) -> s to an adaptive gradient method introduced by [GrapigliaStella:2023](@cite). See [
AdaptiveWNGradient`](@ref) for the mathematical details.
Fields
count_threshold::I
: anInteger
for $\hat{c}$minimal_bound::R
: the value for $b_{\text{min}}$alternate_bound::F
: how to determine $\hat{k}_k$ as a function of(bmin, bk, hat_c) -> hat_bk
gradient_reduction::R
: the gradient reduction factor threshold $α ∈ [0,1)$gradient_bound::R
: the bound $b_k$.weight::R
: $ω_k$ initialised to $ω_0 =$norm(M, p, X)
if this is not zero,1.0
otherwise.count::I
: $c_k$, initialised to $c_0 = 0$.
Constructor
AdaptiveWNGrad(M::AbstractManifold; kwargs...)
Keyword arguments
adaptive=true
: switches thegradient_reduction
α
(if
true) to
0`.alternate_bound = (bk, hat_c) -> min(gradient_bound == 0 ? 1.0 : gradient_bound, max(minimal_bound, bk / (3 * hat_c))
count_threshold=4
gradient_reduction::R=adaptive ? 0.9 : 0.0
gradient_bound=norm(M, p, X)
minimal_bound=1e-4
p=
rand
(M)
: a point on the manifold $\mathcal M$only used to define thegradient_bound
X=
zero_vector
(M, p)
: a tangent vector at the point $p$ on the manifold $\mathcal M$only used to define thegradient_bound
Manopt.ArmijoLinesearchStepsize
— TypeArmijoLinesearchStepsize <: Linesearch
A functor problem, state, k, X) -> s to provide an Armijo line search to compute step size, based on the search direction
X`
Fields
candidate_point
: to store an interim resultinitial_stepsize
: and initial step sizeretraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionscontraction_factor
: exponent for line search reductionsufficient_decrease
: gain within Armijo's rulelast_stepsize
: the last step size to start the search withinitial_guess
: a function to provide an initial guess for the step size, it maps(m,p,k,l) -> α
based on aAbstractManoptProblem
p
,AbstractManoptSolverState
s
, the current iteratek
and a last step sizel
. It returns the initial guessα
.additional_decrease_condition
: specify a condition a new point has to additionally fulfill. The default accepts all points.additional_increase_condition
: specify a condtion that additionally to checking a valid increase has to be fulfilled. The default accepts all points.stop_when_stepsize_less
: smallest stepsize when to stop (the last one before is taken)stop_when_stepsize_exceeds
: largest stepsize when to stop.stop_increasing_at_step
: last step to increase the stepsize (phase 1),stop_decreasing_at_step
: last step size to decrease the stepsize (phase 2),
Pass :Messages
to a debug=
to see @info
s when these happen.
Constructor
ArmijoLinesearchStepsize(M::AbstractManifold; kwarg...)
with the fields keyword arguments and the retraction is set to the default retraction on M
.
Keyword arguments
candidate_point=
(allocate_result(M, rand)
)initial_stepsize=1.0
retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionscontraction_factor=0.95
sufficient_decrease=0.1
last_stepsize=initialstepsize
initial_guess=
armijo_initial_guess
– (p,s,i,l) -> l
stop_when_stepsize_less=0.0
: stop when the stepsize decreased below this version.stop_when_stepsize_exceeds=[
max_step](@ref)
(M)`: provide an absolute maximal step size.stop_increasing_at_step=100
: for the initial increase test, stop after these many stepsstop_decreasing_at_step=1000
: in the backtrack, stop after these many steps
Manopt.ConstantStepsize
— TypeConstantStepsize <: Stepsize
A functor (problem, state, ...) -> s
to provide a constant step size s
.
Fields
length
: constant value for the step sizetype
: a symbol that indicates whether the stepsize is relatively (:relative), with respect to the gradient norm, or absolutely (:absolute) constant.
Constructors
ConstantStepsize(s::Real, t::Symbol=:relative)
initialize the stepsize to a constant s
of type t
.
ConstantStepsize(
M::AbstractManifold=DefaultManifold(),
s=min(1.0, injectivity_radius(M)/2);
type::Symbol=:relative
)
Manopt.DecreasingStepsize
— TypeDecreasingStepsize()
A functor (problem, state, ...) -> s
to provide a constant step size s
.
Fields
exponent
: a value $e$ the current iteration numbers $e$th exponential is taken offactor
: a value $f$ to multiply the initial step size with every iterationlength
: the initial step size $l$.subtrahend
: a value $a$ that is subtracted every iterationshift
: shift the denominator iterator $i$ by $s$`.type
: a symbol that indicates whether the stepsize is relatively (:relative), with respect to the gradient norm, or absolutely (:absolute) constant.
In total the complete formulae reads for the $i$th iterate as
\[s_i = \frac{(l - i a)f^i}{(i+s)^e}\]
and hence the default simplifies to just $s_i = rac{l}{i}$
Constructor
DecreasingStepsize(M::AbstractManifold;
length=min(injectivity_radius(M)/2, 1.0),
factor=1.0,
subtrahend=0.0,
exponent=1.0,
shift=0.0,
type=:relative,
)
initializes all fields, where none of them is mandatory and the length is set to half and to $1$ if the injectivity radius is infinite.
Manopt.Linesearch
— TypeLinesearch <: Stepsize
An abstract functor to represent line search type step size determinations, see Stepsize
for details. One example is the ArmijoLinesearchStepsize
functor.
Compared to simple step sizes, the line search functors provide an interface of the form (p,o,i,X) -> s
with an additional (but optional) fourth parameter to provide a search direction; this should default to something reasonable, most prominently the negative gradient.
Manopt.NonmonotoneLinesearchStepsize
— TypeNonmonotoneLinesearchStepsize{P,T,R<:Real} <: Linesearch
A functor representing a nonmonotone line search using the Barzilai-Borwein step size [IP17].
Fields
initial_stepsize=1.0
: the step size to start the search withmemory_size=10
: number of iterations after which the cost value needs to be lower than the current onebb_min_stepsize=1e-3
: lower bound for the Barzilai-Borwein step size greater than zerobb_max_stepsize=1e3
: upper bound for the Barzilai-Borwein step size greater than min_stepsizeretraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstrategy=direct
: defines if the new step size is computed using the:direct
,:indirect
or:alternating
strategystorage
: (for:Iterate
and:Gradient
) aStoreStateAction
stepsize_reduction
: step size reduction factor contained in the interval (0,1)sufficient_decrease
: sufficient decrease parameter contained in the interval (0,1)vector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transportscandidate_point
: to store an interim resultstop_when_stepsize_less
: smallest stepsize when to stop (the last one before is taken)stop_when_stepsize_exceeds
: largest stepsize when to stop.stop_increasing_at_step
: last step to increase the stepsize (phase 1),stop_decreasing_at_step
: last step size to decrease the stepsize (phase 2),
Constructor
NonmonotoneLinesearchStepsize(M::AbstractManifold; kwargs...)
Keyword arguments
p=allocate_result(M, rand)
: to store an interim resultinitial_stepsize=1.0
memory_size=10
bb_min_stepsize=1e-3
bb_max_stepsize=1e3
retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstrategy=direct
storage=[
StoreStateAction](@ref)
(M; store_fields=[:Iterate, :Gradient])``stepsize_reduction=0.5
sufficient_decrease=1e-4
stop_when_stepsize_less=0.0
stop_when_stepsize_exceeds=
max_stepsize
(M, p)
)stop_increasing_at_step=100
stop_decreasing_at_step=1000
vector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Manopt.PolyakStepsize
— TypePolyakStepsize <: Stepsize
A functor (problem, state, ...) -> s
to provide a step size due to Polyak, cf. Section 3.2 of [Ber15].
Fields
γ
: a functionk -> ...
representing a seuqnce.best_cost_value
: storing the best cost value
Constructor
PolyakStepsize(;
γ = i -> 1/i,
initial_cost_estimate=0.0
)
Construct a stepsize of Polyak type.
See also
Manopt.WolfePowellBinaryLinesearchStepsize
— TypeWolfePowellBinaryLinesearchStepsize{R} <: Linesearch
Do a backtracking line search to find a step size $α$ that fulfils the Wolfe conditions along a search direction $X$ starting from $p$. See WolfePowellBinaryLinesearch
for the math details.
Fields
sufficient_decrease::R
,sufficient_curvature::R
two constants in the line searchlast_stepsize::R
max_stepsize::R
retraction_method::AbstractRetractionMethod
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstop_when_stepsize_less::R
: a safeguard to stop when the stepsize gets too smallvector_transport_method::AbstractVectorTransportMethodP
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Keyword arguments
sufficient_decrease=10^(-4)
sufficient_curvature=0.999
max_stepsize=
max_stepsize
(M, p)
: largest stepsize allowed here.retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstop_when_stepsize_less=0.0
: smallest stepsize when to stop (the last one before is taken)vector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Manopt.WolfePowellLinesearchStepsize
— TypeWolfePowellLinesearchStepsize{R<:Real} <: Linesearch
Do a backtracking line search to find a step size $α$ that fulfils the Wolfe conditions along a search direction $X$ starting from $p$. See WolfePowellLinesearch
for the math details
Fields
sufficient_decrease::R
,sufficient_curvature::R
two constants in the line searchcandidate_direction::T
: a tangent vector at the point $p$ on the manifold $\mathcal M$candidate_point::P
: a point on the manifold $\mathcal M$as temporary storage for candidatescandidate_tangent::T
: a tangent vector at the point $p$ on the manifold $\mathcal M$last_stepsize::R
max_stepsize::R
retraction_method::AbstractRetractionMethod
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstop_when_stepsize_less::R
: a safeguard to stop when the stepsize gets too smallvector_transport_method::AbstractVectorTransportMethodP
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Keyword arguments
sufficient_decrease=10^(-4)
sufficient_curvature=0.999
p::P
: a point on the manifold $\mathcal M$as temporary storage for candidatesX::T
: a tangent vector at the point $p$ on the manifold $\mathcal M$as type of memory allocated for the candidates direction and tangentmax_stepsize=
max_stepsize
(M, p)
: largest stepsize allowed here.retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstop_when_stepsize_less=0.0
: smallest stepsize when to stop (the last one before is taken)vector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Some solvers have a different iterate from the one used for the line search. Then the following state can be used to wrap these locally
Manopt.StepsizeState
— TypeStepsizeState{P,T} <: AbstractManoptSolverState
A state to store a point and a descent direction used within a linesearch, if these are different from the iterate and search direction of the main solver.
Fields
p::P
: a point on a manifoldX::T
: a tangent vector atp
.
Constructor
StepsizeState(p,X)
StepsizeState(M::AbstractManifold; p=rand(M), x=zero_vector(M,p)
See also
Literature
- [Ber15]
- D. P. Bertsekas. Convex Optimization Algorithms (Athena Scientific, 2015); p. 576.
- [Bou23]
- N. Boumal. An Introduction to Optimization on Smooth Manifolds. First Edition (Cambridge University Press, 2023).
- [GS23]
- G. N. Grapiglia and G. F. Stella. An Adaptive Riemannian Gradient Method Without Function Evaluations. Journal of Optimization Theory and Applications 197, 1140–1160 (2023).
- [Hua14]
- W. Huang. Optimization algorithms on Riemannian manifolds with applications. Ph.D. Thesis, Flordia State University (2014).
- [IP17]
- B. Iannazzo and M. Porcelli. The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation. IMA Journal of Numerical Analysis 38, 495–517 (2017).
- [NW06]
- J. Nocedal and S. J. Wright. Numerical Optimization. 2 Edition (Springer, New York, 2006).