Stepsize and Linesearch
Most iterative algorithms determine a direction along which the algorithm proceeds and a step size to find the next iterate. How sophisticated the step size computation can be depends, among other things, on the properties the corresponding problem provides.
Within Manopt.jl, the step size determination is implemented as a functor which is a subtype of Stepsize.

Manopt.Stepsize — Type

Stepsize

An abstract type for the functors representing step sizes, i.e. they are callable structures. The naming scheme is TypeOfStepSize, e.g. ConstantStepsize.
Every Stepsize has to provide a constructor, and its function has to have the interface (p,o,i), where an AbstractManoptProblem as well as an AbstractManoptSolverState and the current number of iterations are the arguments; it returns a number, namely the step size to use.

See also Linesearch.
Usually, a constructor should take the manifold M
as its first argument, for consistency, to allow general step size functors to be set up based on default values that might depend on the manifold currently under consideration.
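For illustration, here is a minimal sketch of such a custom step size functor; the name InverseSqrtStepsize and its decay rule are made up for this example and are not part of Manopt.jl.

```julia
using Manopt, Manifolds

# A hypothetical custom step size s_i = length/sqrt(i), implementing
# the (problem, state, iteration) interface described above.
struct InverseSqrtStepsize <: Stepsize
    length::Float64
end
# take the manifold as the first argument, as recommended above
InverseSqrtStepsize(M::AbstractManifold) = InverseSqrtStepsize(
    isinf(injectivity_radius(M)) ? 1.0 : injectivity_radius(M) / 2,
)
function (s::InverseSqrtStepsize)(
    amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i, args...; kwargs...
)
    return s.length / sqrt(max(i, 1))
end
```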
Currently, the following step sizes are available
Manopt.AdaptiveWNGradient — Type

AdaptiveWNGradient <: DirectionUpdateRule
Represent an adaptive gradient method introduced by Grapiglia, Stella, J. Optim. Theory Appl., 2023 [GS23].

Given a positive threshold $\hat c \in \mathbb N$, a minimal bound $b_{\mathrm{min}} > 0$, an initial $b_0 ≥ b_{\mathrm{min}}$, and a gradient reduction factor threshold $α \in [0,1)$.
Set $c_0=0$ and use $\omega_0 = \lVert \operatorname{grad} f(p_0) \rVert_{p_0}$.

For the first iterate we use the initial step size $s_0 = \frac{1}{b_0}$.
Then, given the last gradient $X_{k-1} = \operatorname{grad} f(x_{k-1})$ and a previous $\omega_{k-1}$, the values $(b_k, \omega_k, c_k)$ are computed using $X_k = \operatorname{grad} f(p_k)$ and the following cases:

If $\lVert X_k \rVert_{p_k} \leq \alpha\omega_{k-1}$, then let $\hat b_{k-1} \in [b_{\mathrm{min}},b_{k-1}]$ and set

\[(b_k, \omega_k, c_k) = \begin{cases} \bigl(\hat b_{k-1}, \lVert X_k\rVert_{p_k}, 0 \bigr) & \text{ if } c_{k-1}+1 = \hat c\\ \Bigl(b_{k-1} + \frac{\lVert X_k\rVert_{p_k}^2}{b_{k-1}}, \omega_{k-1}, c_{k-1}+1 \Bigr) & \text{ if } c_{k-1}+1<\hat c \end{cases}\]

If $\lVert X_k \rVert_{p_k} > \alpha\omega_{k-1}$, then set

\[(b_k, \omega_k, c_k) = \Bigl( b_{k-1} + \frac{\lVert X_k\rVert_{p_k}^2}{b_{k-1}}, \omega_{k-1}, 0 \Bigr)\]
and return the step size $s_k = \frac{1}{b_k}$.
Note that for $α=0$ this is the Riemannian variant of WNGRad.
Fields

- count_threshold::Int (4) an Integer for $\hat c$
- minimal_bound::Float64 (1e-4) for $b_{\mathrm{min}}$
- alternate_bound::Function ((bk, hat_c) -> min(gradient_bound, max(gradient_bound, bk/(3*hat_c)))) how to determine $\hat b_k$ as a function of (bmin, bk, hat_c) -> hat_bk
- gradient_reduction::Float64 (0.9)
- gradient_bound (norm(M, p0, grad_f(M,p0))) the bound $b_k$.

as well as the internal fields

- weight for $ω_k$, initialised to $ω_0 =$ norm(M, p0, grad_f(M,p0)) if this is not zero, 1.0 otherwise.
- count for the $c_k$, initialised to $c_0 = 0$.
Constructor

AdaptiveWNGrad(M=DefaultManifold, grad_f=(M,p) -> zero_vector(M, rand(M)), p=rand(M); kwargs...)

where all fields above with defaults are keyword arguments. Additional keyword arguments are

- adaptive (true) switches the gradient_reduction α to 0.
- evaluation (AllocatingEvaluation()) specifies whether the gradient (that is used for initialisation only) is mutating or allocating.
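To make the case distinction above concrete, here is a plain-Julia sketch of one $(b_k, ω_k, c_k)$ update; the function name and the concrete choice of $\hat b_{k-1}$ are illustrative, not Manopt.jl API.

```julia
# One update of (b, ω, c), given norm_Xk = ‖X_k‖_{p_k}; returns the new
# triple. The choice b̂ = clamp(b/(3ĉ), b_min, b) is one valid b̂ ∈ [b_min, b].
function wngrad_update(norm_Xk, b, ω, c; α=0.9, ĉ=4, b_min=1e-4)
    if norm_Xk <= α * ω                   # sufficient gradient reduction
        if c + 1 == ĉ                     # counter hits the threshold: reset
            return (clamp(b / (3ĉ), b_min, b), norm_Xk, 0)
        end
        return (b + norm_Xk^2 / b, ω, c + 1)
    end
    return (b + norm_Xk^2 / b, ω, 0)      # no reduction: reset the counter
end
# the step size for this iterate is then s_k = 1 / b_k
```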
Manopt.ArmijoLinesearch — Type

ArmijoLinesearch <: Linesearch

A functor representing Armijo line search including the state of the last run, i.e. the last step size.
Fields

- initial_stepsize – (1.0) an initial step size
- retraction_method – (default_retraction_method(M)) the retraction to use
- contraction_factor – (0.95) exponent for the line search reduction
- sufficient_decrease – (0.1) gain within Armijo's rule
- last_stepsize – (initial_stepsize) the last step size we start the search with
- initial_guess – ((p,s,i,l) -> l) based on an AbstractManoptProblem p, an AbstractManoptSolverState s, a current iterate i and a last step size l, this returns an initial guess. The default uses the last obtained step size.
Furthermore the following fields act as safeguards

- stop_when_stepsize_less – (0.0) smallest step size when to stop (the last one before is taken)
- stop_when_stepsize_exceeds – (max_stepsize(M, p)) largest step size when to stop.
- stop_increasing_at_step – (100) last step to increase the step size (phase 1),
- stop_decreasing_at_step – (1000) last step to decrease the step size (phase 2),

Pass :Messages to a debug= to see @infos when these safeguards trigger.
Constructor
ArmijoLinesearch(M=DefaultManifold())
with the fields above as keyword arguments, where the retraction is set to the default retraction on M.
The constructors return the functor to perform Armijo line search, where two interfaces are available:

- based on a tuple (amp, ams, i) of an AbstractManoptProblem amp, an AbstractManoptSolverState ams, and a current iterate i.
- with (M, x, F, gradFx[, η=-gradFx]) -> s, where the manifold M, a current point x, a function F that maps from the manifold to the reals, its gradient (a tangent vector) gradFx $=\operatorname{grad}F(x)$ at x, and an optional search direction tangent vector η=-gradFx are the arguments.
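As a usage sketch (the quadratic cost and its gradient here are illustrative placeholders, not from the documentation above):

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = Diagonal([2.0, 1.0, 0.0])
f(M, p) = p' * A * p                       # cost to minimize on the sphere
grad_f(M, p) = project(M, p, 2A * p)       # project the Euclidean gradient
p0 = normalize([1.0, 1.0, 1.0])

ls = ArmijoLinesearch(M; contraction_factor=0.95, sufficient_decrease=0.1)
q = gradient_descent(M, f, grad_f, p0; stepsize=ls)
```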
Manopt.ConstantStepsize — Type

ConstantStepsize <: Stepsize
A functor that always returns a fixed step size.
Fields

- length – constant value for the step size
- type – a symbol that indicates whether the step size is relative (:relative), i.e. scaled with respect to the gradient norm, or absolute (:absolute).
Constructors
ConstantStepsize(s::Real, t::Symbol=:relative)
initialize the stepsize to a constant s
of type t
.
ConstantStepsize(M::AbstractManifold=DefaultManifold(2);
stepsize=injectivity_radius(M)/2, type::Symbol=:relative
)
initialize the step size to a constant stepsize, which by default is half the injectivity radius, unless that radius is infinite, in which case the default step size is 1.
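For example (a small sketch; the manifold choice is arbitrary):

```julia
using Manopt, Manifolds

M = Sphere(2)                              # injectivity radius π
s_abs = ConstantStepsize(0.1, :absolute)   # always return 0.1
s_rel = ConstantStepsize(M)                # π/2, scaled by the gradient norm
```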
Manopt.DecreasingStepsize — Type

DecreasingStepsize()

A functor that represents several decreasing step sizes.
Fields

- length – (1) the initial step size $l$.
- factor – (1) a value $f$ to multiply the initial step size with every iteration
- subtrahend – (0) a value $a$ that is subtracted every iteration
- exponent – (1) a value $e$, the $e$th power of the current iteration number is taken
- shift – (0) shift the denominator iterator $i$ by $s$.
- type – a symbol that indicates whether the step size is relative (:relative), i.e. scaled with respect to the gradient norm, or absolute (:absolute).
In total, the complete formula for the $i$th iterate reads

\[s_i = \frac{(l - i a)f^i}{(i+s)^e}\]

and hence the default simplifies to just $s_i = \frac{l}{i}$.
Constructor
DecreasingStepsize(l=1,f=1,a=0,e=1,s=0,type=:relative)
Alternatively, one can also use the following keyword-based constructor.
DecreasingStepsize(
M::AbstractManifold=DefaultManifold(3);
length=injectivity_radius(M)/2, multiplier=1.0, subtrahend=0.0,
exponent=1.0, shift=0, type=:relative
)
initializes all fields above, where none of them is mandatory and the length is set to half the injectivity radius, or to $1$ if the injectivity radius is infinite.
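The formula is easy to check in plain Julia (the function name is illustrative):

```julia
# s_i = ((l - i*a) * f^i) / (i + s)^e, cf. the formula above
decreasing_step(i; l=1.0, f=1.0, a=0.0, e=1.0, s=0) =
    ((l - i * a) * f^i) / (i + s)^e

decreasing_step(4)   # == 0.25, the default case s_i = l/i
```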
Manopt.Linesearch — Type

Linesearch <: Stepsize

An abstract functor to represent line search type step size determinations, see Stepsize for details. One example is the ArmijoLinesearch functor.

Compared to simple step sizes, the line search functors provide an interface of the form (p,o,i,η) -> s with an additional (but optional) fourth parameter to provide a search direction; this should default to something reasonable, e.g. the negative gradient.
Manopt.NonmonotoneLinesearch — Type

NonmonotoneLinesearch <: Linesearch

A functor representing a nonmonotone line search using the Barzilai–Borwein step size Iannazzo, Porcelli, IMA J. Numer. Anal., 2017 [IP17]. Together with a gradient descent algorithm this line search represents the Riemannian Barzilai–Borwein with nonmonotone line search (RBBNMLS) algorithm. We shifted the order of the algorithm steps from the paper by Iannazzo and Porcelli so that in each iteration we first find
\[y_{k} = \operatorname{grad}F(x_{k}) - \operatorname{T}_{x_{k-1} → x_k}(\operatorname{grad}F(x_{k-1}))\]
and
\[s_{k} = - α_{k-1} \operatorname{T}_{x_{k-1} → x_k}(\operatorname{grad}F(x_{k-1})),\]

where $α_{k-1}$ is the step size computed in the last iteration and $\operatorname{T}$ is a vector transport. We then find the Barzilai–Borwein step size
\[α_k^{\text{BB}} = \begin{cases} \min(α_{\text{max}}, \max(α_{\text{min}}, τ_{k})), & \text{if } ⟨s_{k}, y_{k}⟩_{x_k} > 0,\\ α_{\text{max}}, & \text{else,} \end{cases}\]
where
\[τ_{k} = \frac{⟨s_{k}, s_{k}⟩_{x_k}}{⟨s_{k}, y_{k}⟩_{x_k}},\]
if the direct strategy is chosen,
\[τ_{k} = \frac{⟨s_{k}, y_{k}⟩_{x_k}}{⟨y_{k}, y_{k}⟩_{x_k}},\]
in case of the inverse strategy and an alternation between the two in case of the alternating strategy. Then we find the smallest $h = 0, 1, 2, …$ such that
\[F(\operatorname{retr}_{x_k}(- σ^h α_k^{\text{BB}} \operatorname{grad}F(x_k))) \leq \max_{1 ≤ j ≤ \min(k+1,m)} F(x_{k+1-j}) - γ σ^h α_k^{\text{BB}} ⟨\operatorname{grad}F(x_k), \operatorname{grad}F(x_k)⟩_{x_k},\]
where $σ$ is a step length reduction factor $∈ (0,1)$, $m$ is the number of iterations after which the function value has to be lower than the current one and $γ$ is the sufficient decrease parameter $∈(0,1)$. We can then find the new stepsize by
\[α_k = σ^h α_k^{\text{BB}}.\]
Fields

- initial_stepsize – (1.0) the step size we start the search with
- memory_size – (10) number of iterations after which the cost value needs to be lower than the current one
- bb_min_stepsize – (1e-3) lower bound for the Barzilai–Borwein step size, greater than zero
- bb_max_stepsize – (1e3) upper bound for the Barzilai–Borwein step size, greater than bb_min_stepsize
- retraction_method – (ExponentialRetraction()) the retraction to use
- strategy – (direct) defines if the new step size is computed using the direct, indirect or alternating strategy
- storage – (for :Iterate and :Gradient) a StoreStateAction
- stepsize_reduction – (0.5) step size reduction factor contained in the interval (0,1)
- sufficient_decrease – (1e-4) sufficient decrease parameter contained in the interval (0,1)
- vector_transport_method – (ParallelTransport()) the vector transport method to use
Furthermore the following fields act as safeguards

- stop_when_stepsize_less – (0.0) smallest step size when to stop (the last one before is taken)
- stop_when_stepsize_exceeds – (max_stepsize(M, p)) largest step size when to stop.
- stop_increasing_at_step – (100) last step to increase the step size (phase 1),
- stop_decreasing_at_step – (1000) last step to decrease the step size (phase 2),

Pass :Messages to a debug= to see @infos when these safeguards trigger.
Constructor

NonmonotoneLinesearch()

with the fields above in their order as optional arguments (deprecated).

NonmonotoneLinesearch(M)

with the fields above in their order as keyword arguments, where the retraction and the vector transport are set to the default ones on M, respectively.

The constructors return the functor to perform nonmonotone line search.
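A plain-Julia sketch of the Barzilai–Borwein candidate step $α_k^{\text{BB}}$ from the direct strategy above (the function name is illustrative, not Manopt.jl API):

```julia
using Manifolds  # provides inner(M, p, X, Y)

# sk, yk are tangent vectors at p; clamp τ_k into [α_min, α_max] when
# ⟨s_k, y_k⟩_{x_k} > 0, otherwise fall back to α_max, as in the cases above.
function bb_direct_step(M, p, sk, yk; α_min=1e-3, α_max=1e3)
    sy = inner(M, p, sk, yk)
    return sy > 0 ? clamp(inner(M, p, sk, sk) / sy, α_min, α_max) : α_max
end
```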
Manopt.WolfePowellBinaryLinesearch — Type

WolfePowellBinaryLinesearch <: Linesearch

A Linesearch method that determines a step size t fulfilling the Wolfe conditions based on a binary chop. Let $η$ be a search direction and $c_1, c_2 > 0$ be two constants. Then with

\[A(t) = f(x_+) ≤ f(x) + c_1 t ⟨\operatorname{grad}f(x), η⟩_{x} \quad\text{and}\quad W(t) = ⟨\operatorname{grad}f(x_+), \text{V}_{x_+\gets x}η⟩_{x_+} ≥ c_2 ⟨η, \operatorname{grad}f(x)⟩_x,\]

where $x_+ = \operatorname{retr}_x(tη)$ is the current trial point and $\text{V}$ is a vector transport, we perform the following algorithm, similar to Algorithm 7 from Huang, Thesis, 2014 [Hua14]:
1. Set $α=0$, $β=∞$ and $t=1$.
2. While either $A(t)$ does not hold or $W(t)$ does not hold, do steps 3–5.
3. If $A(t)$ fails, set $β=t$.
4. If $A(t)$ holds but $W(t)$ fails, set $α=t$.
5. If $β<∞$, set $t=\frac{α+β}{2}$, otherwise set $t=2α$.
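A plain-Julia sketch of steps 1–5, where A and W are predicates implementing the two conditions above (the function name and the iteration cap are illustrative):

```julia
function binary_chop_stepsize(A, W; max_iterations=100)
    α, β, t = 0.0, Inf, 1.0                  # step 1
    for _ in 1:max_iterations                # step 2
        A(t) && W(t) && return t             # both Wolfe conditions hold
        if !A(t)
            β = t                            # step 3: step too long
        else
            α = t                            # step 4: A holds, W fails
        end
        t = isfinite(β) ? (α + β) / 2 : 2α   # step 5
    end
    return t
end
```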
Constructors

There exist two constructors, where, when providing the manifold M as a first (optional) parameter, its default retraction and vector transport are used. In this case the retraction and the vector transport are also keyword arguments for ease of use. The other constructor is kept for backward compatibility.

WolfePowellBinaryLinesearch(
    M=DefaultManifold(),
    c1::Float64=10^(-4),
    c2::Float64=0.999;
    retraction_method = default_retraction_method(M),
    vector_transport_method = default_vector_transport_method(M),
    linesearch_stopsize = 0.0
)
Manopt.WolfePowellLinesearch — Type

WolfePowellLinesearch <: Linesearch

Do a backtracking line search to find a step size $α$ that fulfils the Wolfe conditions along a search direction $η$ starting from $x$, i.e.

\[f\bigl( \operatorname{retr}_x(αη) \bigr) ≤ f(x) + c_1 α ⟨\operatorname{grad}f(x), η⟩_x \quad\text{and}\quad \frac{\mathrm{d}}{\mathrm{d}t} f\bigl(\operatorname{retr}_x(tη)\bigr) \Big\vert_{t=α} ≥ c_2 \frac{\mathrm{d}}{\mathrm{d}t} f\bigl(\operatorname{retr}_x(tη)\bigr)\Big\vert_{t=0}.\]
Constructors
There exist two constructors, where, when providing the manifold M as a first (optional) parameter, its default retraction and vector transport are used. In this case the retraction and the vector transport are also keyword arguments for ease of use. The other constructor is kept for backward compatibility. Note that the linesearch_stopsize to stop for too small step sizes is only available in the new signature including M.

WolfePowellLinesearch(
    M,
    c1::Float64=10^(-4),
    c2::Float64=0.999;
    retraction_method = default_retraction_method(M),
    vector_transport_method = default_vector_transport_method(M),
    linesearch_stopsize = 0.0
)
Manopt.default_stepsize — Method

default_stepsize(M::AbstractManifold, ams::AbstractManoptSolverState)
Returns the default Stepsize functor used when running the solver specified by the AbstractManoptSolverState ams with an objective on the AbstractManifold M.
Manopt.get_stepsize — Method

get_stepsize(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, vars...)
Return the step size stored within the AbstractManoptSolverState ams when solving the AbstractManoptProblem amp. This method also works for decorated options and the Stepsize function within the options, by default stored in o.stepsize.
Manopt.linesearch_backtrack — Method

(s, msg) = linesearch_backtrack(
    M, F, x, gradFx, s, decrease, contract, retr, η = -gradFx, f0 = F(x);
    stop_when_stepsize_less = 0.0,
    stop_when_stepsize_exceeds = max_stepsize(M, p),
    stop_increasing_at_step = 100,
    stop_decreasing_at_step = 1000,
)
perform a line search for

- a manifold M,
- a cost function F,
- an iterate x,
- the gradient $\operatorname{grad}F(x)$,
- an initial stepsize s, usually called $γ$,
- a sufficient decrease,
- a contraction factor $σ$,
- a retraction retr, which defaults to the default_retraction_method(M),
- a search direction $η = -\operatorname{grad}F(x)$,
- an offset $f_0 = F(x)$.

Use the 4 keywords to limit the maximal increase and decrease steps as well as a maximal step size (especially on non-Hadamard manifolds) and a minimal one.
Return value

A stepsize s and a message msg (in case any of the 4 criteria hit).
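A small usage sketch following the positional signature above (the cost, point, and gradient are illustrative placeholders):

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
F(M, p) = p[3]                           # minimize the height on the sphere
p = normalize([1.0, 0.0, 1.0])
gradFp = project(M, p, [0.0, 0.0, 1.0])  # Riemannian gradient of F at p

# positional arguments: initial stepsize 1.0, decrease 0.1, contraction 0.95
s, msg = Manopt.linesearch_backtrack(
    M, F, p, gradFp, 1.0, 0.1, 0.95, ExponentialRetraction()
)
```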
Manopt.max_stepsize — Method

max_stepsize(M::AbstractManifold, p)
max_stepsize(M::AbstractManifold)
Get the maximum stepsize (at point p) on manifold M. It should be used to limit the distance an algorithm is trying to move in a single step.
Literature

- [GS23] G. N. Grapiglia and G. F. Stella. An Adaptive Riemannian Gradient Method Without Function Evaluations. Journal of Optimization Theory and Applications 197, 1140–1160 (2023), preprint: https://optimization-online.org/wp-content/uploads/2022/04/8864.pdf.
- [Hua14] W. Huang. Optimization algorithms on Riemannian manifolds with applications. PhD thesis, Florida State University (2014).
- [IP17] B. Iannazzo and M. Porcelli. The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation. IMA Journal of Numerical Analysis 38, 495–517 (2017).