Conjugate Gradient Descent

Manopt.conjugate_gradient_descent – Function
conjugate_gradient_descent(M, F, ∇F, x)

perform a conjugate gradient based descent

\[x_{k+1} = \operatorname{retr}_{x_k} \bigl( s_k\delta_k \bigr),\]

where $\operatorname{retr}$ denotes a retraction on the manifold M and one can employ different rules to update the descent direction $\delta_k$ based on the last direction $\delta_{k-1}$ and both gradients $\nabla f(x_k)$ and $\nabla f(x_{k-1})$. The step size $s_k$ may be determined by a Stepsize, for example a Linesearch.

Available update rules are SteepestDirectionUpdateRule, which yields a gradient_descent, as well as ConjugateDescentCoefficient, DaiYuanCoefficient, FletcherReevesCoefficient, HagerZhangCoefficient, HeestenesStiefelCoefficient, LiuStoreyCoefficient, and PolakRibiereCoefficient.

They all compute $\beta_k$ such that this algorithm updates the search direction as

\[\delta_k=-\nabla f(x_k) + \beta_k \delta_{k-1}\]

Input

  • M : a manifold $\mathcal M$
  • F : a cost function $F\colon\mathcal M\to\mathbb R$ to minimize
  • ∇F: the gradient $∇ F\colon\mathcal M\to T\mathcal M$ of F
  • x : an initial value $x\in\mathcal M$

Optional

  • coefficient : (SteepestDirectionUpdateRule <: DirectionUpdateRule) a rule to compute the descent direction update coefficient $\beta_k$, as a functor, i.e. the resulting function maps (p,o,i) -> β, where p is the current GradientProblem, o are the ConjugateGradientDescentOptions and i is the current iterate.
  • retraction_method - (ExponentialRetraction) a retraction method to use, by default the exponential map
  • return_options – (false) – if activated, the extended result, i.e. the complete Options, are returned. This can be used to access recorded values. If set to false (default), just the optimal value x_opt is returned.
  • stepsize - (Constant(1.)) A Stepsize function applied to the search direction. The default is a constant step size 1.
  • stopping_criterion : (stopWhenAny( stopAtIteration(200), stopGradientNormLess(10.0^-8))) a function indicating when to stop.
  • vector_transport_method – (ParallelTransport()) vector transport method to transport the old descent direction when computing the new descent direction.

Output

  • x_opt – the resulting (approximately critical) point of the conjugate gradient descent

OR

  • options - the options returned by the solver (see return_options)
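
A minimal usage sketch, assuming the Sphere manifold as well as distance and log from Manifolds.jl (not defined on this page); the cost and gradient below are purely illustrative, and the ArmijoLinesearch mentioned further below replaces the default constant step size:

using Manopt, Manifolds

M = Sphere(2)                    # unit sphere S² as an example manifold
p = [1.0, 0.0, 0.0]              # point to approximate
F(x) = distance(M, x, p)^2 / 2   # cost: half the squared Riemannian distance to p
∇F(x) = -log(M, x, p)            # its Riemannian gradient
x0 = [0.0, 1.0, 0.0]             # initial value on M

x_opt = conjugate_gradient_descent(
    M, F, ∇F, x0;
    coefficient=FletcherReevesCoefficient(),
    stepsize=ArmijoLinesearch(),
)
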
source
Manopt.conjugate_gradient_descent! – Function
conjugate_gradient_descent!(M, F, ∇F, x)

perform a conjugate gradient based descent in place of x, i.e.

\[x_{k+1} = \operatorname{retr}_{x_k} \bigl( s_k\delta_k \bigr),\]

where $\operatorname{retr}$ denotes a retraction on the manifold M.

Input

  • M : a manifold $\mathcal M$
  • F : a cost function $F\colon\mathcal M\to\mathbb R$ to minimize
  • ∇F: the gradient $∇ F\colon\mathcal M\to T\mathcal M$ of F
  • x : an initial value $x\in\mathcal M$

for more details and options, especially the DirectionUpdateRules, see conjugate_gradient_descent.
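
A minimal sketch of the in-place call, reusing M, F, and ∇F from the usage example above; the initial value x is overwritten with the result:

x = [0.0, 1.0, 0.0]                       # initial value, modified in place
conjugate_gradient_descent!(M, F, ∇F, x)  # afterwards x holds the (approximately critical) point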

source

Options

Manopt.ConjugateGradientDescentOptions – Type
ConjugateGradientDescentOptions <: Options

specify options for a conjugate gradient descent algorithm that solves a GradientProblem.

Fields

  • x – the current iterate, a point on a manifold
  • ∇ – the current gradient
  • δ – the current descent direction, i.e. also tangent vector
  • β – the current update coefficient, see coefficient.
  • coefficient – a DirectionUpdateRule function to determine the new β
  • stepsize – a Stepsize function
  • stop – a StoppingCriterion
  • retraction_method – (ExponentialRetraction()) a type of retraction

See also

conjugate_gradient_descent, GradientProblem, ArmijoLinesearch
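
To obtain such an options object from the high-level interface and inspect its fields, the return_options keyword of conjugate_gradient_descent can be used; a sketch, reusing M, F, ∇F, and x0 from the usage example above:

o = conjugate_gradient_descent(M, F, ∇F, x0; return_options=true)
x_opt = o.x   # the resulting iterate, stored in the field x
δ = o.δ       # the last descent direction is available as well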

source

Available Coefficients

The update rules act as a DirectionUpdateRule, which internally always first evaluates the gradient itself.
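
Any of the coefficients below can be handed to conjugate_gradient_descent via the coefficient keyword documented above; a sketch, reusing M, F, ∇F, and x0 from the usage example above:

x_dy = conjugate_gradient_descent(
    M, F, ∇F, x0;
    coefficient=DaiYuanCoefficient(),  # uses ParallelTransport() and a new storage by default
)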

Manopt.ConjugateDescentCoefficient – Type
ConjugateDescentCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [Fletcher1987], adapted to manifolds:

\[\beta_k = \frac{ \lVert \xi_{k+1} \rVert_{x_{k+1}}^2 } {\langle -\delta_k,\xi_k \rangle_{x_k}}.\]

See also conjugate_gradient_descent

Constructor

ConjugateDescentCoefficient(a::StoreOptionsAction=())

Construct the conjugate descent coefficient update rule, a new storage is created by default.

source
Manopt.DaiYuanCoefficient – Type
DaiYuanCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [DaiYuan1999]

adapted to manifolds: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$, where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the tangent space at $a$.

Then the coefficient reads

\[\beta_k = \frac{ \lVert \xi_{k+1} \rVert_{x_{k+1}}^2 } {\langle P_{x_{k+1}\gets x_k}\delta_k, \nu_k \rangle_{x_{k+1}}}.\]
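
For intuition, in the Euclidean special case the transport is the identity and $\nu_k$ is just the gradient difference; a small illustrative computation of this coefficient (all values made up):

using LinearAlgebra

ξ_k = [3.0, 4.0]                    # gradient at x_k
ξ_kp1 = [1.0, 2.0]                  # gradient at x_{k+1}
δ_k = -ξ_k                          # previous search direction (steepest descent start)
ν_k = ξ_kp1 - ξ_k                   # transported gradient difference (identity transport)
β = norm(ξ_kp1)^2 / dot(δ_k, ν_k)   # Dai–Yuan coefficient: 5 / 14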

See also conjugate_gradient_descent

Constructor

DaiYuanCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=(),
)

Construct the Dai Yuan coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.

source
Manopt.FletcherReevesCoefficient – Type
FletcherReevesCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [FletcherReeves1964], adapted to manifolds:

\[\beta_k = \frac{\lVert \xi_{k+1}\rVert_{x_{k+1}}^2}{\lVert \xi_{k}\rVert_{x_{k}}^2}.\]
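
In the Euclidean special case this is just a ratio of squared gradient norms; a small illustrative computation (values made up):

using LinearAlgebra

ξ_k = [3.0, 4.0]                  # gradient at x_k,     ‖ξ_k‖² = 25
ξ_kp1 = [1.0, 2.0]                # gradient at x_{k+1}, ‖ξ_{k+1}‖² = 5
β = norm(ξ_kp1)^2 / norm(ξ_k)^2   # Fletcher–Reeves coefficient: 5/25 = 0.2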

See also conjugate_gradient_descent

Constructor

FletcherReevesCoefficient(a::StoreOptionsAction=())

Construct the Fletcher Reeves coefficient update rule, a new storage is created by default.

source
Manopt.HagerZhangCoefficient – Type
HagerZhangCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [HagerZhang2005], adapted to manifolds: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$, where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the tangent space at $a$.

\[\beta_k = \Bigl\langle\nu_k - \frac{ 2\lVert \nu_k\rVert_{x_{k+1}}^2 }{ \langle P_{x_{k+1}\gets x_k}\delta_k, \nu_k \rangle_{x_{k+1}} } P_{x_{k+1}\gets x_k}\delta_k, \frac{\xi_{k+1}}{ \langle P_{x_{k+1}\gets x_k}\delta_k, \nu_k \rangle_{x_{k+1}} } \Bigr\rangle_{x_{k+1}}.\]

This method includes a numerical stability measure proposed by those authors.
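
In the Euclidean special case (identity transport) the formula above can be spelled out directly; a small illustrative computation without the stability safeguard (values made up):

using LinearAlgebra

ξ_k = [3.0, 4.0]            # gradient at x_k
ξ_kp1 = [1.0, 2.0]          # gradient at x_{k+1}
δ_k = -ξ_k                  # previous search direction
ν_k = ξ_kp1 - ξ_k           # gradient difference (identity transport)
dν = dot(δ_k, ν_k)          # ⟨δ_k, ν_k⟩
β = dot(ν_k - 2 * norm(ν_k)^2 / dν * δ_k, ξ_kp1 / dν)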

See also conjugate_gradient_descent

Constructor

HagerZhangCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=(),
)

Construct the Hager Zhang coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.

source
Manopt.HeestenesStiefelCoefficient – Type
HeestenesStiefelCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [HestenesStiefel1952]

adapted to manifolds as follows: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$. Then the update reads

\[\beta_k = \frac{\langle \xi_{k+1}, \nu_k \rangle_{x_{k+1}} } { \langle P_{x_{k+1}\gets x_k} \delta_k, \nu_k\rangle_{x_{k+1}} },\]

where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the tangent space at $a$.

Constructor

HeestenesStiefelCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=()
)

Construct the Hestenes Stiefel coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.

See also conjugate_gradient_descent

source
Manopt.LiuStoreyCoefficient – Type
LiuStoreyCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [LiuStorey1991], adapted to manifolds: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$, where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the tangent space at $a$.

Then the coefficient reads

\[\beta_k = - \frac{ \langle \xi_{k+1},\nu_k \rangle_{x_{k+1}} } {\langle \delta_k,\xi_k \rangle_{x_k}}.\]

See also conjugate_gradient_descent

Constructor

LiuStoreyCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=()
)

Construct the Liu Storey coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.

source
Manopt.PolakRibiereCoefficient – Type
PolakRibiereCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [PolakRibiere1969][Polyak1969]

adapted to manifolds: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$, where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the tangent space at $a$.

Then the update reads

\[\beta_k = \frac{ \langle \xi_{k+1}, \nu_k \rangle_{x_{k+1}} } {\lVert \xi_k \rVert_{x_k}^2 }.\]

Constructor

PolakRibiereCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=()
)

Construct the PolakRibiere coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.

See also conjugate_gradient_descent

source

Literature