Conjugate Gradient Descent
Manopt.conjugate_gradient_descent — Function

conjugate_gradient_descent(M, F, ∇F, x)

Perform a conjugate gradient based descent

$x_{k+1} = \operatorname{retr}_{x_k}(s_k\delta_k),$

where $\operatorname{retr}$ denotes a retraction on the Manifold M and one can employ different rules to update the descent direction $\delta_k$ based on the last direction $\delta_{k-1}$ and both gradients $\nabla f(x_k)$, $\nabla f(x_{k-1})$. The Stepsize $s_k$ may be determined by a Linesearch.
Available update rules are SteepestDirectionUpdateRule, which yields a gradient_descent, ConjugateDescentCoefficient, DaiYuanCoefficient, FletcherReevesCoefficient, HagerZhangCoefficient, HeestenesStiefelCoefficient, LiuStoreyCoefficient, and PolakRibiereCoefficient.
They all compute $\beta_k$ such that this algorithm updates the search direction as

$\delta_k = -\nabla f(x_k) + \beta_k P_{x_k\gets x_{k-1}}\delta_{k-1},$

where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the one at $a$.
Input
- M : a manifold $\mathcal M$
- F : a cost function $F\colon\mathcal M\to\mathbb R$ to minimize
- ∇F : the gradient $\nabla F\colon\mathcal M\to T\mathcal M$ of F
- x : an initial value $x\in\mathcal M$
Optional
- coefficient : (SteepestDirectionUpdateRule <: DirectionUpdateRule) rule to compute the descent direction update coefficient $\beta_k$, as a functor, i.e. the resulting function maps (p,o,i) -> β, where p is the current GradientProblem, o are the ConjugateGradientDescentOptions and i is the current iterate.
- retraction_method – (ExponentialRetraction()) a retraction method to use, by default the exponential map
- return_options – (false) – if activated, the extended result, i.e. the complete Options, is returned. This can be used to access recorded values. If set to false (default) just the optimal value x_opt is returned.
- stepsize – (Constant(1.)) a Stepsize function applied to the search direction. The default is a constant step size 1.
- stopping_criterion : (stopWhenAny( stopAtIteration(200), stopGradientNormLess(10.0^-8))) a function indicating when to stop.
- vector_transport_method – (ParallelTransport()) vector transport method to transport the old descent direction when computing the new descent direction.
Output
- x_opt – the resulting (approximately critical) point of the conjugate gradient descent
OR
- options – the options returned by the solver (see return_options)
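As a minimal usage sketch, assuming Manifolds.jl provides the Sphere manifold and that the keyword names above match the installed version, one can minimize a Rayleigh quotient as follows (all names below that do not appear above are part of this hypothetical setup):

using Manopt, Manifolds, LinearAlgebra

n = 5
A = Symmetric(randn(n + 1, n + 1))
M = Sphere(n)                                 # unit sphere in R^{n+1}

F(x) = x' * A * x                             # Rayleigh quotient (up to normalization)
∇F(x) = 2 .* (A * x .- (x' * A * x) .* x)     # Euclidean gradient 2Ax projected onto T_xM

x0 = normalize(randn(n + 1))                  # random starting point on the sphere
x_opt = conjugate_gradient_descent(M, F, ∇F, x0; coefficient=FletcherReevesCoefficient())

# with return_options=true the complete Options are returned instead of x_opt
o = conjugate_gradient_descent(M, F, ∇F, x0; return_options=true)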
Manopt.conjugate_gradient_descent! — Function

conjugate_gradient_descent!(M, F, ∇F, x)

Perform a conjugate gradient based descent in place of x, i.e.

$x_{k+1} = \operatorname{retr}_{x_k}(s_k\delta_k),$

where $\operatorname{retr}$ denotes a retraction on the Manifold M.
Input
- M : a manifold $\mathcal M$
- F : a cost function $F\colon\mathcal M\to\mathbb R$ to minimize
- ∇F : the gradient $\nabla F\colon\mathcal M\to T\mathcal M$ of F
- x : an initial value $x\in\mathcal M$
For more details and options, especially the DirectionUpdateRules, see conjugate_gradient_descent.
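A brief sketch of the in-place call, reusing the hypothetical setup from the example above; the start point is overwritten with the result:

x0 = normalize(randn(n + 1))          # point on the sphere M from the sketch above
conjugate_gradient_descent!(M, F, ∇F, x0)
# x0 now holds the (approximately critical) point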
Options
Manopt.ConjugateGradientDescentOptions — Type

ConjugateGradientDescentOptions <: Options

Specify options for a conjugate gradient descent algorithm that solves a GradientProblem.
Fields
- x – the current iterate, a point on a manifold
- ∇ – the current gradient
- δ – the current descent direction, i.e. also a tangent vector
- β – the current update coefficient, see coefficient
- coefficient – a DirectionUpdateRule function to determine the new β
- stepsize – a Stepsize function
- stop – a StoppingCriterion
- retraction_method – (ExponentialRetraction()) a type of retraction
See also
conjugate_gradient_descent, GradientProblem, ArmijoLinesearch
Available Coefficients
The update rules act as a DirectionUpdateRule, which internally always evaluates the gradient first.
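To illustrate the functor interface described for the coefficient option above, a minimal custom rule might look as follows; this is only a sketch, and everything beyond the documented (p,o,i) -> β signature is an assumption. Returning zero reproduces the behaviour of SteepestDirectionUpdateRule, i.e. plain gradient descent:

struct MyZeroCoefficient <: DirectionUpdateRule end

# called as coefficient(p, o, i): p is the GradientProblem, o the
# ConjugateGradientDescentOptions, i the current iterate number
function (u::MyZeroCoefficient)(p::GradientProblem, o::ConjugateGradientDescentOptions, i)
    return 0.0   # β = 0 discards the previous direction
end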
Manopt.ConjugateDescentCoefficient — Type

ConjugateDescentCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [Fletcher1987] adapted to manifolds:

$\beta_k = \frac{\lVert \xi_{k+1} \rVert_{x_{k+1}}^2}{\langle -\delta_k, \xi_k \rangle_{x_k}}.$

See also conjugate_gradient_descent
Constructor
ConjugateDescentCoefficient(a::StoreOptionsAction=())
Construct the conjugate descent coefficient update rule, a new storage is created by default.
Manopt.DaiYuanCoefficient — Type

DaiYuanCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [DaiYuan1999] adapted to manifolds: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$, where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the one at $a$.
Then the coefficient reads

$\beta_k = \frac{\lVert \xi_{k+1} \rVert_{x_{k+1}}^2}{\langle P_{x_{k+1}\gets x_k}\delta_k, \nu_k \rangle_{x_{k+1}}}.$

See also conjugate_gradient_descent
Constructor
DaiYuanCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=(),
)
Construct the Dai–Yuan coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
Manopt.FletcherReevesCoefficient — Type

FletcherReevesCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [FletcherReeves1964] adapted to manifolds:

$\beta_k = \frac{\lVert \xi_{k+1} \rVert_{x_{k+1}}^2}{\lVert \xi_k \rVert_{x_k}^2}.$

See also conjugate_gradient_descent
Constructor
FletcherReevesCoefficient(a::StoreOptionsAction=())
Construct the Fletcher–Reeves coefficient update rule, a new storage is created by default.
Manopt.HagerZhangCoefficient — Type

HagerZhangCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [HagerZhang2005] adapted to manifolds: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$, where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the one at $a$. Then the coefficient reads

$\beta_k = \Bigl\langle \nu_k - \frac{2\lVert \nu_k \rVert_{x_{k+1}}^2}{\langle P_{x_{k+1}\gets x_k}\delta_k, \nu_k \rangle_{x_{k+1}}} P_{x_{k+1}\gets x_k}\delta_k,\; \frac{\xi_{k+1}}{\langle P_{x_{k+1}\gets x_k}\delta_k, \nu_k \rangle_{x_{k+1}}} \Bigr\rangle_{x_{k+1}}.$

This method includes the numerical stability safeguard proposed by those authors.
See also conjugate_gradient_descent
Constructor
HagerZhangCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=(),
)
Construct the Hager–Zhang coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
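The first constructor argument can be used to exchange the vector transport, for example for a projection-based transport; ProjectionTransport here is assumed to come from ManifoldsBase.jl, and the remaining names reuse the hypothetical setup from the first example:

coeff = HagerZhangCoefficient(ProjectionTransport())   # instead of the default ParallelTransport()
x_opt = conjugate_gradient_descent(M, F, ∇F, x0; coefficient=coeff)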
Manopt.HeestenesStiefelCoefficient — Type

HeestenesStiefelCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [HestenesStiefel1952] adapted to manifolds as follows: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$. Then the update reads

$\beta_k = \frac{\langle \xi_{k+1}, \nu_k \rangle_{x_{k+1}}}{\langle P_{x_{k+1}\gets x_k}\delta_k, \nu_k \rangle_{x_{k+1}}},$

where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the one at $a$.
Constructor
HeestenesStiefelCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=()
)
Construct the Hestenes–Stiefel coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
See also conjugate_gradient_descent
Manopt.LiuStoreyCoefficient — Type

LiuStoreyCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [LiuStorey1991] adapted to manifolds: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$, where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the one at $a$.
Then the coefficient reads

$\beta_k = -\frac{\langle \xi_{k+1}, \nu_k \rangle_{x_{k+1}}}{\langle \delta_k, \xi_k \rangle_{x_k}}.$

See also conjugate_gradient_descent
Constructor
LiuStoreyCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=()
)
Construct the Liu–Storey coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
Manopt.PolakRibiereCoefficient — Type

PolakRibiereCoefficient <: DirectionUpdateRule

Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentOptions o include the last iterates $x_k,\xi_k$, the current iterates $x_{k+1},\xi_{k+1}$ and the last update direction $\delta=\delta_k$, where the last three are stored in the variables with prefix Old, based on [PolakRibiere1969], [Polyak1969] adapted to manifolds: let $\nu_k = \xi_{k+1} - P_{x_{k+1}\gets x_k}\xi_k$, where $P_{a\gets b}(\cdot)$ denotes a vector transport from the tangent space at $b$ to the one at $a$.
Then the update reads

$\beta_k = \frac{\langle \xi_{k+1}, \nu_k \rangle_{x_{k+1}}}{\lVert \xi_k \rVert_{x_k}^2}.$

Constructor
PolakRibiereCoefficient(
    t::AbstractVectorTransportMethod=ParallelTransport(),
    a::StoreOptionsAction=()
)
Construct the Polak–Ribière coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
See also conjugate_gradient_descent
Manopt.SteepestDirectionUpdateRule — Type

SteepestDirectionUpdateRule <: DirectionUpdateRule

The simplest update rule: the last direction has no influence, and hence the update is $\beta = 0$ for all ConjugateGradientDescentOptions o.
See also conjugate_gradient_descent
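For illustration, with this rule the solver takes plain gradient descent steps; reusing the hypothetical setup from the first example:

x_gd = conjugate_gradient_descent(M, F, ∇F, x0; coefficient=SteepestDirectionUpdateRule())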
Literature
- Fletcher1987
R. Fletcher, Practical Methods of Optimization, vol. 1: Unconstrained Optimization, John Wiley & Sons, New York, 1987. doi: 10.1137/1024028
- DaiYuan1999
Y. H. Dai and Y. Yuan, A nonlinear conjugate gradient method with a strong global convergence property, SIAM J. Optim., 10 (1999), pp. 177–182. doi: 10.1137/S1052623497318992
- FletcherReeves1964
R. Fletcher and C. Reeves, Function minimization by conjugate gradients, Comput. J., 7 (1964), pp. 149–154. doi: 10.1093/comjnl/7.2.149
- HagerZhang2005
W. W. Hager and H. Zhang, A new conjugate gradient method with guaranteed descent and an efficient line search, SIAM J. Optim., 16 (2005), pp. 170–192. doi: 10.1137/030601880
- HestenesStiefel1952
M. R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear systems, J. Research Nat. Bur. Standards, 49 (1952), pp. 409–436. doi: 10.6028/jres.049.044
- LiuStorey1991
Y. Liu and C. Storey, Efficient generalized conjugate gradient algorithms, Part 1: Theory, J. Optim. Theory Appl., 69 (1991), pp. 129–137. doi: 10.1007/BF00940464
- PolakRibiere1969
E. Polak and G. Ribière, Note sur la convergence de méthodes de directions conjuguées, ESAIM: Mathematical Modelling and Numerical Analysis - Modélisation Mathématique et Analyse Numérique, Tome 3 (1969), no. R1, pp. 35–43. url: http://www.numdam.org/item/?id=M2AN1969__31350
- Polyak1969
B. T. Polyak, The conjugate gradient method in extreme problems, USSR Comp. Math. Math. Phys., 9 (1969), pp. 94–112. doi: 10.1016/0041-5553(69)90035-4