Augmented Lagrangian method
Manopt.augmented_Lagrangian_method
— Functionaugmented_Lagrangian_method(M, f, grad_f, p=rand(M); kwargs...)
augmented_Lagrangian_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
augmented_Lagrangian_method!(M, f, grad_f, p; kwargs...)
augmented_Lagrangian_method!(M, cmo::ConstrainedManifoldObjective, p; kwargs...)
perform the augmented Lagrangian method (ALM) [LB19]. This method can work in-place of p
.
The aim of the ALM is to find the solution of the constrained optimisation task
\[\begin{aligned} \operatorname*{arg\,min}_{p ∈ \mathcal M} & f(p)\\ \text{subject to}\quad&g_i(p) ≤ 0 \quad \text{ for } i= 1, …, m,\\ \quad & h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
where M
is a Riemannian manifold, and $f$, $\{g_i\}_{i=1}^{n}$ and $\{h_j\}_{j=1}^{m}$ are twice continuously differentiable functions from M
to ℝ. In every step $k$ of the algorithm, the AugmentedLagrangianCost
$\mathcal L_{ρ^{(k)}}(p, μ^{(k)}, λ^{(k)})$ is minimized on \mathcal M, where $μ^{(k)} ∈ ℝ^n$ and $λ^{(k)} ∈ ℝ^m$ are the current iterates of the Lagrange multipliers and $ρ^{(k)}$ is the current penalty parameter.
The Lagrange multipliers are then updated by
\[λ_j^{(k+1)} =\operatorname{clip}_{[λ_{\min},λ_{\max}]} (λ_j^{(k)} + ρ^{(k)} h_j(p^{(k+1)})) \text{for all} j=1,…,p,\]
and
\[μ_i^{(k+1)} =\operatorname{clip}_{[0,μ_{\max}]} (μ_i^{(k)} + ρ^{(k)} g_i(p^{(k+1)})) \text{ for all } i=1,…,m,\]
where $λ_{\text{min}} ≤ λ_{\text{max}}$ and $μ_{\text{max}}$ are the multiplier boundaries.
Next, the accuracy tolerance $ϵ$ is updated as
\[ϵ^{(k)}=\max\{ϵ_{\min}, θ_ϵ ϵ^{(k-1)}\},\]
where $ϵ_{\text{min}}$ is the lowest value $ϵ$ is allowed to become and $θ_ϵ ∈ (0,1)$ is constant scaling factor.
Last, the penalty parameter $ρ$ is updated as follows: with
\[σ^{(k)}=\max_{j=1,…,p, i=1,…,m} \{\|h_j(p^{(k)})\|, \|\max_{i=1,…,m}\{g_i(p^{(k)}), -\frac{μ_i^{(k-1)}}{ρ^{(k-1)}} \}\| \}.\]
ρ
is updated as
\[ρ^{(k)} = \begin{cases} ρ^{(k-1)}/θ_ρ, & \text{if } σ^{(k)}\leq θ_ρ σ^{(k-1)} ,\\ ρ^{(k-1)}, & \text{else,} \end{cases}\]
where $θ_ρ ∈ (0,1)$ is a constant scaling factor.
Input
M::
AbstractManifold
: a Riemannian manifold $\mathcal M$
f
: a cost function $f: \mathcal M→ ℝ$ implemented as(M, p) -> v
grad_f
: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of f as a function(M, p) -> X
or a function(M, X, p) -> X
computingX
in-place
Optional (if not called with the ConstrainedManifoldObjective
cmo
)
g=nothing
: the inequality constraintsh=nothing
: the equality constraintsgrad_g=nothing
: the gradient of the inequality constraintsgrad_h=nothing
: the gradient of the equality constraints
Note that one of the pairs (g
, grad_g
) or (h
, grad_h
) has to be provided. Otherwise the problem is not constrained and a better solver would be for example quasi_Newton
.
Keyword Arguments
evaluation=
AllocatingEvaluation
()
: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation
) or whether they modify their input argument to return the result therein (InplaceEvaluation
). Since usually the first argument is the manifold, the modified argument is the second.ϵ=1e-3
: the accuracy toleranceϵ_min=1e-6
: the lower bound for the accuracy toleranceϵ_exponent=1/100
: exponent of the ϵ update factor; also 1/number of iterations until maximal accuracy is needed to end algorithm naturallyequality_constraints=nothing
: the number $n$ of equality constraints.
If not provided, a call to the gradient of
g
is performed to estimate these.gradient_range=nothing
: specify how both gradients of the constraints are representedgradient_equality_range=gradient_range
: specify how gradients of the equality constraints are represented, seeVectorGradientFunction
.gradient_inequality_range=gradient_range
: specify how gradients of the inequality constraints are represented, seeVectorGradientFunction
.inequality_constraints=nothing
: the number $m$ of inequality constraints. If not provided, a call to the gradient ofg
is performed to estimate these.λ=ones(size(h(M,x),1))
: the Lagrange multiplier with respect to the equality constraintsλ_max=20.0
: an upper bound for the Lagrange multiplier belonging to the equality constraintsλ_min=- λ_max
: a lower bound for the Lagrange multiplier belonging to the equality constraintsμ=ones(size(h(M,x),1))
: the Lagrange multiplier with respect to the inequality constraintsμ_max=20.0
: an upper bound for the Lagrange multiplier belonging to the inequality constraintsρ=1.0
: the penalty parameterτ=0.8
: factor for the improvement of the evaluation of the penalty parameterθ_ρ=0.3
: the scaling factor of the penalty parameterθ_ϵ=(ϵ_min / ϵ)^(ϵ_exponent)
: the scaling factor of the exactnesssub_cost=[
AugmentedLagrangianCost± (@ref)(cmo, ρ, μ, λ):
use augmented Lagrangian cost, based on theConstrainedManifoldObjective
build from the functions provided. This is used to define thesub_problem=
keyword and has hence no effect, if you setsub_problem
directly.sub_grad=[
AugmentedLagrangianGrad](@ref)
(cmo, ρ, μ, λ): use augmented Lagrangian gradient, based on the [
ConstrainedManifoldObjective](@ref) build from the functions provided. This is used to define the
subproblem=keyword and has hence no effect, if you set
subproblem` directly.sub_kwargs=
(;)
: a named tuple of keyword arguments that are passed todecorate_objective!
of the sub solvers objective, thedecorate_state!
of the subsovlers state, and the sub state constructor itself.stopping_criterion=
StopAfterIteration
(300)
|
(`StopWhenSmallerOrEqual(:ϵ, ϵ_min)
&
StopWhenChangeLess
(1e-10) )[
|](@ref StopWhenAny)[
StopWhenChangeLess](@ref)
: a functor indicating that the stopping criterion is fulfilledsub_problem=
DefaultManoptProblem
(M, sub_objective)
: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state=
QuasiNewtonState
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.as the quasi newton method, theQuasiNewtonLimitedMemoryDirectionUpdate
withInverseBFGS
is used.`substoppingcriterion::StoppingCriterion=StopAfterIteration(300) | StopWhenGradientNormLess(ϵ) | StopWhenStepsizeLess(1e-8),
For the range
s of the constraints' gradient, other power manifold tangent space representations, mainly the ArrayPowerRepresentation
can be used if the gradients can be computed more efficiently in that representation.
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
Manopt.augmented_Lagrangian_method!
— Functionaugmented_Lagrangian_method(M, f, grad_f, p=rand(M); kwargs...)
augmented_Lagrangian_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
augmented_Lagrangian_method!(M, f, grad_f, p; kwargs...)
augmented_Lagrangian_method!(M, cmo::ConstrainedManifoldObjective, p; kwargs...)
perform the augmented Lagrangian method (ALM) [LB19]. This method can work in-place of p
.
The aim of the ALM is to find the solution of the constrained optimisation task
\[\begin{aligned} \operatorname*{arg\,min}_{p ∈ \mathcal M} & f(p)\\ \text{subject to}\quad&g_i(p) ≤ 0 \quad \text{ for } i= 1, …, m,\\ \quad & h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
where M
is a Riemannian manifold, and $f$, $\{g_i\}_{i=1}^{n}$ and $\{h_j\}_{j=1}^{m}$ are twice continuously differentiable functions from M
to ℝ. In every step $k$ of the algorithm, the AugmentedLagrangianCost
$\mathcal L_{ρ^{(k)}}(p, μ^{(k)}, λ^{(k)})$ is minimized on \mathcal M, where $μ^{(k)} ∈ ℝ^n$ and $λ^{(k)} ∈ ℝ^m$ are the current iterates of the Lagrange multipliers and $ρ^{(k)}$ is the current penalty parameter.
The Lagrange multipliers are then updated by
\[λ_j^{(k+1)} =\operatorname{clip}_{[λ_{\min},λ_{\max}]} (λ_j^{(k)} + ρ^{(k)} h_j(p^{(k+1)})) \text{for all} j=1,…,p,\]
and
\[μ_i^{(k+1)} =\operatorname{clip}_{[0,μ_{\max}]} (μ_i^{(k)} + ρ^{(k)} g_i(p^{(k+1)})) \text{ for all } i=1,…,m,\]
where $λ_{\text{min}} ≤ λ_{\text{max}}$ and $μ_{\text{max}}$ are the multiplier boundaries.
Next, the accuracy tolerance $ϵ$ is updated as
\[ϵ^{(k)}=\max\{ϵ_{\min}, θ_ϵ ϵ^{(k-1)}\},\]
where $ϵ_{\text{min}}$ is the lowest value $ϵ$ is allowed to become and $θ_ϵ ∈ (0,1)$ is constant scaling factor.
Last, the penalty parameter $ρ$ is updated as follows: with
\[σ^{(k)}=\max_{j=1,…,p, i=1,…,m} \{\|h_j(p^{(k)})\|, \|\max_{i=1,…,m}\{g_i(p^{(k)}), -\frac{μ_i^{(k-1)}}{ρ^{(k-1)}} \}\| \}.\]
ρ
is updated as
\[ρ^{(k)} = \begin{cases} ρ^{(k-1)}/θ_ρ, & \text{if } σ^{(k)}\leq θ_ρ σ^{(k-1)} ,\\ ρ^{(k-1)}, & \text{else,} \end{cases}\]
where $θ_ρ ∈ (0,1)$ is a constant scaling factor.
Input
M::
AbstractManifold
: a Riemannian manifold $\mathcal M$
f
: a cost function $f: \mathcal M→ ℝ$ implemented as(M, p) -> v
grad_f
: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of f as a function(M, p) -> X
or a function(M, X, p) -> X
computingX
in-place
Optional (if not called with the ConstrainedManifoldObjective
cmo
)
g=nothing
: the inequality constraintsh=nothing
: the equality constraintsgrad_g=nothing
: the gradient of the inequality constraintsgrad_h=nothing
: the gradient of the equality constraints
Note that one of the pairs (g
, grad_g
) or (h
, grad_h
) has to be provided. Otherwise the problem is not constrained and a better solver would be for example quasi_Newton
.
Keyword Arguments
evaluation=
AllocatingEvaluation
()
: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation
) or whether they modify their input argument to return the result therein (InplaceEvaluation
). Since usually the first argument is the manifold, the modified argument is the second.ϵ=1e-3
: the accuracy toleranceϵ_min=1e-6
: the lower bound for the accuracy toleranceϵ_exponent=1/100
: exponent of the ϵ update factor; also 1/number of iterations until maximal accuracy is needed to end algorithm naturallyequality_constraints=nothing
: the number $n$ of equality constraints.
If not provided, a call to the gradient of
g
is performed to estimate these.gradient_range=nothing
: specify how both gradients of the constraints are representedgradient_equality_range=gradient_range
: specify how gradients of the equality constraints are represented, seeVectorGradientFunction
.gradient_inequality_range=gradient_range
: specify how gradients of the inequality constraints are represented, seeVectorGradientFunction
.inequality_constraints=nothing
: the number $m$ of inequality constraints. If not provided, a call to the gradient ofg
is performed to estimate these.λ=ones(size(h(M,x),1))
: the Lagrange multiplier with respect to the equality constraintsλ_max=20.0
: an upper bound for the Lagrange multiplier belonging to the equality constraintsλ_min=- λ_max
: a lower bound for the Lagrange multiplier belonging to the equality constraintsμ=ones(size(h(M,x),1))
: the Lagrange multiplier with respect to the inequality constraintsμ_max=20.0
: an upper bound for the Lagrange multiplier belonging to the inequality constraintsρ=1.0
: the penalty parameterτ=0.8
: factor for the improvement of the evaluation of the penalty parameterθ_ρ=0.3
: the scaling factor of the penalty parameterθ_ϵ=(ϵ_min / ϵ)^(ϵ_exponent)
: the scaling factor of the exactnesssub_cost=[
AugmentedLagrangianCost± (@ref)(cmo, ρ, μ, λ):
use augmented Lagrangian cost, based on theConstrainedManifoldObjective
build from the functions provided. This is used to define thesub_problem=
keyword and has hence no effect, if you setsub_problem
directly.sub_grad=[
AugmentedLagrangianGrad](@ref)
(cmo, ρ, μ, λ): use augmented Lagrangian gradient, based on the [
ConstrainedManifoldObjective](@ref) build from the functions provided. This is used to define the
subproblem=keyword and has hence no effect, if you set
subproblem` directly.sub_kwargs=
(;)
: a named tuple of keyword arguments that are passed todecorate_objective!
of the sub solvers objective, thedecorate_state!
of the subsovlers state, and the sub state constructor itself.stopping_criterion=
StopAfterIteration
(300)
|
(`StopWhenSmallerOrEqual(:ϵ, ϵ_min)
&
StopWhenChangeLess
(1e-10) )[
|](@ref StopWhenAny)[
StopWhenChangeLess](@ref)
: a functor indicating that the stopping criterion is fulfilledsub_problem=
DefaultManoptProblem
(M, sub_objective)
: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state=
QuasiNewtonState
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.as the quasi newton method, theQuasiNewtonLimitedMemoryDirectionUpdate
withInverseBFGS
is used.`substoppingcriterion::StoppingCriterion=StopAfterIteration(300) | StopWhenGradientNormLess(ϵ) | StopWhenStepsizeLess(1e-8),
For the range
s of the constraints' gradient, other power manifold tangent space representations, mainly the ArrayPowerRepresentation
can be used if the gradients can be computed more efficiently in that representation.
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
State
Manopt.AugmentedLagrangianMethodState
— TypeAugmentedLagrangianMethodState{P,T} <: AbstractManoptSolverState
Describes the augmented Lagrangian method, with
Fields
a default value is given in brackets if a parameter can be left out in initialization.
ϵ
: the accuracy toleranceϵ_min
: the lower bound for the accuracy toleranceλ
: the Lagrange multiplier with respect to the equality constraintsλ_max
: an upper bound for the Lagrange multiplier belonging to the equality constraintsλ_min
: a lower bound for the Lagrange multiplier belonging to the equality constraintsp::P
: a point on the manifold $\mathcal M$storing the current iteratepenalty
: evaluation of the current penalty term, initialized toInf
.μ
: the Lagrange multiplier with respect to the inequality constraintsμ_max
: an upper bound for the Lagrange multiplier belonging to the inequality constraintsρ
: the penalty parametersub_problem::Union{AbstractManoptProblem, F}
: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state::Union{AbstractManoptProblem, F}
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.τ
: factor for the improvement of the evaluation of the penalty parameterθ_ρ
: the scaling factor of the penalty parameterθ_ϵ
: the scaling factor of the accuracy tolerancestop::StoppingCriterion
: a functor indicating that the stopping criterion is fulfilled
Constructor
AugmentedLagrangianMethodState(M::AbstractManifold, co::ConstrainedManifoldObjective,
sub_problem, sub_state; kwargs...
)
construct an augmented Lagrangian method options, where the manifold M
and the ConstrainedManifoldObjective
co
are used for manifold- or objective specific defaults.
AugmentedLagrangianMethodState(M::AbstractManifold, co::ConstrainedManifoldObjective,
sub_problem; evaluation=AllocatingEvaluation(), kwargs...
)
construct an augmented Lagrangian method options, where the manifold M
and the ConstrainedManifoldObjective
co
are used for manifold- or objective specific defaults, and sub_problem
is a closed form solution with evaluation
as type of evaluation.
Keyword arguments
the following keyword arguments are available to initialise the corresponding fields
ϵ=1e–3
ϵ_min=1e-6
λ=ones(n)
:n
is the number of equality constraints in theConstrainedManifoldObjective
co
.λ_max=20.0
λ_min=- λ_max
μ=ones(m)
:m
is the number of inequality constraints in theConstrainedManifoldObjective
co
.μ_max=20.0
p=
rand
(M)
: a point on the manifold $\mathcal M$to specify the initial valueρ=1.0
τ=0.8
θ_ρ=0.3
θ_ϵ=(ϵ_min/ϵ)^(ϵ_exponent)
- stoppingcriterion=
StopAfterIteration
(300)
|
(`StopWhenSmallerOrEqual`(:ϵ, ϵmin)[
&](@ref StopWhenAll)[
StopWhenChangeLess](@ref)
(1e-10) )|
StopWhenChangeLess
`.
See also
Helping functions
Manopt.AugmentedLagrangianCost
— TypeAugmentedLagrangianCost{CO,R,T}
Stores the parameters $ρ ∈ ℝ$, $μ ∈ ℝ^m$, $λ ∈ ℝ^n$ of the augmented Lagrangian associated to the ConstrainedManifoldObjective
co
.
This struct is also a functor (M,p) -> v
that can be used as a cost function within a solver, based on the internal ConstrainedManifoldObjective
it computes
\[\mathcal L_\rho(p, μ, λ) = f(x) + \frac{ρ}{2} \biggl( \sum_{j=1}^n \Bigl( h_j(p) + \frac{λ_j}{ρ} \Bigr)^2 + \sum_{i=1}^m \max\Bigl\{ 0, \frac{μ_i}{ρ} + g_i(p) \Bigr\}^2 \Bigr)\]
Fields
co::CO
,ρ::R
,μ::T
,λ::T
as mentioned in the formula, where $R$ should be the
number type used and $T$ the vector type.
Constructor
AugmentedLagrangianCost(co, ρ, μ, λ)
Manopt.AugmentedLagrangianGrad
— TypeAugmentedLagrangianGrad{CO,R,T} <: AbstractConstrainedFunctor{T}
Stores the parameters $ρ ∈ ℝ$, $μ ∈ ℝ^m$, $λ ∈ ℝ^n$ of the augmented Lagrangian associated to the ConstrainedManifoldObjective
co
.
This struct is also a functor in both formats
(M, p) -> X
to compute the gradient in allocating fashion.(M, X, p)
to compute the gradient in in-place fashion.
additionally this gradient does accept a positional last argument to specify the range
for the internal gradient call of the constrained objective.
based on the internal ConstrainedManifoldObjective
and computes the gradient $(_tex(:grad))$(_tex(:Cal, "L"))_{ρ}(p, μ, λ)
, see also [
AugmentedLagrangianCost`](@ref).
Fields
co::CO
,ρ::R
,μ::T
,λ::T
as mentioned in the formula, where $R$ should be the
number type used and $T$ the vector type.
Constructor
AugmentedLagrangianGrad(co, ρ, μ, λ)
Technical details
The augmented_Lagrangian_method
solver requires the following functions of a manifold to be available
- A
copyto!
(M, q, p)
andcopy
(M,p)
for points. - Everything the subsolver requires, which by default is the
quasi_Newton
method - A
zero_vector
(M,p)
.
Literature
- [LB19]
- C. Liu and N. Boumal. Simple algorithms for optimization on Riemannian manifolds with constraints. Applied Mathematics & Optimization (2019), arXiv:1091.10000.