Augmented Lagrangian method

Manopt.augmented_Lagrangian_methodFunction
augmented_Lagrangian_method(M, f, grad_f, p=rand(M); kwargs...)
augmented_Lagrangian_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
augmented_Lagrangian_method!(M, f, grad_f, p; kwargs...)
augmented_Lagrangian_method!(M, cmo::ConstrainedManifoldObjective, p; kwargs...)

perform the augmented Lagrangian method (ALM) [LB19]. This method can work in-place of p.

The aim of the ALM is to find the solution of the constrained optimisation task

\[\begin{aligned} \min_{p ∈ \mathcal M} & f(p)\\ \text{subject to}\quad&g_i(p) ≤ 0 \quad \text{ for } i= 1, …, m,\\ \quad & h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]

where $\mathcal M$ is a Riemannian manifold, and $f$, $\{g_i\}_{i=1}^{m}$ and $\{h_j\}_{j=1}^{n}$ are twice continuously differentiable functions from $\mathcal M$ to ℝ. In every step $k$ of the algorithm, the AugmentedLagrangianCost $\mathcal L_{ρ^{(k)}}(p, μ^{(k)}, λ^{(k)})$ is minimized on $\mathcal M$, where $μ^{(k)} ∈ ℝ^m$ and $λ^{(k)} ∈ ℝ^n$ are the current iterates of the Lagrange multipliers and $ρ^{(k)}$ is the current penalty parameter.
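For intuition, one outer iteration therefore amounts to something like the following conceptual sketch, assuming a ConstrainedManifoldObjective cmo and current iterates μ, λ, ρ, and p_k are at hand; this mirrors the structure of the method, not the actual implementation:

# minimise the augmented Lagrangian for the fixed parameters μ, λ, ρ;
# AugmentedLagrangianCost/-Grad are functors (M, p) -> v and (M, p) -> X.
p_k = quasi_Newton(
    M,
    AugmentedLagrangianCost(cmo, ρ, μ, λ),
    AugmentedLagrangianGrad(cmo, ρ, μ, λ),
    p_k,
)
# afterwards μ, λ, ϵ, and ρ are updated as described below.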

The Lagrange multipliers are then updated by

\[λ_j^{(k+1)} =\operatorname{clip}_{[λ_{\min},λ_{\max}]} (λ_j^{(k)} + ρ^{(k)} h_j(p^{(k+1)})) \quad \text{ for all } j=1,…,n,\]

and

\[μ_i^{(k+1)} =\operatorname{clip}_{[0,μ_{\max}]} (μ_i^{(k)} + ρ^{(k)} g_i(p^{(k+1)})) \text{ for all } i=1,…,m,\]

where $λ_{\text{min}} ≤ λ_{\text{max}}$ and $μ_{\text{max}}$ are the multiplier boundaries.
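In code, both updates are a componentwise clamp; a minimal sketch, assuming vectors h_val = h(M, p) and g_val = g(M, p) of constraint values at the new iterate $p^{(k+1)}$:

λ = clamp.(λ .+ ρ .* h_val, λ_min, λ_max)  # clamp implements clip
μ = clamp.(μ .+ ρ .* g_val, 0.0, μ_max)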

Next, the accuracy tolerance $ϵ$ is updated as

\[ϵ^{(k)}=\max\{ϵ_{\min}, θ_ϵ ϵ^{(k-1)}\},\]

where $ϵ_{\text{min}}$ is the lowest value $ϵ$ is allowed to attain and $θ_ϵ ∈ (0,1)$ is a constant scaling factor.

Last, the penalty parameter $ρ$ is updated as follows: with

\[σ^{(k)}=\max_{j=1,…,n,\ i=1,…,m} \Bigl\{\|h_j(p^{(k)})\|,\ \Bigl\|\max\Bigl\{g_i(p^{(k)}), -\frac{μ_i^{(k-1)}}{ρ^{(k-1)}} \Bigr\}\Bigr\| \Bigr\}.\]

$ρ$ is updated as

\[ρ^{(k)} = \begin{cases} ρ^{(k-1)}/θ_ρ, & \text{if } σ^{(k)} > τ σ^{(k-1)},\\ ρ^{(k-1)}, & \text{else,} \end{cases}\]

where $θ_ρ ∈ (0,1)$ is a constant scaling factor and $τ ∈ (0,1)$ is the improvement threshold: the penalty is increased whenever the constraint violation measure $σ$ did not decrease sufficiently.
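A sketch of these scalar updates, under the same assumptions on h_val and g_val as above and with σ holding the previous value $σ^{(k-1)}$:

ϵ = max(ϵ_min, θ_ϵ * ϵ)
σ_new = maximum([abs.(h_val)..., abs.(max.(g_val, -μ ./ ρ))...])
(σ_new > τ * σ) && (ρ = ρ / θ_ρ)  # increase the penalty on insufficient improvement
σ = σ_new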

Input

  • M::AbstractManifold: a Riemannian manifold $\mathcal M$
  • f: a cost function $f: \mathcal M→ ℝ$ implemented as (M, p) -> v
  • grad_f: the (Riemannian) gradient $\operatorname{grad} f: \mathcal M → T_{p}\mathcal M$ of f as a function (M, p) -> X or a function (M, X, p) -> X computing X in-place

Optional (if not called with the ConstrainedManifoldObjective cmo)

  • g=nothing: the inequality constraints
  • h=nothing: the equality constraints
  • grad_g=nothing: the gradient of the inequality constraints
  • grad_h=nothing: the gradient of the equality constraints

Note that one of the pairs (g, grad_g) or (h, grad_h) has to be provided; otherwise the problem is not constrained and an unconstrained solver such as quasi_Newton is the better choice.
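For example, a minimal, self-contained sketch with hypothetical data, assuming Manifolds.jl is loaded: minimise a quadratic form on the sphere subject to the componentwise non-negativity constraint $p ≥ 0$, encoded as $g_i(p) = -p_i ≤ 0$:

using Manopt, Manifolds, LinearAlgebra, Random
Random.seed!(42)
d = 10
A = Symmetric(randn(d, d))
M = Sphere(d - 1)
f(M, p) = p' * A * p
grad_f(M, p) = project(M, p, 2 * A * p)  # project the Euclidean gradient to T_p M
g(M, p) = -p                             # m = d inequality constraints g_i(p) = -p_i
grad_g(M, p) = [project(M, p, [i == j ? -1.0 : 0.0 for j in 1:d]) for i in 1:d]
p0 = rand(M)
q = augmented_Lagrangian_method(M, f, grad_f, p0; g=g, grad_g=grad_g)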

Keyword Arguments

  • evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.

  • ϵ=1e-3: the accuracy tolerance

  • ϵ_min=1e-6: the lower bound for the accuracy tolerance

  • ϵ_exponent=1/100: exponent of the ϵ update factor; also 1/number of iterations until maximal accuracy is needed to end algorithm naturally

  • equality_constraints=nothing: the number $n$ of equality constraints. If not provided, a call to the gradient of h is performed to estimate these.

  • gradient_range=nothing: specify how both gradients of the constraints are represented

  • gradient_equality_range=gradient_range: specify how gradients of the equality constraints are represented, see VectorGradientFunction.

  • gradient_inequality_range=gradient_range: specify how gradients of the inequality constraints are represented, see VectorGradientFunction.

  • inequality_constraints=nothing: the number $m$ of inequality constraints. If not provided, a call to the gradient of g is performed to estimate these.

  • λ=ones(size(h(M,p),1)): the Lagrange multiplier with respect to the equality constraints

  • λ_max=20.0: an upper bound for the Lagrange multiplier belonging to the equality constraints

  • λ_min=- λ_max: a lower bound for the Lagrange multiplier belonging to the equality constraints

  • μ=ones(size(g(M,p),1)): the Lagrange multiplier with respect to the inequality constraints

  • μ_max=20.0: an upper bound for the Lagrange multiplier belonging to the inequality constraints

  • ρ=1.0: the penalty parameter

  • τ=0.8: improvement threshold for the constraint violation measure $σ$; if the improvement is not met, the penalty parameter is increased

  • θ_ρ=0.3: the scaling factor of the penalty parameter

  • θ_ϵ=(ϵ_min / ϵ)^(ϵ_exponent): the scaling factor of the accuracy tolerance ϵ

  • sub_cost=AugmentedLagrangianCost(cmo, ρ, μ, λ): use the augmented Lagrangian cost, based on the ConstrainedManifoldObjective built from the functions provided. This is used to define the sub_problem= keyword and hence has no effect if you set sub_problem directly.

  • sub_grad=AugmentedLagrangianGrad(cmo, ρ, μ, λ): use the augmented Lagrangian gradient, based on the ConstrainedManifoldObjective built from the functions provided. This is used to define the sub_problem= keyword and hence has no effect if you set sub_problem directly.

  • sub_kwargs=(;): a named tuple of keyword arguments that are passed to decorate_objective! of the sub solver's objective, to decorate_state! of the sub solver's state, and to the sub state constructor itself.

  • stopping_criterion=StopAfterIteration(300) | (StopWhenSmallerOrEqual(:ϵ, ϵ_min) & StopWhenChangeLess(1e-10)): a functor indicating that the stopping criterion is fulfilled

  • sub_problem=DefaultManoptProblem(M, sub_objective): specify a problem for a solver or a closed form solution function, which can be allocating or in-place.

  • sub_state=QuasiNewtonState: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. By default, a quasi-Newton method with a QuasiNewtonLimitedMemoryDirectionUpdate using InverseBFGS is used.

  • sub_stopping_criterion=StopAfterIteration(300) | StopWhenGradientNormLess(ϵ) | StopWhenStepsizeLess(1e-8): the stopping criterion for the sub solver.

For the ranges of the constraints' gradient, other power manifold tangent space representations, mainly the ArrayPowerRepresentation can be used if the gradients can be computed more efficiently in that representation.
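For instance, a hedged sketch continuing the example above; ArrayPowerRepresentation is exported by Manifolds.jl, and the constraint gradients may then need to be returned in the corresponding array layout:

q = augmented_Lagrangian_method(M, f, grad_f, p0;
    g=g, grad_g=grad_g,
    gradient_inequality_range=ArrayPowerRepresentation(),
)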

All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.

Output

The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
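A brief sketch, again continuing the usage example above:

s = augmented_Lagrangian_method(M, f, grad_f, p0; g=g, grad_g=grad_g, return_state=true)
q = get_solver_result(s)  # extract the minimizer from the returned state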

source
Manopt.augmented_Lagrangian_method!Function
augmented_Lagrangian_method!(M, f, grad_f, p; kwargs...)
augmented_Lagrangian_method!(M, cmo::ConstrainedManifoldObjective, p; kwargs...)

perform the augmented Lagrangian method (ALM) [LB19] in-place of p. This variant accepts the same positional and keyword arguments, with the same defaults, as augmented_Lagrangian_method documented above.

source

State

Manopt.AugmentedLagrangianMethodStateType
AugmentedLagrangianMethodState{P,T} <: AbstractManoptSolverState

Describes the augmented Lagrangian method, with

Fields

a default value is given in brackets if a parameter can be left out in initialization.

  • ϵ: the accuracy tolerance
  • ϵ_min: the lower bound for the accuracy tolerance
  • λ: the Lagrange multiplier with respect to the equality constraints
  • λ_max: an upper bound for the Lagrange multiplier belonging to the equality constraints
  • λ_min: a lower bound for the Lagrange multiplier belonging to the equality constraints
  • p::P: a point on the manifold $\mathcal M$ storing the current iterate
  • penalty: evaluation of the current penalty term, initialized to Inf.
  • μ: the Lagrange multiplier with respect to the inequality constraints
  • μ_max: an upper bound for the Lagrange multiplier belonging to the inequality constraints
  • ρ: the penalty parameter
  • sub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.
  • sub_state::Union{AbstractManoptSolverState, F}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.
  • τ: factor for the improvement of the evaluation of the penalty parameter
  • θ_ρ: the scaling factor of the penalty parameter
  • θ_ϵ: the scaling factor of the accuracy tolerance
  • stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled

Constructor

AugmentedLagrangianMethodState(M::AbstractManifold, co::ConstrainedManifoldObjective,
    sub_problem, sub_state; kwargs...
)

construct an augmented Lagrangian method state, where the manifold M and the ConstrainedManifoldObjective co provide manifold- or objective-specific defaults.

AugmentedLagrangianMethodState(M::AbstractManifold, co::ConstrainedManifoldObjective,
    sub_problem; evaluation=AllocatingEvaluation(), kwargs...
)

construct an augmented Lagrangian method state, where the manifold M and the ConstrainedManifoldObjective co provide manifold- or objective-specific defaults, and sub_problem is a closed form solution with evaluation as type of evaluation.

Keyword arguments

the fields above can be initialised via keyword arguments of the same name

See also

augmented_Lagrangian_method

source

Helping functions

Manopt.AugmentedLagrangianCostType
AugmentedLagrangianCost{CO,R,T}

Stores the parameters $ρ ∈ ℝ$, $μ ∈ ℝ^m$, $λ ∈ ℝ^n$ of the augmented Lagrangian associated to the ConstrainedManifoldObjective co.

This struct is also a functor (M, p) -> v that can be used as a cost function within a solver. Based on the internal ConstrainedManifoldObjective it computes

\[\mathcal L_ρ(p, μ, λ) = f(p) + \frac{ρ}{2} \biggl( \sum_{j=1}^n \Bigl( h_j(p) + \frac{λ_j}{ρ} \Bigr)^2 + \sum_{i=1}^m \max\Bigl\{ 0, \frac{μ_i}{ρ} + g_i(p) \Bigr\}^2 \biggr)\]

Fields

  • co::CO, ρ::R, μ::T, λ::T as mentioned in the formula, where $R$ should be the number type used and $T$ the vector type.

Constructor

AugmentedLagrangianCost(co, ρ, μ, λ)
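A usage sketch; cmo, μ, λ, and p are assumed to exist with matching dimensions:

L = AugmentedLagrangianCost(cmo, 1.0, μ, λ)  # ρ = 1.0
v = L(M, p)  # functor call (M, p) -> v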
source
Manopt.AugmentedLagrangianGradType
AugmentedLagrangianGrad{CO,R,T} <: AbstractConstrainedFunctor{T}

Stores the parameters $ρ ∈ ℝ$, $μ ∈ ℝ^m$, $λ ∈ ℝ^n$ of the augmented Lagrangian associated to the ConstrainedManifoldObjective co.

This struct is also a functor in both formats

  • (M, p) -> X to compute the gradient in allocating fashion.
  • (M, X, p) -> X to compute the gradient in-place of X.

Additionally, this gradient accepts an optional positional last argument to specify the range for the internal gradient call of the constrained objective.

Based on the internal ConstrainedManifoldObjective it computes the gradient $\operatorname{grad} \mathcal L_{ρ}(p, μ, λ)$; see also AugmentedLagrangianCost.

Fields

  • co::CO, ρ::R, μ::T, λ::T as mentioned in the formula, where $R$ should be the number type used and $T$ the vector type.

Constructor

AugmentedLagrangianGrad(co, ρ, μ, λ)
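Analogously, a brief sketch under the same assumptions as for the cost above, with X a tangent vector at p:

gL = AugmentedLagrangianGrad(cmo, 1.0, μ, λ)
X = gL(M, p)  # allocating call (M, p) -> X
gL(M, X, p)   # in-place call, overwriting X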
source

Technical details

The augmented_Lagrangian_method solver requires the following functions of a manifold to be available

Literature

[LB19]
C. Liu and N. Boumal. Simple algorithms for optimization on Riemannian manifolds with constraints. Applied Mathematics & Optimization (2019), arXiv:1901.10000.