Exact penalty method
Manopt.exact_penalty_method
— Functionexact_penalty_method(M, f, grad_f, p=rand(M); kwargs...)
exact_penalty_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
exact_penalty_method!(M, f, grad_f, p; kwargs...)
exact_penalty_method!(M, cmo::ConstrainedManifoldObjective, p; kwargs...)
perform the exact penalty method (EPM) [LB19] The aim of the EPM is to find a solution of the constrained optimisation task
\[\begin{aligned} \operatorname*{arg\,min}_{p ∈ \mathcal M} & f(p)\\ \text{subject to}\quad&g_i(p) ≤ 0 \quad \text{ for } i= 1, …, m,\\ \quad & h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
where M
is a Riemannian manifold, and $f$, $\{g_i\}_{i=1}^{n}$ and $\{h_j\}_{j=1}^{m}$ are twice continuously differentiable functions from M
to ℝ. For that a weighted $L_1$-penalty term for the violation of the constraints is added to the objective
\[f(x) + ρ\biggl( \sum_{i=1}^m \max\bigl\{0, g_i(x)\bigr\} + \sum_{j=1}^n \vert h_j(x)\vert\biggr),\]
where $ρ>0$ is the penalty parameter.
Since this is non-smooth, a SmoothingTechnique
with parameter u
is applied, see the ExactPenaltyCost
.
In every step $k$ of the exact penalty method, the smoothed objective is then minimized over all $p ∈\mathcal M$. Then, the accuracy tolerance $ϵ$ and the smoothing parameter $u$ are updated by setting
\[ϵ^{(k)}=\max\{ϵ_{\min}, θ_ϵ ϵ^{(k-1)}\},\]
where $ϵ_{\min}$ is the lowest value $ϵ$ is allowed to become and $θ_ϵ ∈ (0,1)$ is constant scaling factor, and
\[u^{(k)} = \max \{u_{\min}, \theta_u u^{(k-1)} \},\]
where $u_{\min}$ is the lowest value $u$ is allowed to become and $θ_u ∈ (0,1)$ is constant scaling factor.
Finally, the penalty parameter $ρ$ is updated as
\[ρ^{(k)} = \begin{cases} ρ^{(k-1)}/θ_ρ, & \text{if } \displaystyle \max_{j ∈ \mathcal{E},i ∈ \mathcal{I}} \Bigl\{ \vert h_j(x^{(k)}) \vert, g_i(x^{(k)})\Bigr\} \geq u^{(k-1)} \Bigr) ,\\ ρ^{(k-1)}, & \text{else,} \end{cases}\]
where $θ_ρ ∈ (0,1)$ is a constant scaling factor.
Input
M::
AbstractManifold
: a Riemannian manifold $\mathcal M$
f
: a cost function $f: \mathcal M→ ℝ$ implemented as(M, p) -> v
grad_f
: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of f as a function(M, p) -> X
or a function(M, X, p) -> X
computingX
in-placep
: a point on the manifold $\mathcal M$
Keyword arguments
if not called with the ConstrainedManifoldObjective
cmo
g=nothing
: the inequality constraintsh=nothing
: the equality constraintsgrad_g=nothing
: the gradient of the inequality constraintsgrad_h=nothing
: the gradient of the equality constraints
Note that one of the pairs (g
, grad_g
) or (h
, grad_h
) has to be provided. Otherwise the problem is not constrained and a better solver would be for example quasi_Newton
.
Further keyword arguments
ϵ=1e–3
: the accuracy toleranceϵ_exponent=1/100
: exponent of the ϵ update factor;ϵ_min=1e-6
: the lower bound for the accuracy toleranceu=1e–1
: the smoothing parameter and threshold for violation of the constraintsu_exponent=1/100
: exponent of the u update factor;u_min=1e-6
: the lower bound for the smoothing parameter and threshold for violation of the constraintsρ=1.0
: the penalty parameterequality_constraints=nothing
: the number $n$ of equality constraints. If not provided, a call to the gradient ofg
is performed to estimate these.gradient_range=nothing
: specify how both gradients of the constraints are representedgradient_equality_range=gradient_range
: specify how gradients of the equality constraints are represented, seeVectorGradientFunction
.gradient_inequality_range=gradient_range
: specify how gradients of the inequality constraints are represented, seeVectorGradientFunction
.inequality_constraints=nothing
: the number $m$ of inequality constraints. If not provided, a call to the gradient ofg
is performed to estimate these.min_stepsize=1e-10
: the minimal step sizesmoothing=
LogarithmicSumOfExponentials
: aSmoothingTechnique
to usesub_cost=
ExactPenaltyCost
(problem, ρ, u; smoothing=smoothing)
: cost to use in the sub solver This is used to define thesub_problem=
keyword and has hence no effect, if you setsub_problem
directly.sub_grad=
ExactPenaltyGrad
(problem, ρ, u; smoothing=smoothing)
: gradient to use in the sub solver This is used to define thesub_problem=
keyword and has hence no effect, if you setsub_problem
directly.sub_kwargs=
(;)
: a named tuple of keyword arguments that are passed todecorate_objective!
of the sub solvers objective, thedecorate_state!
of the subsovlers state, and the sub state constructor itself.
sub_stopping_criterion=
StopAfterIteration
(200)
|
StopWhenGradientNormLess
(ϵ)
|
StopWhenStepsizeLess
(1e-10)
: a stopping cirterion for the sub solver This is used to define thesub_state=
keyword and has hence no effect, if you setsub_state
directly.sub_state=
DefaultManoptProblem
(M,
ManifoldGradientObjective
`(subcost, subgrad; evaluation=evaluation): a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.sub_state=
QuasiNewtonState
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. whereQuasiNewtonLimitedMemoryDirectionUpdate
withInverseBFGS
is usedstopping_criterion=
StopAfterIteration
(300)
|
(
StopWhenSmallerOrEqual
(ϵ, ϵ_min)
&
StopWhenChangeLess
(1e-10) )
: a functor indicating that the stopping criterion is fulfilled
For the range
s of the constraints' gradient, other power manifold tangent space representations, mainly the ArrayPowerRepresentation
can be used if the gradients can be computed more efficiently in that representation.
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
Manopt.exact_penalty_method!
— Functionexact_penalty_method(M, f, grad_f, p=rand(M); kwargs...)
exact_penalty_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
exact_penalty_method!(M, f, grad_f, p; kwargs...)
exact_penalty_method!(M, cmo::ConstrainedManifoldObjective, p; kwargs...)
perform the exact penalty method (EPM) [LB19] The aim of the EPM is to find a solution of the constrained optimisation task
\[\begin{aligned} \operatorname*{arg\,min}_{p ∈ \mathcal M} & f(p)\\ \text{subject to}\quad&g_i(p) ≤ 0 \quad \text{ for } i= 1, …, m,\\ \quad & h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
where M
is a Riemannian manifold, and $f$, $\{g_i\}_{i=1}^{n}$ and $\{h_j\}_{j=1}^{m}$ are twice continuously differentiable functions from M
to ℝ. For that a weighted $L_1$-penalty term for the violation of the constraints is added to the objective
\[f(x) + ρ\biggl( \sum_{i=1}^m \max\bigl\{0, g_i(x)\bigr\} + \sum_{j=1}^n \vert h_j(x)\vert\biggr),\]
where $ρ>0$ is the penalty parameter.
Since this is non-smooth, a SmoothingTechnique
with parameter u
is applied, see the ExactPenaltyCost
.
In every step $k$ of the exact penalty method, the smoothed objective is then minimized over all $p ∈\mathcal M$. Then, the accuracy tolerance $ϵ$ and the smoothing parameter $u$ are updated by setting
\[ϵ^{(k)}=\max\{ϵ_{\min}, θ_ϵ ϵ^{(k-1)}\},\]
where $ϵ_{\min}$ is the lowest value $ϵ$ is allowed to become and $θ_ϵ ∈ (0,1)$ is constant scaling factor, and
\[u^{(k)} = \max \{u_{\min}, \theta_u u^{(k-1)} \},\]
where $u_{\min}$ is the lowest value $u$ is allowed to become and $θ_u ∈ (0,1)$ is constant scaling factor.
Finally, the penalty parameter $ρ$ is updated as
\[ρ^{(k)} = \begin{cases} ρ^{(k-1)}/θ_ρ, & \text{if } \displaystyle \max_{j ∈ \mathcal{E},i ∈ \mathcal{I}} \Bigl\{ \vert h_j(x^{(k)}) \vert, g_i(x^{(k)})\Bigr\} \geq u^{(k-1)} \Bigr) ,\\ ρ^{(k-1)}, & \text{else,} \end{cases}\]
where $θ_ρ ∈ (0,1)$ is a constant scaling factor.
Input
M::
AbstractManifold
: a Riemannian manifold $\mathcal M$
f
: a cost function $f: \mathcal M→ ℝ$ implemented as(M, p) -> v
grad_f
: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of f as a function(M, p) -> X
or a function(M, X, p) -> X
computingX
in-placep
: a point on the manifold $\mathcal M$
Keyword arguments
if not called with the ConstrainedManifoldObjective
cmo
g=nothing
: the inequality constraintsh=nothing
: the equality constraintsgrad_g=nothing
: the gradient of the inequality constraintsgrad_h=nothing
: the gradient of the equality constraints
Note that one of the pairs (g
, grad_g
) or (h
, grad_h
) has to be provided. Otherwise the problem is not constrained and a better solver would be for example quasi_Newton
.
Further keyword arguments
ϵ=1e–3
: the accuracy toleranceϵ_exponent=1/100
: exponent of the ϵ update factor;ϵ_min=1e-6
: the lower bound for the accuracy toleranceu=1e–1
: the smoothing parameter and threshold for violation of the constraintsu_exponent=1/100
: exponent of the u update factor;u_min=1e-6
: the lower bound for the smoothing parameter and threshold for violation of the constraintsρ=1.0
: the penalty parameterequality_constraints=nothing
: the number $n$ of equality constraints. If not provided, a call to the gradient ofg
is performed to estimate these.gradient_range=nothing
: specify how both gradients of the constraints are representedgradient_equality_range=gradient_range
: specify how gradients of the equality constraints are represented, seeVectorGradientFunction
.gradient_inequality_range=gradient_range
: specify how gradients of the inequality constraints are represented, seeVectorGradientFunction
.inequality_constraints=nothing
: the number $m$ of inequality constraints. If not provided, a call to the gradient ofg
is performed to estimate these.min_stepsize=1e-10
: the minimal step sizesmoothing=
LogarithmicSumOfExponentials
: aSmoothingTechnique
to usesub_cost=
ExactPenaltyCost
(problem, ρ, u; smoothing=smoothing)
: cost to use in the sub solver This is used to define thesub_problem=
keyword and has hence no effect, if you setsub_problem
directly.sub_grad=
ExactPenaltyGrad
(problem, ρ, u; smoothing=smoothing)
: gradient to use in the sub solver This is used to define thesub_problem=
keyword and has hence no effect, if you setsub_problem
directly.sub_kwargs=
(;)
: a named tuple of keyword arguments that are passed todecorate_objective!
of the sub solvers objective, thedecorate_state!
of the subsovlers state, and the sub state constructor itself.
sub_stopping_criterion=
StopAfterIteration
(200)
|
StopWhenGradientNormLess
(ϵ)
|
StopWhenStepsizeLess
(1e-10)
: a stopping cirterion for the sub solver This is used to define thesub_state=
keyword and has hence no effect, if you setsub_state
directly.sub_state=
DefaultManoptProblem
(M,
ManifoldGradientObjective
`(subcost, subgrad; evaluation=evaluation): a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.sub_state=
QuasiNewtonState
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. whereQuasiNewtonLimitedMemoryDirectionUpdate
withInverseBFGS
is usedstopping_criterion=
StopAfterIteration
(300)
|
(
StopWhenSmallerOrEqual
(ϵ, ϵ_min)
&
StopWhenChangeLess
(1e-10) )
: a functor indicating that the stopping criterion is fulfilled
For the range
s of the constraints' gradient, other power manifold tangent space representations, mainly the ArrayPowerRepresentation
can be used if the gradients can be computed more efficiently in that representation.
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
State
Manopt.ExactPenaltyMethodState
— TypeExactPenaltyMethodState{P,T} <: AbstractManoptSolverState
Describes the exact penalty method, with
Fields
ϵ
: the accuracy toleranceϵ_min
: the lower bound for the accuracy tolerancep::P
: a point on the manifold $\mathcal M$storing the current iterateρ
: the penalty parametersub_problem::Union{AbstractManoptProblem, F}
: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state::Union{AbstractManoptProblem, F}
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.stop::StoppingCriterion
: a functor indicating that the stopping criterion is fulfilledu
: the smoothing parameter and threshold for violation of the constraintsu_min
: the lower bound for the smoothing parameter and threshold for violation of the constraintsθ_ϵ
: the scaling factor of the tolerance parameterθ_ρ
: the scaling factor of the penalty parameterθ_u
: the scaling factor of the smoothing parameter
Constructor
ExactPenaltyMethodState(M::AbstractManifold, sub_problem, sub_state; kwargs...)
construct the exact penalty state.
ExactPenaltyMethodState(M::AbstractManifold, sub_problem;
evaluation=AllocatingEvaluation(), kwargs...
)
construct the exact penalty state, where sub_problem
is a closed form solution with evaluation
as type of evaluation.
Keyword arguments
ϵ=1e-3
ϵ_min=1e-6
ϵ_exponent=1 / 100
: a shortcut for the scaling factor $θ_ϵ$θ_ϵ=(ϵ_min / ϵ)^(ϵ_exponent)
u=1e-1
u_min=1e-6
u_exponent=1 / 100
: a shortcut for the scaling factor $θ_u$.θ_u=(u_min / u)^(u_exponent)
p=
rand
(M)
: a point on the manifold $\mathcal M$to specify the initial valueρ=1.0
θ_ρ=0.3
stopping_criterion=
StopAfterIteration
(300)
|
(
: a functor indicating that the stopping criterion is fulfilledStopWhenSmallerOrEqual
(:ϵ, ϵ_min)
|
StopWhenChangeLess
(1e-10) )
See also
Helping functions
Manopt.ExactPenaltyCost
— TypeExactPenaltyCost{S, Pr, R}
Represent the cost of the exact penalty method based on a ConstrainedManifoldObjective
P
and a parameter $ρ$ given by
\[f(p) + ρ\Bigl( \sum_{i=0}^m \max\{0,g_i(p)\} + \sum_{j=0}^n \lvert h_j(p)\rvert \Bigr),\]
where an additional parameter $u$ is used as well as a smoothing technique, for example LogarithmicSumOfExponentials
or LinearQuadraticHuber
to obtain a smooth cost function. This struct is also a functor (M,p) -> v
of the cost $v$.
Fields
ρ
,u
: as described in the mathematical formula, .co
: the original cost
Constructor
ExactPenaltyCost(co::ConstrainedManifoldObjective, ρ, u; smoothing=LinearQuadraticHuber())
Manopt.ExactPenaltyGrad
— TypeExactPenaltyGrad{S, CO, R}
Represent the gradient of the ExactPenaltyCost
based on a ConstrainedManifoldObjective
co
and a parameter $ρ$ and a smoothing technique, which uses an additional parameter $u$.
This struct is also a functor in both formats
(M, p) -> X
to compute the gradient in allocating fashion.(M, X, p)
to compute the gradient in in-place fashion.
Fields
ρ
,u
as stated beforeco
the nonsmooth objective
Constructor
ExactPenaltyGradient(co::ConstrainedManifoldObjective, ρ, u; smoothing=LinearQuadraticHuber())
Manopt.SmoothingTechnique
— Typeabstract type SmoothingTechnique
Specify a smoothing technique, see for example ExactPenaltyCost
and ExactPenaltyGrad
.
Manopt.LinearQuadraticHuber
— TypeLinearQuadraticHuber <: SmoothingTechnique
Specify a smoothing based on $\max\{0,x\} ≈ \mathcal P(x,u)$ for some $u$, where
\[\mathcal P(x, u) = \begin{cases} 0 & \text{ if } x \leq 0,\\ \frac{x^2}{2u} & \text{ if } 0 \leq x \leq u,\\ x-\frac{u}{2} & \text{ if } x \geq u. \end{cases}\]
Manopt.LogarithmicSumOfExponentials
— TypeLogarithmicSumOfExponentials <: SmoothingTechnique
Specify a smoothing based on $\max\{a,b\} ≈ u \log(\mathrm{e}^{\frac{a}{u}}+\mathrm{e}^{\frac{b}{u}})$ for some $u$.
Technical details
The exact_penalty_method
solver requires the following functions of a manifold to be available
- A
copyto!
(M, q, p)
andcopy
(M,p)
for points. - Everything the subsolver requires, which by default is the
quasi_Newton
method - A
zero_vector
(M,p)
.
The stopping criteria involves StopWhenChangeLess
and StopWhenGradientNormLess
which require
- An
inverse_retract!
(M, X, p, q)
; it is recommended to set thedefault_inverse_retraction_method
to a favourite retraction. If this default is set, ainverse_retraction_method=
orinverse_retraction_method_dual=
(for $\mathcal N$) does not have to be specified or thedistance
(M, p, q)
for said default inverse retraction. - the
norm
as well, to stop when the norm of the gradient is small, but if you implementedinner
, the norm is provided already.
Literature
- [LB19]
- C. Liu and N. Boumal. Simple algorithms for optimization on Riemannian manifolds with constraints. Applied Mathematics & Optimization (2019), arXiv:1091.10000.