Interior point Newton method
Manopt.interior_point_Newton
— Functioninterior_point_Newton(M, f, grad_f, Hess_f, p=rand(M); kwargs...)
interior_point_Newton(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
interior_point_Newton!(M, f, grad]_f, Hess_f, p; kwargs...)
interior_point_Newton(M, ConstrainedManifoldObjective, p; kwargs...)
perform the interior point Newton method following [LY24].
In order to solve the constrained problem
\[\begin{aligned} \min_{p ∈ \mathcal M} & f(p)\\ \text{subject to} &g_i(p) ≤ 0 \quad \text{ for } i= 1, …, m,\\ \quad & h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
This algorithms iteratively solves the linear system based on extending the KKT system by a slack variable s
.
\[\operatorname{J} F(p, μ, λ, s)[X, Y, Z, W] = -F(p, μ, λ, s), \text{ where } X ∈ T_{p}\mathcal M, Y,W ∈ ℝ^m, Z ∈ ℝ^n,\]
see CondensedKKTVectorFieldJacobian
and CondensedKKTVectorField
, respectively, for the reduced form, this is usually solved in. From the resulting X
and Z
in the reeuced form, the other two, $Y$, $W$, are then computed.
From the gradient $(X,Y,Z,W)$ at the current iterate $(p, μ, λ, s)$, a line search is performed using the KKTVectorFieldNormSq
norm of the KKT vector field (squared) and its gradient KKTVectorFieldNormSqGradient
together with the InteriorPointCentralityCondition
.
Note that since the vector field $F$ includes the gradients of the constraint functions $g, h$, its gradient or Jacobian requires the Hessians of the constraints.
For that seach direction a line search is performed, that additionally ensures that the constraints are further fulfilled.
Input
M
: a Riemannian manifold $\mathcal M$f
: a cost function $f: \mathcal M→ ℝ$ implemented as(M, p) -> v
grad_f
: the (Riemannian) gradient $\operatorname{grad}f$: \mathcal M → T_{p}\mathcal M of f as a function(M, p) -> X
or a function(M, X, p) -> X
computingX
in-placeHess_f
: the (Riemannian) Hessian $\operatorname{Hess}f$: T{p}\mathcal M → T{p}\mathcal M of f as a function(M, p, X) -> Y
or a function(M, Y, p, X) -> Y
computingY
in-placep
: a point on the manifold $\mathcal M$
or a ConstrainedManifoldObjective
cmo
containing f
, grad_f
, Hess_f
, and the constraints
Keyword arguments
The keyword arguments related to the constraints (the first eleven) are ignored if you pass a ConstrainedManifoldObjective
cmo
centrality_condition=missing
; an additional condition when to accept a step size. This can be used to ensure that the resulting iterate is still an interior point if you provide a check(N,q) -> true/false
, whereN
is the manifold of thestep_problem
.equality_constraints=nothing
: the number $n$ of equality constraints.evaluation=
AllocatingEvaluation
()
: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation
) or whether they modify their input argument to return the result therein (InplaceEvaluation
). Since usually the first argument is the manifold, the modified argument is the second.g=nothing
: the inequality constraintsgrad_g=nothing
: the gradient of the inequality constraintsgrad_h=nothing
: the gradient of the equality constraintsgradient_range=nothing
: specify how gradients are represented, wherenothing
is equivalent toNestedPowerRepresentation
gradient_equality_range=gradient_range
: specify how the gradients of the equality constraints are representedgradient_inequality_range=gradient_range
: specify how the gradients of the inequality constraints are representedh=nothing
: the equality constraintsHess_g=nothing
: the Hessian of the inequality constraintsHess_h=nothing
: the Hessian of the equality constraintsinequality_constraints=nothing
: the number $m$ of inequality constraints.λ=ones(length(h(M, p)))
: the Lagrange multiplier with respect to the equality constraints $h$μ=ones(length(g(M, p)))
: the Lagrange multiplier with respect to the inequality constraints $g$retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsρ=μ's / length(μ)
: store the orthogonalityμ's/m
to compute the barrier parameterβ
in the sub problem.s=copy(μ)
: initial value for the slack variablesσ=
calculate_σ
(M, cmo, p, μ, λ, s)
: scaling factor for the barrier parameterβ
in the sub problem, which is updated during the iterationsstep_objective
: aManifoldGradientObjective
of the norm of the KKT vector fieldKKTVectorFieldNormSq
and its gradientKKTVectorFieldNormSqGradient
step_problem
: the manifold $\mathcal M × ℝ^m × ℝ^n × ℝ^m$ together with thestep_objective
as the problem the linesearchstepsize=
employs for determining a step sizestep_state
: theStepsizeState
with point and search directionstepsize=
ArmijoLinesearch
()
: a functor inheriting fromStepsize
to determine a step size with thecentrality_condtion
keyword as additional criterion to accept a step, if this is providedstopping_criterion=
StopAfterIteration
(200)
|
StopWhenKKTResidualLess
(1e-8)
: a functor indicating that the stopping criterion is fulfilled a stopping criterion, by default depending on the residual of the KKT vector field or a maximal number of steps, which ever hits first.sub_kwargs=(;)
: keyword arguments to decorate the sub options, for example debug, that automatically respects the main solvers debug options (like sub-sampling) as wellsub_objective
: TheSymmetricLinearSystemObjective
modelling the system of equations to use in the sub solver, includes theCondensedKKTVectorFieldJacobian
$\mathcal A(X)$ and theCondensedKKTVectorField
$b$ in $\mathcal A(X) + b = 0$ we aim to solve. This is used to define thesub_problem=
keyword and has hence no effect, if you setsub_problem
directly.sub_stopping_criterion=
StopAfterIteration
(manifold_dimension(M))
|
StopWhenRelativeResidualLess
(c,1e-8)
, where $c = \lVert b \rVert_{}$ from the system to solve. This is used to define thesub_state=
keyword and has hence no effect, if you setsub_state
directly.sub_problem=
DefaultManoptProblem
(M, sub_objective)
: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state=
ConjugateResidualState
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.vector_space=
Rn
a function that, given an integer, returns the manifold to be used for the vector space components $ℝ^m,ℝ^n$X=
zero_vector
(M,p)
: th initial gradient with respect top
.Y=zero(μ)
: the initial gradient with respct toμ
Z=zero(λ)
: the initial gradient with respct toλ
W=zero(s)
: the initial gradient with respct tos
As well as internal keywords used to set up these given keywords like _step_M
, _step_p
, _sub_M
, _sub_p
, and _sub_X
, that should not be changed.
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective, respectively.
The centrality_condition=mising
disables to check centrality during the line search, but you can pass InteriorPointCentralityCondition
(cmo, γ)
, where γ
is a constant, to activate this check.
Output
The obtained approximate constrained minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
Manopt.interior_point_Newton!
— Functioninterior_point_Newton(M, f, grad_f, Hess_f, p=rand(M); kwargs...)
interior_point_Newton(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
interior_point_Newton!(M, f, grad]_f, Hess_f, p; kwargs...)
interior_point_Newton(M, ConstrainedManifoldObjective, p; kwargs...)
perform the interior point Newton method following [LY24].
In order to solve the constrained problem
\[\begin{aligned} \min_{p ∈ \mathcal M} & f(p)\\ \text{subject to} &g_i(p) ≤ 0 \quad \text{ for } i= 1, …, m,\\ \quad & h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
This algorithms iteratively solves the linear system based on extending the KKT system by a slack variable s
.
\[\operatorname{J} F(p, μ, λ, s)[X, Y, Z, W] = -F(p, μ, λ, s), \text{ where } X ∈ T_{p}\mathcal M, Y,W ∈ ℝ^m, Z ∈ ℝ^n,\]
see CondensedKKTVectorFieldJacobian
and CondensedKKTVectorField
, respectively, for the reduced form, this is usually solved in. From the resulting X
and Z
in the reeuced form, the other two, $Y$, $W$, are then computed.
From the gradient $(X,Y,Z,W)$ at the current iterate $(p, μ, λ, s)$, a line search is performed using the KKTVectorFieldNormSq
norm of the KKT vector field (squared) and its gradient KKTVectorFieldNormSqGradient
together with the InteriorPointCentralityCondition
.
Note that since the vector field $F$ includes the gradients of the constraint functions $g, h$, its gradient or Jacobian requires the Hessians of the constraints.
For that seach direction a line search is performed, that additionally ensures that the constraints are further fulfilled.
Input
M
: a Riemannian manifold $\mathcal M$f
: a cost function $f: \mathcal M→ ℝ$ implemented as(M, p) -> v
grad_f
: the (Riemannian) gradient $\operatorname{grad}f$: \mathcal M → T_{p}\mathcal M of f as a function(M, p) -> X
or a function(M, X, p) -> X
computingX
in-placeHess_f
: the (Riemannian) Hessian $\operatorname{Hess}f$: T{p}\mathcal M → T{p}\mathcal M of f as a function(M, p, X) -> Y
or a function(M, Y, p, X) -> Y
computingY
in-placep
: a point on the manifold $\mathcal M$
or a ConstrainedManifoldObjective
cmo
containing f
, grad_f
, Hess_f
, and the constraints
Keyword arguments
The keyword arguments related to the constraints (the first eleven) are ignored if you pass a ConstrainedManifoldObjective
cmo
centrality_condition=missing
; an additional condition when to accept a step size. This can be used to ensure that the resulting iterate is still an interior point if you provide a check(N,q) -> true/false
, whereN
is the manifold of thestep_problem
.equality_constraints=nothing
: the number $n$ of equality constraints.evaluation=
AllocatingEvaluation
()
: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation
) or whether they modify their input argument to return the result therein (InplaceEvaluation
). Since usually the first argument is the manifold, the modified argument is the second.g=nothing
: the inequality constraintsgrad_g=nothing
: the gradient of the inequality constraintsgrad_h=nothing
: the gradient of the equality constraintsgradient_range=nothing
: specify how gradients are represented, wherenothing
is equivalent toNestedPowerRepresentation
gradient_equality_range=gradient_range
: specify how the gradients of the equality constraints are representedgradient_inequality_range=gradient_range
: specify how the gradients of the inequality constraints are representedh=nothing
: the equality constraintsHess_g=nothing
: the Hessian of the inequality constraintsHess_h=nothing
: the Hessian of the equality constraintsinequality_constraints=nothing
: the number $m$ of inequality constraints.λ=ones(length(h(M, p)))
: the Lagrange multiplier with respect to the equality constraints $h$μ=ones(length(g(M, p)))
: the Lagrange multiplier with respect to the inequality constraints $g$retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsρ=μ's / length(μ)
: store the orthogonalityμ's/m
to compute the barrier parameterβ
in the sub problem.s=copy(μ)
: initial value for the slack variablesσ=
calculate_σ
(M, cmo, p, μ, λ, s)
: scaling factor for the barrier parameterβ
in the sub problem, which is updated during the iterationsstep_objective
: aManifoldGradientObjective
of the norm of the KKT vector fieldKKTVectorFieldNormSq
and its gradientKKTVectorFieldNormSqGradient
step_problem
: the manifold $\mathcal M × ℝ^m × ℝ^n × ℝ^m$ together with thestep_objective
as the problem the linesearchstepsize=
employs for determining a step sizestep_state
: theStepsizeState
with point and search directionstepsize=
ArmijoLinesearch
()
: a functor inheriting fromStepsize
to determine a step size with thecentrality_condtion
keyword as additional criterion to accept a step, if this is providedstopping_criterion=
StopAfterIteration
(200)
|
StopWhenKKTResidualLess
(1e-8)
: a functor indicating that the stopping criterion is fulfilled a stopping criterion, by default depending on the residual of the KKT vector field or a maximal number of steps, which ever hits first.sub_kwargs=(;)
: keyword arguments to decorate the sub options, for example debug, that automatically respects the main solvers debug options (like sub-sampling) as wellsub_objective
: TheSymmetricLinearSystemObjective
modelling the system of equations to use in the sub solver, includes theCondensedKKTVectorFieldJacobian
$\mathcal A(X)$ and theCondensedKKTVectorField
$b$ in $\mathcal A(X) + b = 0$ we aim to solve. This is used to define thesub_problem=
keyword and has hence no effect, if you setsub_problem
directly.sub_stopping_criterion=
StopAfterIteration
(manifold_dimension(M))
|
StopWhenRelativeResidualLess
(c,1e-8)
, where $c = \lVert b \rVert_{}$ from the system to solve. This is used to define thesub_state=
keyword and has hence no effect, if you setsub_state
directly.sub_problem=
DefaultManoptProblem
(M, sub_objective)
: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state=
ConjugateResidualState
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.vector_space=
Rn
a function that, given an integer, returns the manifold to be used for the vector space components $ℝ^m,ℝ^n$X=
zero_vector
(M,p)
: th initial gradient with respect top
.Y=zero(μ)
: the initial gradient with respct toμ
Z=zero(λ)
: the initial gradient with respct toλ
W=zero(s)
: the initial gradient with respct tos
As well as internal keywords used to set up these given keywords like _step_M
, _step_p
, _sub_M
, _sub_p
, and _sub_X
, that should not be changed.
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective, respectively.
The centrality_condition=mising
disables to check centrality during the line search, but you can pass InteriorPointCentralityCondition
(cmo, γ)
, where γ
is a constant, to activate this check.
Output
The obtained approximate constrained minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
State
Manopt.InteriorPointNewtonState
— TypeInteriorPointNewtonState{P,T} <: AbstractHessianSolverState
Fields
λ
: the Lagrange multiplier with respect to the equality constraintsμ
: the Lagrange multiplier with respect to the inequality constraintsp::P
: a point on the manifold $\mathcal M$storing the current iterates
: the current slack variablesub_problem::Union{AbstractManoptProblem, F}
: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state::Union{AbstractManoptProblem, F}
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.X
: the current gradient with respect top
Y
: the current gradient with respect toμ
Z
: the current gradient with respect toλ
W
: the current gradient with respect tos
ρ
: store the orthogonalityμ's/m
to compute the barrier parameterβ
in the sub problemσ
: scaling factor for the barrier parameterβ
in the sub problemstop::StoppingCriterion
: a functor indicating that the stopping criterion is fulfilledretraction_method::AbstractRetractionMethod
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstepsize::Stepsize
: a functor inheriting fromStepsize
to determine a step sizestep_problem
: anAbstractManoptProblem
storing the manifold and objective for the line searchstep_state
: storing iterate and search direction in a state for the line search, seeStepsizeState
Constructor
InteriorPointNewtonState(
M::AbstractManifold,
cmo::ConstrainedManifoldObjective,
sub_problem::Pr,
sub_state::St;
kwargs...
)
Initialize the state, where both the AbstractManifold
and the ConstrainedManifoldObjective
are used to fill in reasonable defaults for the keywords.
Input
M::
AbstractManifold
: a Riemannian manifold $\mathcal M$
cmo
: aConstrainedManifoldObjective
sub_problem
: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state
: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.
Keyword arguments
Let m
and n
denote the number of inequality and equality constraints, respectively
p=
rand
(M)
: a point on the manifold $\mathcal M$to specify the initial valueμ=ones(m)
X=
zero_vector
(M,p)
Y=zero(μ)
λ=zeros(n)
Z=zero(λ)
s=ones(m)
W=zero(s)
ρ=μ's/m
σ=
calculate_σ
(M, cmo, p, μ, λ, s)
stopping_criterion=
StopAfterIteration
(200)
|
StopWhenChangeLess
(1e-8)
: a functor indicating that the stopping criterion is fulfilledretraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstep_objective=
ManifoldGradientObjective
(
KKTVectorFieldNormSq
(cmo)
,KKTVectorFieldNormSqGradient
(cmo)
; evaluation=InplaceEvaluation
())
vector_space=
Rn
: a function that, given an integer, returns the manifold to be used for the vector space components $ℝ^m,ℝ^n$step_problem
: wrap the manifold $\mathcal M × ℝ^m × ℝ^n × ℝ^m$step_state
: theStepsizeState
with point and search directionstepsize=
ArmijoLinesearch
()
: a functor inheriting fromStepsize
to determine a step size with theInteriorPointCentralityCondition
as additional condition to accept a step
and internally _step_M
and _step_p
for the manifold and point in the stepsize.
Subproblem functions
Manopt.CondensedKKTVectorField
— TypeCondensedKKTVectorField{O<:ConstrainedManifoldObjective,T,R} <: AbstractConstrainedSlackFunctor{T,R}
Given the constrained optimization problem
\[\begin{aligned} \min_{p ∈\mathcal{M}} &f(p)\\ \text{subject to } &g_i(p)\leq 0 \quad \text{ for } i= 1, …, m,\\ \quad &h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
Then reformulating the KKT conditions of the Lagrangian from the optimality conditions of the Lagrangian
\[\mathcal L(p, μ, λ) = f(p) + \sum_{j=1}^n λ_jh_j(p) + \sum_{i=1}^m μ_ig_i(p)\]
in a perturbed / barrier method in a condensed form using a slack variable $s ∈ ℝ^m$ and a barrier parameter $β$ and the Riemannian gradient of the Lagrangian with respect to the first parameter $\operatorname{grad}_p L(p, μ, λ)$.
Let $\mathcal N = \mathcal M × ℝ^n$. We obtain the linear system
\[\mathcal A(p,λ)[X,Y] = -b(p,λ),\qquad \text{where } (X,Y) ∈ T_{(p,λ)}\mathcal N\]
where $\mathcal A: T_{(p,λ)}\mathcal N → T_{(p,λ)}\mathcal N$ is a linear operator and this struct models the right hand side $b(p,λ) ∈ T_{(p,λ)}\mathcal M$ given by
\[b(p,λ) = \begin{pmatrix} \operatorname{grad} f(p) + \displaystyle\sum_{j=1}^n λ_j \operatorname{grad} h_j(p) + \displaystyle\sum_{i=1}^m μ_i \operatorname{grad} g_i(p) + \displaystyle\sum_{i=1}^m \frac{μ_i}{s_i}\bigl( μ_i(g_i(p)+s_i) + β - μ_is_i \bigr)\operatorname{grad} g_i(p)\\ h(p) \end{pmatrix}\]
Fields
cmo
theConstrainedManifoldObjective
μ::T
the vector in $ℝ^m$ of coefficients for the inequality constraintss::T
the vector in $ℝ^m$ of sclack variablesβ::R
the barrier parameter $β∈ℝ$
Constructor
CondensedKKTVectorField(cmo, μ, s, β)
Manopt.CondensedKKTVectorFieldJacobian
— TypeCondensedKKTVectorFieldJacobian{O<:ConstrainedManifoldObjective,T,R} <: AbstractConstrainedSlackFunctor{T,R}
Given the constrained optimization problem
\[\begin{aligned} \min_{p ∈\mathcal{M}} &f(p)\\ \text{subject to } &g_i(p)\leq 0 \quad \text{ for } i= 1, …, m,\\ \quad &h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
we reformulate the KKT conditions of the Lagrangian from the optimality conditions of the Lagrangian
\[\mathcal L(p, μ, λ) = f(p) + \sum_{j=1}^n λ_jh_j(p) + \sum_{i=1}^m μ_ig_i(p)\]
in a perturbed / barrier method enhanced as well as condensed form as using $\operatorname{grad}_o L(p, μ, λ)$ the Riemannian gradient of the Lagrangian with respect to the first parameter.
Let $\mathcal N = \mathcal M × ℝ^n$. We obtain the linear system
\[\mathcal A(p,λ)[X,Y] = -b(p,λ),\qquad \text{where } X ∈ T_p\mathcal M, Y ∈ ℝ^n\]
where $\mathcal A: T_{(p,λ)}\mathcal N → T_{(p,λ)}\mathcal N$ is a linear operator on $T_{(p,λ)}\mathcal N = T_p\mathcal M × ℝ^n$ given by
\[\mathcal A(p,λ)[X,Y] = \begin{pmatrix} \operatorname{Hess}_p\mathcal L(p, μ, λ)[X] + \displaystyle\sum_{i=1}^m \frac{μ_i}{s_i}⟨\operatorname{grad} g_i(p), X⟩\operatorname{grad} g_i(p) + \displaystyle\sum_{j=1}^n Y_j \operatorname{grad} h_j(p) \\ \Bigl( ⟨\operatorname{grad} h_j(p), X⟩ \Bigr)_{j=1}^n \end{pmatrix}\]
Fields
cmo
theConstrainedManifoldObjective
μ::V
the vector in $ℝ^m$ of coefficients for the inequality constraintss::V
the vector in $ℝ^m$ of slack variablesβ::R
the barrier parameter $β∈ℝ$
Constructor
CondensedKKTVectorFieldJacobian(cmo, μ, s, β)
Manopt.KKTVectorField
— TypeKKTVectorField{O<:ConstrainedManifoldObjective}
Implement the vectorfield $F$ KKT-conditions, inlcuding a slack variable for the inequality constraints.
Given the LagrangianCost
\[\mathcal L(p; μ, λ) = f(p) + \sum_{i=1}^m μ_ig_i(p) + \sum_{j=1}^n λ_jh_j(p)\]
\[\operatorname{grad}\mathcal L(p, μ, λ) = \operatorname{grad}f(p) + \sum_{j=1}^n λ_j \operatorname{grad} h_j(p) + \sum_{i=1}^m μ_i \operatorname{grad} g_i(p),\]
and introducing the slack variables $s=-g(p) ∈ ℝ^m$ the vector field is given by
\[F(p, μ, λ, s) = \begin{pmatrix} \operatorname{grad}_p \mathcal L(p, μ, λ)\\ g(p) + s\\ h(p)\\ μ ⊙ s \end{pmatrix}, \text{ where } p \in \mathcal M, μ, s \in ℝ^m\text{ and } λ \in ℝ^n,\]
where $⊙$ denotes the Hadamard (or elementwise) product
Fields
cmo
theConstrainedManifoldObjective
While the point p
is arbitrary and usually not needed, it serves as internal memory in the computations. Furthermore Both fields together also calrify the product manifold structure to use.
Constructor
KKTVectorField(cmo::ConstrainedManifoldObjective)
Example
Define F = KKTVectorField(cmo)
for some ConstrainedManifoldObjective
cmo
and let N
be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this cost as F(N, q)
or as the in-place variant F(N, Y, q)
, where q
is a point on N
and Y
is a tangent vector at q
for the result.
Manopt.KKTVectorFieldJacobian
— TypeKKTVectorFieldJacobian{O<:ConstrainedManifoldObjective}
Implement the Jacobian of the vector field $F$ of the KKT-conditions, inlcuding a slack variable for the inequality constraints, see KKTVectorField
and KKTVectorFieldAdjointJacobian
..
\[\operatorname{J} F(p, μ, λ, s)[X, Y, Z, W] = \begin{pmatrix} \operatorname{Hess}_p \mathcal L(p, μ, λ)[X] + \displaystyle\sum_{i=1}^m Y_i \operatorname{grad} g_i(p) + \displaystyle\sum_{j=1}^n Z_j \operatorname{grad} h_j(p)\\ \Bigl( ⟨\operatorname{grad} g_i(p), X⟩ + W_i\Bigr)_{i=1}^m\\ \Bigl( ⟨\operatorname{grad} h_j(p), X⟩ \Bigr)_{j=1}^n\\ μ ⊙ W + s ⊙ Y \end{pmatrix},\]
where $⊙$ denotes the Hadamard (or elementwise) product
See also the LagrangianHessian
$\operatorname{Hess}_p \mathcal L(p, μ, λ)[X]$.
Fields
cmo
theConstrainedManifoldObjective
Constructor
KKTVectorFieldJacobian(cmo::ConstrainedManifoldObjective)
Generate the Jacobian of the KKT vector field related to some ConstrainedManifoldObjective
cmo
.
Example
Define JF = KKTVectorFieldJacobian(cmo)
for some ConstrainedManifoldObjective
cmo
and let N
be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this cost as JF(N, q, Y)
or as the in-place variant JF(N, Z, q, Y)
, where q
is a point on N
and Y
and Z
are a tangent vector at q
.
Manopt.KKTVectorFieldAdjointJacobian
— TypeKKTVectorFieldAdjointJacobian{O<:ConstrainedManifoldObjective}
Implement the Adjoint of the Jacobian of the vector field $F$ of the KKT-conditions, inlcuding a slack variable for the inequality constraints, see KKTVectorField
and KKTVectorFieldJacobian
.
\[\operatorname{J}^* F(p, μ, λ, s)[X, Y, Z, W] = \begin{pmatrix} \operatorname{Hess}_p \mathcal L(p, μ, λ)[X] + \displaystyle\sum_{i=1}^m Y_i \operatorname{grad} g_i(p) + \displaystyle\sum_{j=1}^n Z_j \operatorname{grad} h_j(p)\\ \Bigl( ⟨\operatorname{grad} g_i(p), X⟩ + s_iW_i\Bigr)_{i=1}^m\\ \Bigl( ⟨\operatorname{grad} h_j(p), X⟩ \Bigr)_{j=1}^n\\ μ ⊙ W + Y \end{pmatrix},\]
where $⊙$ denotes the Hadamard (or elementwise) product
See also the LagrangianHessian
$\operatorname{Hess}_p \mathcal L(p, μ, λ)[X]$.
Fields
cmo
theConstrainedManifoldObjective
Constructor
KKTVectorFieldAdjointJacobian(cmo::ConstrainedManifoldObjective)
Generate the Adjoint Jacobian of the KKT vector field related to some ConstrainedManifoldObjective
cmo
.
Example
Define AdJF = KKTVectorFieldAdjointJacobian(cmo)
for some ConstrainedManifoldObjective
cmo
and let N
be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this cost as AdJF(N, q, Y)
or as the in-place variant AdJF(N, Z, q, Y)
, where q
is a point on N
and Y
and Z
are a tangent vector at q
.
Manopt.KKTVectorFieldNormSq
— TypeKKTVectorFieldNormSq{O<:ConstrainedManifoldObjective}
Implement the square of the norm of the vectorfield $F$ of the KKT-conditions, inlcuding a slack variable for the inequality constraints, see KKTVectorField
, where this functor applies the norm to. In [LY24] this is called the merit function.
Fields
cmo
theConstrainedManifoldObjective
Constructor
KKTVectorFieldNormSq(cmo::ConstrainedManifoldObjective)
Example
Define f = KKTVectorFieldNormSq(cmo)
for some ConstrainedManifoldObjective
cmo
and let N
be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this cost as f(N, q)
, where q
is a point on N
.
Manopt.KKTVectorFieldNormSqGradient
— TypeKKTVectorFieldNormSqGradient{O<:ConstrainedManifoldObjective}
Compute the gradient of the KKTVectorFieldNormSq
$φ(p,μ,λ,s) = \lVert F(p,μ,λ,s)\rVert^2$, that is of the norm squared of the KKTVectorField
$F$.
This is given in [LY24] as the gradient of their merit function, which we can write with the adjoint $J^*$ of the Jacobian
\[\operatorname{grad} φ = 2\operatorname{J}^* F(p, μ, λ, s)[F(p, μ, λ, s)],\]
and hence is computed with KKTVectorFieldAdjointJacobian
and KKTVectorField
.
For completeness, the gradient reads, using the LagrangianGradient
$L = \operatorname{grad}_p \mathcal L(p,μ,λ) ∈ T_p\mathcal M$, for a shorthand of the first component of $F$, as
\[\operatorname{grad} φ = 2 \begin{pmatrix} \operatorname{grad}_p \mathcal L(p,μ,λ)[L] + (g_i(p) + s_i)\operatorname{grad} g_i(p) + h_j(p)\operatorname{grad} h_j(p)\\ \Bigl( ⟨\operatorname{grad} g_i(p), L⟩ + s_i\Bigr)_{i=1}^m + μ ⊙ s ⊙ s\\ \Bigl( ⟨\operatorname{grad} h_j(p), L⟩ \Bigr)_{j=1}^n\\ g + s + μ ⊙ μ ⊙ s \end{pmatrix},\]
where $⊙$ denotes the Hadamard (or elementwise) product.
Fields
cmo
theConstrainedManifoldObjective
Constructor
KKTVectorFieldNormSqGradient(cmo::ConstrainedManifoldObjective)
Example
Define grad_f = KKTVectorFieldNormSqGradient(cmo)
for some ConstrainedManifoldObjective
cmo
and let N
be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this cost as grad_f(N, q)
or as the in-place variant grad_f(N, Y, q)
, where q
is a point on N
and Y
is a tangent vector at q
returning the resulting gradient at.
Helpers
Manopt.InteriorPointCentralityCondition
— TypeInteriorPointCentralityCondition{CO,R}
A functor to check the centrality condition.
In order to obtain a step in the linesearch performed within the interior_point_Newton
, Section 6 of [LY24] propose the following additional conditions to hold inspired by the Euclidean case described in Section 6 [ETTZ96]:
For a given ConstrainedManifoldObjective
assume consider the KKTVectorField
$F$, that is we are at a point $q = (p, λ, μ, s)$ on $\mathcal M × ℝ^m × ℝ^n × ℝ^m$and a search direction $V = (X, Y, Z, W)$.
Then, let
\[τ_1 = \frac{m⋅\min\{ μ ⊙ s\}}{μ^{\mathrm{T}}s} \quad\text{ and }\quad τ_2 = \frac{μ^{\mathrm{T}}s}{\lVert F(q) \rVert}\]
where $⊙$ denotes the Hadamard (or elementwise) product.
For a new candidate $q(α) = \bigl(p(α), λ(α), μ(α), s(α)\bigr) := (\operatorname{retr}_p(αX), λ+αY, μ+αZ, s+αW)$, we then define two functions
\[c_1(α) = \min\{ μ(α) ⊙ s(α) \} - \frac{γτ_1 μ(α)^{\mathrm{T}}s(α)}{m} \quad\text{ and }\quad c_2(α) = μ(α)^{\mathrm{T}}s(α) – γτ_2 \lVert F(q(α)) \rVert.\]
While the paper now states that the (Armijo) linesearch starts at a point $\tilde α$, it is easier to include the condition that $c_1(α) ≥ 0$ and $c_2(α) ≥ 0$ into the linesearch as well.
The functor InteriorPointCentralityCondition(cmo, γ, μ, s, normKKT)(N,qα)
defined here evaluates this condition and returns true if both $c_1$ and $c_2$ are nonnegative.
Fields
cmo
: aConstrainedManifoldObjective
γ
: a constantτ1
,τ2
: the constants given in the formula.
Constructor
InteriorPointCentralityCondition(cmo, γ)
InteriorPointCentralityCondition(cmo, γ, τ1, τ2)
Initialise the centrality conditions. The parameters τ1
, τ2
are initialise to zero if not provided.
Besides get_parameter
for all three constants, and set_parameter!
for $γ$, to update $τ_1$ and $τ_2$, call set_parameter(ipcc, :τ, N, q)
to update both $τ_1$ and $τ_2$ according to the formulae above.
Manopt.calculate_σ
— Functioncalculate_σ(M, cmo, p, μ, λ, s; kwargs...)
Compute the new $σ$ factor for the barrier parameter in interior_point_Newton
as
\[\min\{\frac{1}{2}, \lVert F(p; μ, λ, s)\rVert^{\frac{1}{2}} \},\]
where $F$ is the KKT vector field, hence the KKTVectorFieldNormSq
is used.
Keyword arguments
vector_space=
Rn
a function that, given an integer, returns the manifold to be used for the vector space components $ℝ^m,ℝ^n$N
the manifold $\mathcal M × ℝ^m × ℝ^n × ℝ^m$ the vector field lives on (generated usingvector_space
)q
provide memory onN
for interims evaluation of the vector field
Additional stopping criteria
Manopt.StopWhenKKTResidualLess
— TypeStopWhenKKTResidualLess <: StoppingCriterion
Stop when the KKT residual
r^2
= \lVert \operatorname{grad}_p \mathcal L(p, μ, λ) \rVert^2
+ \sum_{i=1}^m [μ_i]_{-}^2 + [g_i(p)]_+^2 + \lvert \mu_ig_i(p)^2
+ \sum_{j=1}^n \lvert h_i(p)\rvert^2.
is less than a given threshold $r < ε$. We use $[v]_+ = \max\{0,v\}$ and $[v]_- = \min\{0,t\}$ for the positive and negative part of $v$, respectively
Fields
ε
: a thresholdresidual
: store the last residual if the stopping criterion is hit.at_iteration
:
References
- [ETTZ96]
- A. S. El-Bakry, R. A. Tapia, T. Tsuchiya and Y. Zhang. On the formulation and theory of the Newton interior-point method for nonlinear programming. Journal of Optimization Theory and Applications 89, 507–541 (1996).
- [LY24]
- Z. Lai and A. Yoshise. Riemannian Interior Point Methods for Constrained Optimization on Manifolds. Journal of Optimization Theory and Applications 201, 433–469 (2024), arXiv:2203.09762.