Interior Point Newton method
Manopt.interior_point_Newton — Function

interior_point_Newton(M, f, grad_f, Hess_f, p=rand(M); kwargs...)
interior_point_Newton(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
interior_point_Newton!(M, f, grad_f, Hess_f, p; kwargs...)
interior_point_Newton!(M, ConstrainedManifoldObjective, p; kwargs...)

Perform the interior point Newton method following [LY24].
In order to solve the constrained problem
\[\begin{aligned} \min_{p ∈\mathcal{M}} &f(p)\\ \text{subject to } &g_i(p)\leq 0 \quad \text{ for } i= 1, …, m,\\ \quad &h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
this algorithm iteratively solves the linear system based on extending the KKT system by a slack variable $s$:
\[\operatorname{J} F(p, μ, λ, s)[X, Y, Z, W] = -F(p, μ, λ, s), \text{ where } X ∈ T_p\mathcal M, Y,W ∈ ℝ^m, Z ∈ ℝ^n,\]
see CondensedKKTVectorFieldJacobian and CondensedKKTVectorField, respectively, for the reduced form in which this system is usually solved. From the resulting X and Z of the reduced form, the other two components, $Y$ and $W$, are then computed.
From the gradient $(X,Y,Z,W)$ at the current iterate $(p, μ, λ, s)$, a line search is performed using the squared norm of the KKT vector field, KKTVectorFieldNormSq, and its gradient, KKTVectorFieldNormSqGradient, together with the InteriorPointCentralityCondition.
Note that since the vector field $F$ includes the gradients of the constraint functions $g, h$, its gradient or Jacobian requires the Hessians of the constraints.
For that search direction a line search is performed, which additionally ensures that the constraints remain fulfilled.
Input
- M: a manifold $\mathcal M$
- f: a cost function $f : \mathcal M → ℝ$ to minimize
- grad_f: the gradient $\operatorname{grad} f : \mathcal M → T \mathcal M$ of $f$
- Hess_f: the Hessian $\operatorname{Hess}f(p): T_p\mathcal M → T_p\mathcal M$, $X ↦ \operatorname{Hess}f(p)[X] = ∇_X\operatorname{grad}f(p)$
- p=rand(M): an initial value $p ∈ \mathcal M$
or a ConstrainedManifoldObjective cmo containing f, grad_f, Hess_f, and the constraints
Keyword arguments
The keyword arguments related to the constraints (the first eleven) are ignored if you pass a ConstrainedManifoldObjective cmo.
- centrality_condition=missing: an additional condition on when to accept a step size. This can be used to ensure that the resulting iterate is still an interior point if you provide a check (N,q) -> true/false, where N is the manifold of the step_problem.
- equality_constraints=nothing: the number $n$ of equality constraints.
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- g=nothing: the inequality constraints
- grad_g=nothing: the gradient of the inequality constraints
- grad_h=nothing: the gradient of the equality constraints
- gradient_range=nothing: specify how gradients are represented, where nothing is equivalent to NestedPowerRepresentation
- gradient_equality_range=gradient_range: specify how the gradients of the equality constraints are represented
- gradient_inequality_range=gradient_range: specify how the gradients of the inequality constraints are represented
- h=nothing: the equality constraints
- Hess_g=nothing: the Hessian of the inequality constraints
- Hess_h=nothing: the Hessian of the equality constraints
- inequality_constraints=nothing: the number $m$ of inequality constraints.
- λ=ones(length(h(M, p))): the Lagrange multiplier with respect to the equality constraints $h$
- μ=ones(length(g(M, p))): the Lagrange multiplier with respect to the inequality constraints $g$
- retraction_method=default_retraction_method(M, typeof(p)): the retraction to use, defaults to the default set for M with respect to the representation chosen for p.
- ρ=μ's / length(μ): store the orthogonality μ's/m to compute the barrier parameter β in the sub problem.
- s=copy(μ): initial value for the slack variables
- σ=calculate_σ(M, cmo, p, μ, λ, s): scaling factor for the barrier parameter β in the sub problem, which is updated during the iterations
- step_objective: a ManifoldGradientObjective of the norm of the KKT vector field KKTVectorFieldNormSq and its gradient KKTVectorFieldNormSqGradient
- step_problem: the manifold $\mathcal M × ℝ^m × ℝ^n × ℝ^m$ together with the step_objective as the problem the line search stepsize= employs for determining a step size
- step_state: the StepsizeState with point and search direction
- stepsize: an ArmijoLinesearch with the InteriorPointCentralityCondition as additional condition to accept a step. Note that this step size operates on its own step_problem and step_state
- stopping_criterion=StopAfterIteration(200)|StopWhenKKTResidualLess(1e-8): a stopping criterion, by default depending on the residual of the KKT vector field or a maximal number of steps, whichever hits first.
- sub_kwargs=(;): keyword arguments to decorate the sub options, for example debug, that automatically respects the main solver's debug options (like sub-sampling) as well
- sub_objective: the SymmetricLinearSystemObjective modelling the system of equations to use in the sub solver, including the CondensedKKTVectorFieldJacobian $\mathcal A(X)$ and the CondensedKKTVectorField $b$ in $\mathcal A(X) + b = 0$ we aim to solve. This is used to set up the sub_problem. If you set the sub_problem directly, this keyword has no effect.
- sub_stopping_criterion=StopAfterIteration(manifold_dimension(M))|StopWhenRelativeResidualLess(c,1e-8), where $c = \lVert b \rVert$ from the system to solve. This keyword is used in the sub_state. If you set that keyword directly, this keyword does not have an effect.
- sub_problem: combining the sub_objective and the tangent space at $(p,λ)$ on the manifold $\mathcal M × ℝ^n$ into a manopt problem. This is the manifold and objective for the sub solver.
- sub_state=ConjugateResidualState: a state specifying the subsolver. This default is also decorated with the sub_kwargs....
- vector_space=Rn: a function that, given an integer, returns the manifold to be used for the vector space components $ℝ^m,ℝ^n$
- X=zero_vector(M,p): the initial gradient with respect to p.
- Y=zero(μ): the initial gradient with respect to μ
- Z=zero(λ): the initial gradient with respect to λ
- W=zero(s): the initial gradient with respect to s
There are also internal keywords, such as _step_M, _step_p, _sub_M, _sub_p, and _sub_X, used to set up these given keywords; they should not be changed.
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective, respectively.
The default centrality_condition=missing disables the centrality check during the line search, but you can pass InteriorPointCentralityCondition(cmo, γ), where γ is a constant, to activate this check.
Output
The obtained approximate constrained minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
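For illustration, a minimal usage sketch: the cost, the single inequality constraint, and their derivatives below are made up for this example, and Manifolds.jl is assumed to provide the Sphere.

```julia
using Manopt, Manifolds

M = Sphere(2)                                    # unit sphere in ℝ³
f(M, p) = p[1] + p[3]                            # a linear cost to minimize
grad_f(M, p) = project(M, p, [1.0, 0.0, 1.0])    # Riemannian gradient of f
Hess_f(M, p, X) = -(p[1] + p[3]) * X             # Riemannian Hessian of f along X
# one inequality constraint g₁(p) = -p₃ ≤ 0, that is, p₃ ≥ 0
g(M, p) = [-p[3]]
grad_g(M, p) = [project(M, p, [0.0, 0.0, -1.0])]
Hess_g(M, p, X) = [p[3] * X]

p0 = [0.0, 0.0, 1.0]                             # a strictly feasible (interior) start point

q = interior_point_Newton(
    M, f, grad_f, Hess_f, p0;
    g=g, grad_g=grad_g, Hess_g=Hess_g,
)
# q should be close to [-1.0, 0.0, 0.0], where the constraint is active
```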
Manopt.interior_point_Newton! — Function

interior_point_Newton(M, f, grad_f, Hess_f, p=rand(M); kwargs...)
interior_point_Newton(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)
interior_point_Newton!(M, f, grad_f, Hess_f, p; kwargs...)
interior_point_Newton!(M, ConstrainedManifoldObjective, p; kwargs...)

Perform the interior point Newton method following [LY24].
In order to solve the constrained problem
\[\begin{aligned} \min_{p ∈\mathcal{M}} &f(p)\\ \text{subject to } &g_i(p)\leq 0 \quad \text{ for } i= 1, …, m,\\ \quad &h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
this algorithm iteratively solves the linear system based on extending the KKT system by a slack variable $s$:
\[\operatorname{J} F(p, μ, λ, s)[X, Y, Z, W] = -F(p, μ, λ, s), \text{ where } X ∈ T_p\mathcal M, Y,W ∈ ℝ^m, Z ∈ ℝ^n,\]
see CondensedKKTVectorFieldJacobian and CondensedKKTVectorField, respectively, for the reduced form in which this system is usually solved. From the resulting X and Z of the reduced form, the other two components, $Y$ and $W$, are then computed.
From the gradient $(X,Y,Z,W)$ at the current iterate $(p, μ, λ, s)$, a line search is performed using the squared norm of the KKT vector field, KKTVectorFieldNormSq, and its gradient, KKTVectorFieldNormSqGradient, together with the InteriorPointCentralityCondition.
Note that since the vector field $F$ includes the gradients of the constraint functions $g, h$, its gradient or Jacobian requires the Hessians of the constraints.
For that search direction a line search is performed, which additionally ensures that the constraints remain fulfilled.
Input
- M: a manifold $\mathcal M$
- f: a cost function $f : \mathcal M → ℝ$ to minimize
- grad_f: the gradient $\operatorname{grad} f : \mathcal M → T \mathcal M$ of $f$
- Hess_f: the Hessian $\operatorname{Hess}f(p): T_p\mathcal M → T_p\mathcal M$, $X ↦ \operatorname{Hess}f(p)[X] = ∇_X\operatorname{grad}f(p)$
- p=rand(M): an initial value $p ∈ \mathcal M$
or a ConstrainedManifoldObjective cmo containing f, grad_f, Hess_f, and the constraints
Keyword arguments
The keyword arguments related to the constraints (the first eleven) are ignored if you pass a ConstrainedManifoldObjective cmo.
- centrality_condition=missing: an additional condition on when to accept a step size. This can be used to ensure that the resulting iterate is still an interior point if you provide a check (N,q) -> true/false, where N is the manifold of the step_problem.
- equality_constraints=nothing: the number $n$ of equality constraints.
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- g=nothing: the inequality constraints
- grad_g=nothing: the gradient of the inequality constraints
- grad_h=nothing: the gradient of the equality constraints
- gradient_range=nothing: specify how gradients are represented, where nothing is equivalent to NestedPowerRepresentation
- gradient_equality_range=gradient_range: specify how the gradients of the equality constraints are represented
- gradient_inequality_range=gradient_range: specify how the gradients of the inequality constraints are represented
- h=nothing: the equality constraints
- Hess_g=nothing: the Hessian of the inequality constraints
- Hess_h=nothing: the Hessian of the equality constraints
- inequality_constraints=nothing: the number $m$ of inequality constraints.
- λ=ones(length(h(M, p))): the Lagrange multiplier with respect to the equality constraints $h$
- μ=ones(length(g(M, p))): the Lagrange multiplier with respect to the inequality constraints $g$
- retraction_method=default_retraction_method(M, typeof(p)): the retraction to use, defaults to the default set for M with respect to the representation chosen for p.
- ρ=μ's / length(μ): store the orthogonality μ's/m to compute the barrier parameter β in the sub problem.
- s=copy(μ): initial value for the slack variables
- σ=calculate_σ(M, cmo, p, μ, λ, s): scaling factor for the barrier parameter β in the sub problem, which is updated during the iterations
- step_objective: a ManifoldGradientObjective of the norm of the KKT vector field KKTVectorFieldNormSq and its gradient KKTVectorFieldNormSqGradient
- step_problem: the manifold $\mathcal M × ℝ^m × ℝ^n × ℝ^m$ together with the step_objective as the problem the line search stepsize= employs for determining a step size
- step_state: the StepsizeState with point and search direction
- stepsize: an ArmijoLinesearch with the InteriorPointCentralityCondition as additional condition to accept a step. Note that this step size operates on its own step_problem and step_state
- stopping_criterion=StopAfterIteration(200)|StopWhenKKTResidualLess(1e-8): a stopping criterion, by default depending on the residual of the KKT vector field or a maximal number of steps, whichever hits first.
- sub_kwargs=(;): keyword arguments to decorate the sub options, for example debug, that automatically respects the main solver's debug options (like sub-sampling) as well
- sub_objective: the SymmetricLinearSystemObjective modelling the system of equations to use in the sub solver, including the CondensedKKTVectorFieldJacobian $\mathcal A(X)$ and the CondensedKKTVectorField $b$ in $\mathcal A(X) + b = 0$ we aim to solve. This is used to set up the sub_problem. If you set the sub_problem directly, this keyword has no effect.
- sub_stopping_criterion=StopAfterIteration(manifold_dimension(M))|StopWhenRelativeResidualLess(c,1e-8), where $c = \lVert b \rVert$ from the system to solve. This keyword is used in the sub_state. If you set that keyword directly, this keyword does not have an effect.
- sub_problem: combining the sub_objective and the tangent space at $(p,λ)$ on the manifold $\mathcal M × ℝ^n$ into a manopt problem. This is the manifold and objective for the sub solver.
- sub_state=ConjugateResidualState: a state specifying the subsolver. This default is also decorated with the sub_kwargs....
- vector_space=Rn: a function that, given an integer, returns the manifold to be used for the vector space components $ℝ^m,ℝ^n$
- X=zero_vector(M,p): the initial gradient with respect to p.
- Y=zero(μ): the initial gradient with respect to μ
- Z=zero(λ): the initial gradient with respect to λ
- W=zero(s): the initial gradient with respect to s
There are also internal keywords, such as _step_M, _step_p, _sub_M, _sub_p, and _sub_X, used to set up these given keywords; they should not be changed.
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective, respectively.
The default centrality_condition=missing disables the centrality check during the line search, but you can pass InteriorPointCentralityCondition(cmo, γ), where γ is a constant, to activate this check.
Output
The obtained approximate constrained minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
State
Manopt.InteriorPointNewtonState — Type

InteriorPointNewtonState <: AbstractHessianSolverState

Fields
- λ: the Lagrange multiplier with respect to the equality constraints
- μ: the Lagrange multiplier with respect to the inequality constraints
- p: the current iterate
- s: the current slack variable
- sub_problem: an AbstractManoptProblem problem for the subsolver
- sub_state: an AbstractManoptSolverState for the subsolver
- X: the current gradient with respect to p
- Y: the current gradient with respect to μ
- Z: the current gradient with respect to λ
- W: the current gradient with respect to s
- ρ: store the orthogonality μ's/m to compute the barrier parameter β in the sub problem
- σ: scaling factor for the barrier parameter β in the sub problem
- stop: a StoppingCriterion indicating when to stop.
- retraction_method: the retraction method to use on M.
- stepsize::Stepsize: the stepsize to use
- step_problem: an AbstractManoptProblem storing the manifold and objective for the line search
- step_state: storing iterate and search direction in a state for the line search, see StepsizeState
Constructor
InteriorPointNewtonState(
M::AbstractManifold,
cmo::ConstrainedManifoldObjective,
p,
sub_problem::Pr,
sub_state::St;
kwargs...
)

Initialize the state, where both the AbstractManifold and the ConstrainedManifoldObjective are used to fill in reasonable defaults for the keywords.
Input
- M: a Riemannian manifold
- cmo: a ConstrainedManifoldObjective
- p: a point on M as the initial point of the algorithm
- sub_problem: an AbstractManoptProblem problem for the sub solver
- sub_state: an AbstractManoptSolverState for the sub solver
Keyword arguments
Let m and n denote the number of inequality and equality constraints, respectively.
- μ=ones(m)
- X=zero_vector(M,p)
- Y=zero(μ)
- λ=zeros(n)
- Z=zero(λ)
- s=ones(m)
- W=zero(s)
- ρ=μ's/m
- σ=calculate_σ(M, cmo, p, μ, λ, s)
- stopping_criterion=StopAfterIteration(200)|StopWhenChangeLess(1e-8)
- retraction_method=default_retraction_method(M, typeof(p))
- step_objective=ManifoldGradientObjective(KKTVectorFieldNormSq(cmo), KKTVectorFieldNormSqGradient(cmo); evaluation=InplaceEvaluation())
- vector_space=Rn: a function that, given an integer, returns the manifold to be used for the vector space components $ℝ^m,ℝ^n$
- step_problem: wrap the manifold $\mathcal M × ℝ^m × ℝ^n × ℝ^m$
- step_state: the StepsizeState with point and search direction
- stepsize: an ArmijoLinesearch with the InteriorPointCentralityCondition as additional condition to accept a step. Note that this step size operates on its own step_problem and step_state
and internally _step_M and _step_p for the manifold and point in the stepsize.
Subproblem functions
Manopt.CondensedKKTVectorField — Type

CondensedKKTVectorField{O<:ConstrainedManifoldObjective,T,R} <: AbstractConstrainedSlackFunctor{T,R}

Given the constrained optimization problem
\[\begin{aligned} \min_{p ∈\mathcal{M}} &f(p)\\ \text{subject to } &g_i(p)\leq 0 \quad \text{ for } i= 1, …, m,\\ \quad &h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
we reformulate the KKT conditions obtained from the optimality conditions of the Lagrangian
\[\mathcal L(p, μ, λ) = f(p) + \sum_{j=1}^n λ_jh_j(p) + \sum_{i=1}^m μ_ig_i(p)\]
in a perturbed / barrier method in a condensed form, using a slack variable $s ∈ ℝ^m$, a barrier parameter $β$, and the Riemannian gradient of the Lagrangian with respect to the first parameter, $\operatorname{grad}_p \mathcal L(p, μ, λ)$.
Let $\mathcal N = \mathcal M × ℝ^n$. We obtain the linear system
\[\mathcal A(p,λ)[X,Y] = -b(p,λ),\qquad \text{where } (X,Y) ∈ T_{(p,λ)}\mathcal N\]
where $\mathcal A: T_{(p,λ)}\mathcal N → T_{(p,λ)}\mathcal N$ is a linear operator and this struct models the right hand side $b(p,λ) ∈ T_{(p,λ)}\mathcal N$ given by
\[b(p,λ) = \begin{pmatrix} \operatorname{grad} f(p) + \displaystyle\sum_{j=1}^n λ_j \operatorname{grad} h_j(p) + \displaystyle\sum_{i=1}^m μ_i \operatorname{grad} g_i(p) + \displaystyle\sum_{i=1}^m \frac{μ_i}{s_i}\bigl( μ_i(g_i(p)+s_i) + β - μ_is_i \bigr)\operatorname{grad} g_i(p)\\ h(p) \end{pmatrix}\]
Fields
- cmo: the ConstrainedManifoldObjective
- μ::T: the vector in $ℝ^m$ of coefficients for the inequality constraints
- s::T: the vector in $ℝ^m$ of slack variables
- β::R: the barrier parameter $β∈ℝ$
Constructor
CondensedKKTVectorField(cmo, μ, s, β)

Manopt.CondensedKKTVectorFieldJacobian — Type

CondensedKKTVectorFieldJacobian{O<:ConstrainedManifoldObjective,T,R} <: AbstractConstrainedSlackFunctor{T,R}

Given the constrained optimization problem
\[\begin{aligned} \min_{p ∈\mathcal{M}} &f(p)\\ \text{subject to } &g_i(p)\leq 0 \quad \text{ for } i= 1, …, m,\\ \quad &h_j(p)=0 \quad \text{ for } j=1,…,n, \end{aligned}\]
we reformulate the KKT conditions obtained from the optimality conditions of the Lagrangian
\[\mathcal L(p, μ, λ) = f(p) + \sum_{j=1}^n λ_jh_j(p) + \sum_{i=1}^m μ_ig_i(p)\]
in a perturbed / barrier method enhanced as well as condensed form, using $\operatorname{grad}_p \mathcal L(p, μ, λ)$, the Riemannian gradient of the Lagrangian with respect to the first parameter.
Let $\mathcal N = \mathcal M × ℝ^n$. We obtain the linear system
\[\mathcal A(p,λ)[X,Y] = -b(p,λ),\qquad \text{where } X ∈ T_p\mathcal M, Y ∈ ℝ^n\]
where $\mathcal A: T_{(p,λ)}\mathcal N → T_{(p,λ)}\mathcal N$ is a linear operator on $T_{(p,λ)}\mathcal N = T_p\mathcal M × ℝ^n$ given by
\[\mathcal A(p,λ)[X,Y] = \begin{pmatrix} \operatorname{Hess}_p\mathcal L(p, μ, λ)[X] + \displaystyle\sum_{i=1}^m \frac{μ_i}{s_i}⟨\operatorname{grad} g_i(p), X⟩\operatorname{grad} g_i(p) + \displaystyle\sum_{j=1}^n Y_j \operatorname{grad} h_j(p) \\ \Bigl( ⟨\operatorname{grad} h_j(p), X⟩ \Bigr)_{j=1}^n \end{pmatrix}\]
Fields
- cmo: the ConstrainedManifoldObjective
- μ::V: the vector in $ℝ^m$ of coefficients for the inequality constraints
- s::V: the vector in $ℝ^m$ of slack variables
- β::R: the barrier parameter $β∈ℝ$
Constructor
CondensedKKTVectorFieldJacobian(cmo, μ, s, β)

Manopt.KKTVectorField — Type

KKTVectorField{O<:ConstrainedManifoldObjective}

Implement the vector field $F$ of the KKT conditions, including a slack variable for the inequality constraints.
Given the LagrangianCost
\[\mathcal L(p; μ, λ) = f(p) + \sum_{i=1}^m μ_ig_i(p) + \sum_{j=1}^n λ_jh_j(p)\]
its gradient, the LagrangianGradient,

\[\operatorname{grad}\mathcal L(p, μ, λ) = \operatorname{grad}f(p) + \sum_{j=1}^n λ_j \operatorname{grad} h_j(p) + \sum_{i=1}^m μ_i \operatorname{grad} g_i(p),\]
and introducing the slack variables $s=-g(p) ∈ ℝ^m$ the vector field is given by
\[F(p, μ, λ, s) = \begin{pmatrix} \operatorname{grad}_p \mathcal L(p, μ, λ)\\ g(p) + s\\ h(p)\\ μ ⊙ s \end{pmatrix}, \text{ where } p \in \mathcal M, μ, s \in ℝ^m\text{ and } λ \in ℝ^n,\]
where $⊙$ denotes the Hadamard (or elementwise) product
Fields
- cmo: the ConstrainedManifoldObjective
While the point p is arbitrary and usually not needed, it serves as internal memory in the computations. Furthermore, both fields together also clarify the product manifold structure to use.
Constructor
KKTVectorField(cmo::ConstrainedManifoldObjective)

Example
Define F = KKTVectorField(cmo) for some ConstrainedManifoldObjective cmo and let N be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this vector field as F(N, q) or as the in-place variant F(N, Y, q), where q is a point on N and Y is a tangent vector at q in which the result is stored.
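As a sketch of this call pattern, assuming a manifold M, the constraint numbers m and n, and a ConstrainedManifoldObjective cmo (for example the one interior_point_Newton builds from f, grad_f, Hess_f, and the constraints) are already set up:

```julia
using Manopt, Manifolds

F = KKTVectorField(cmo)
N = M × Euclidean(m) × Euclidean(n) × Euclidean(m)  # the product manifold 𝓜 × ℝᵐ × ℝⁿ × ℝᵐ
q = rand(N)                                         # a (random) point (p, μ, λ, s) on N
Fq = F(N, q)                                        # allocating evaluation of F at q
Y = zero_vector(N, q)
F(N, Y, q)                                          # in-place variant, storing the result in Y
```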
Manopt.KKTVectorFieldJacobian — Type

KKTVectorFieldJacobian{O<:ConstrainedManifoldObjective}

Implement the Jacobian of the vector field $F$ of the KKT conditions, including a slack variable for the inequality constraints, see KKTVectorField and KKTVectorFieldAdjointJacobian.
\[\operatorname{J} F(p, μ, λ, s)[X, Y, Z, W] = \begin{pmatrix} \operatorname{Hess}_p \mathcal L(p, μ, λ)[X] + \displaystyle\sum_{i=1}^m Y_i \operatorname{grad} g_i(p) + \displaystyle\sum_{j=1}^n Z_j \operatorname{grad} h_j(p)\\ \Bigl( ⟨\operatorname{grad} g_i(p), X⟩ + W_i\Bigr)_{i=1}^m\\ \Bigl( ⟨\operatorname{grad} h_j(p), X⟩ \Bigr)_{j=1}^n\\ μ ⊙ W + s ⊙ Y \end{pmatrix},\]
where $⊙$ denotes the Hadamard (or elementwise) product
See also the LagrangianHessian $\operatorname{Hess}_p \mathcal L(p, μ, λ)[X]$.
Fields
- cmo: the ConstrainedManifoldObjective
Constructor
KKTVectorFieldJacobian(cmo::ConstrainedManifoldObjective)

Generate the Jacobian of the KKT vector field related to some ConstrainedManifoldObjective cmo.

Example
Example
Define JF = KKTVectorFieldJacobian(cmo) for some ConstrainedManifoldObjective cmo and let N be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this Jacobian as JF(N, q, Y) or as the in-place variant JF(N, Z, q, Y), where q is a point on N and Y and Z are tangent vectors at q; in the in-place variant the result is stored in Z.
Manopt.KKTVectorFieldAdjointJacobian — Type

KKTVectorFieldAdjointJacobian{O<:ConstrainedManifoldObjective}

Implement the adjoint of the Jacobian of the vector field $F$ of the KKT conditions, including a slack variable for the inequality constraints, see KKTVectorField and KKTVectorFieldJacobian.
\[\operatorname{J}^* F(p, μ, λ, s)[X, Y, Z, W] = \begin{pmatrix} \operatorname{Hess}_p \mathcal L(p, μ, λ)[X] + \displaystyle\sum_{i=1}^m Y_i \operatorname{grad} g_i(p) + \displaystyle\sum_{j=1}^n Z_j \operatorname{grad} h_j(p)\\ \Bigl( ⟨\operatorname{grad} g_i(p), X⟩ + s_iW_i\Bigr)_{i=1}^m\\ \Bigl( ⟨\operatorname{grad} h_j(p), X⟩ \Bigr)_{j=1}^n\\ μ ⊙ W + Y \end{pmatrix},\]
where $⊙$ denotes the Hadamard (or elementwise) product
See also the LagrangianHessian $\operatorname{Hess}_p \mathcal L(p, μ, λ)[X]$.
Fields
- cmo: the ConstrainedManifoldObjective
Constructor
KKTVectorFieldAdjointJacobian(cmo::ConstrainedManifoldObjective)

Generate the adjoint Jacobian of the KKT vector field related to some ConstrainedManifoldObjective cmo.

Example
Example
Define AdJF = KKTVectorFieldAdjointJacobian(cmo) for some ConstrainedManifoldObjective cmo and let N be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this adjoint Jacobian as AdJF(N, q, Y) or as the in-place variant AdJF(N, Z, q, Y), where q is a point on N and Y and Z are tangent vectors at q; in the in-place variant the result is stored in Z.
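With cmo, N, and q as in the sketch above, the following illustrates the defining relation between the Jacobian and its adjoint, $⟨\operatorname{J} F(q)[V], W⟩ = ⟨V, \operatorname{J}^* F(q)[W]⟩$:

```julia
JF = KKTVectorFieldJacobian(cmo)
AdJF = KKTVectorFieldAdjointJacobian(cmo)
V = rand(N; vector_at=q)   # two random tangent vectors at q
W = rand(N; vector_at=q)
# the adjoint relation should hold up to numerical error
isapprox(inner(N, q, JF(N, q, V), W), inner(N, q, V, AdJF(N, q, W)))
```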
Manopt.KKTVectorFieldNormSq — Type

KKTVectorFieldNormSq{O<:ConstrainedManifoldObjective}

Implement the squared norm of the vector field $F$ of the KKT conditions, including a slack variable for the inequality constraints, see KKTVectorField, to which this functor applies the squared norm. In [LY24] this is called the merit function.
Fields
- cmo: the ConstrainedManifoldObjective
Constructor
KKTVectorFieldNormSq(cmo::ConstrainedManifoldObjective)

Example
Define f = KKTVectorFieldNormSq(cmo) for some ConstrainedManifoldObjective cmo and let N be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this cost as f(N, q), where q is a point on N.
Manopt.KKTVectorFieldNormSqGradient — Type

KKTVectorFieldNormSqGradient{O<:ConstrainedManifoldObjective}

Compute the gradient of the KKTVectorFieldNormSq $φ(p,μ,λ,s) = \lVert F(p,μ,λ,s)\rVert^2$, that is, of the norm squared of the KKTVectorField $F$.
This is given in [LY24] as the gradient of their merit function, which we can write with the adjoint $J^*$ of the Jacobian
\[\operatorname{grad} φ = 2\operatorname{J}^* F(p, μ, λ, s)[F(p, μ, λ, s)],\]
and hence is computed with KKTVectorFieldAdjointJacobian and KKTVectorField.
For completeness, the gradient reads, using the LagrangianGradient $L = \operatorname{grad}_p \mathcal L(p,μ,λ) ∈ T_p\mathcal M$ as a shorthand for the first component of $F$, as
\[\operatorname{grad} φ = 2 \begin{pmatrix} \operatorname{grad}_p \mathcal L(p,μ,λ)[L] + (g_i(p) + s_i)\operatorname{grad} g_i(p) + h_j(p)\operatorname{grad} h_j(p)\\ \Bigl( ⟨\operatorname{grad} g_i(p), L⟩ + s_i\Bigr)_{i=1}^m + μ ⊙ s ⊙ s\\ \Bigl( ⟨\operatorname{grad} h_j(p), L⟩ \Bigr)_{j=1}^n\\ g + s + μ ⊙ μ ⊙ s \end{pmatrix},\]
where $⊙$ denotes the Hadamard (or elementwise) product.
Fields
- cmo: the ConstrainedManifoldObjective
Constructor
KKTVectorFieldNormSqGradient(cmo::ConstrainedManifoldObjective)

Example
Define grad_f = KKTVectorFieldNormSqGradient(cmo) for some ConstrainedManifoldObjective cmo and let N be the product manifold of $\mathcal M×ℝ^m×ℝ^n×ℝ^m$. Then, you can call this gradient as grad_f(N, q) or as the in-place variant grad_f(N, Y, q), where q is a point on N and Y is a tangent vector at q in which the resulting gradient is stored.
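Together, these two functors provide the merit function and its gradient for the line search; the default step_objective from the solver's keyword arguments can be sketched as follows (again assuming cmo as well as N and q from the sketches above):

```julia
φ = KKTVectorFieldNormSq(cmo)                 # merit function φ(q) = ‖F(q)‖²
grad_φ = KKTVectorFieldNormSqGradient(cmo)    # its Riemannian gradient
step_objective = ManifoldGradientObjective(φ, grad_φ; evaluation=InplaceEvaluation())

φ(N, q)                                       # evaluate the merit function at q
Y = zero_vector(N, q)
grad_φ(N, Y, q)                               # in-place gradient evaluation
```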
Helpers
Manopt.InteriorPointCentralityCondition — Type

InteriorPointCentralityCondition{CO,R}

A functor to check the centrality condition.
In order to obtain a step in the line search performed within interior_point_Newton, Section 6 of [LY24] proposes the following additional conditions to hold, inspired by the Euclidean case described in Section 6 of [ETTZ96]:
For a given ConstrainedManifoldObjective, consider the KKTVectorField $F$, that is, we are at a point $q = (p, λ, μ, s)$ on $\mathcal M × ℝ^m × ℝ^n × ℝ^m$ and have a search direction $V = (X, Y, Z, W)$.
Then, let
\[τ_1 = \frac{m⋅\min\{ μ ⊙ s\}}{μ^{\mathrm{T}}s} \quad\text{ and }\quad τ_2 = \frac{μ^{\mathrm{T}}s}{\lVert F(q) \rVert}\]
where $⊙$ denotes the Hadamard (or elementwise) product.
For a new candidate $q(α) = \bigl(p(α), λ(α), μ(α), s(α)\bigr) := (\operatorname{retr}_p(αX), λ+αY, μ+αZ, s+αW)$, we then define two functions
\[c_1(α) = \min\{ μ(α) ⊙ s(α) \} - \frac{γτ_1 μ(α)^{\mathrm{T}}s(α)}{m} \quad\text{ and }\quad c_2(α) = μ(α)^{\mathrm{T}}s(α) - γτ_2 \lVert F(q(α)) \rVert.\]
While the paper now states that the (Armijo) linesearch starts at a point $\tilde α$, it is easier to include the condition that $c_1(α) ≥ 0$ and $c_2(α) ≥ 0$ into the linesearch as well.
The functor InteriorPointCentralityCondition(cmo, γ, μ, s, normKKT)(N,qα) defined here evaluates this condition and returns true if both $c_1$ and $c_2$ are nonnegative.
Fields
- cmo: a ConstrainedManifoldObjective
- γ: a constant
- τ1, τ2: the constants given in the formula.
Constructor
InteriorPointCentralityCondition(cmo, γ)
InteriorPointCentralityCondition(cmo, γ, τ1, τ2)

Initialise the centrality conditions. The parameters τ1, τ2 are initialised to zero if not provided.
Besides get_manopt_parameter for all three constants and set_manopt_parameter! for $γ$, you can call set_manopt_parameter!(ipcc, :τ, N, q) to update both $τ_1$ and $τ_2$ according to the formulae above.
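A sketch of activating this check in the solver, where γ = 0.9 is an arbitrary illustrative value and M, cmo, and p0 are assumed to be set up as before:

```julia
ipcc = InteriorPointCentralityCondition(cmo, 0.9)   # γ = 0.9, τ₁ = τ₂ = 0 until updated
q = interior_point_Newton(
    M, cmo, p0;
    centrality_condition=ipcc,                      # additional acceptance check in the line search
)
```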
Manopt.calculate_σ — Function

calculate_σ(M, cmo, p, μ, λ, s)

Compute the new $σ$ factor for the barrier parameter in interior_point_Newton as
\[\min\{\frac{1}{2}, \lVert F(p; μ, λ, s)\rVert^{\frac{1}{2}} \},\]
where $F$ is the KKT vector field, hence the KKTVectorFieldNormSq is used.
Keyword arguments
- vector_space=Rn: a function that, given an integer, returns the manifold to be used for the vector space components $ℝ^m,ℝ^n$
- N: the manifold $\mathcal M × ℝ^m × ℝ^n × ℝ^m$ the vector field lives on (generated using vector_space)
- q: provides memory on N for interim computations
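A small sketch of a direct call, with M, cmo, p as before and the multipliers and slack initialized as in the solver's defaults:

```julia
μ, λ, s = ones(m), ones(n), ones(m)   # default-style initialization of multipliers and slack
σ = calculate_σ(M, cmo, p, μ, λ, s)   # σ = min(1/2, ‖F(p, μ, λ, s)‖^(1/2))
```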
Additional stopping criteria
Manopt.StopWhenKKTResidualLess — Type

StopWhenKKTResidualLess <: StoppingCriterion

Stop when the KKT residual

\[r^2 = \lVert \operatorname{grad}_p \mathcal L(p, μ, λ) \rVert^2 + \sum_{i=1}^m \bigl([μ_i]_{-}^2 + [g_i(p)]_+^2 + \lvert μ_ig_i(p)\rvert^2\bigr) + \sum_{j=1}^n \lvert h_j(p)\rvert^2\]

is less than a given threshold, that is, when $r < ε$. We use $[v]_+ = \max\{0,v\}$ and $[v]_- = \min\{0,v\}$ for the positive and negative part of $v$, respectively.
Fields
- ε: a threshold
- at_iteration:
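This criterion appears in the solver's default stopping_criterion; to tighten the residual tolerance one can, for example, pass a combined criterion (a sketch, reusing the setup from the usage sketch at the top):

```julia
q = interior_point_Newton(
    M, f, grad_f, Hess_f, p0;
    g=g, grad_g=grad_g, Hess_g=Hess_g,
    stopping_criterion=StopAfterIteration(300) | StopWhenKKTResidualLess(1e-10),
)
```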
References
- [ETTZ96] A. S. El-Bakry, R. A. Tapia, T. Tsuchiya and Y. Zhang. On the formulation and theory of the Newton interior-point method for nonlinear programming. Journal of Optimization Theory and Applications 89, 507–541 (1996).
- [LY24] Z. Lai and A. Yoshise. Riemannian Interior Point Methods for Constrained Optimization on Manifolds. Journal of Optimization Theory and Applications 201, 433–469 (2024), arXiv:2203.09762.