Primal-dual Riemannian semismooth Newton algorithm
The Primal-dual Riemannian semismooth Newton Algorithm is a second-order method derived from the ChambollePock
.
The aim is to solve an optimization problem on a manifold with a cost function of the form
\[F(p) + G(Λ(p)),\]
where $F:\mathcal M → \overline{ℝ}$, $G:\mathcal N → \overline{ℝ}$, and $Λ:\mathcal M →\mathcal N$. If the manifolds $\mathcal M$ or $\mathcal N$ are not Hadamard, it has to be considered locally only, that is on geodesically convex sets $\mathcal C \subset \mathcal M$ and $\mathcal D \subset\mathcal N$ such that $Λ(\mathcal C) \subset \mathcal D$.
The algorithm comes down to applying the Riemannian semismooth Newton method to the rewritten primal-dual optimality conditions. Define the vector field $X: \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N} \rightarrow \mathcal{T} \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N}$ as
\[X\left(p, \xi_{n}\right):=\left(\begin{array}{c} -\log _{p} \operatorname{prox}_{\sigma F}\left(\exp _{p}\left(\mathcal{P}_{p \leftarrow m}\left(-\sigma\left(D_{m} \Lambda\right)^{*}\left[\mathcal{P}_{\Lambda(m) \leftarrow n} \xi_{n}\right]\right)^{\sharp}\right)\right) \\ \xi_{n}-\operatorname{prox}_{\tau G_{n}^{*}}\left(\xi_{n}+\tau\left(\mathcal{P}_{n \leftarrow \Lambda(m)} D_{m} \Lambda\left[\log _{m} p\right]\right)^{\flat}\right) \end{array}\right)\]
and solve for $X(p,ξ_{n})=0$.
Given base points $m∈\mathcal C$, $n=Λ(m)∈\mathcal D$, initial primal and dual values $p^{(0)} ∈\mathcal C$, $ξ_{n}^{(0)} ∈ \mathcal T_{n}^{*}\mathcal N$, and primal and dual step sizes $\sigma$, $\tau$.
The algorithms performs the steps $k=1,…,$ (until a StoppingCriterion
is reached)
- Choose any element
\[V^{(k)} ∈ ∂_C X(p^{(k)},ξ_n^{(k)})\]
of the Clarke generalized covariant derivative - Solve
\[V^{(k)} [(d_p^{(k)}, d_n^{(k)})] = - X(p^{(k)},ξ_n^{(k)})\]
in the vector space $\mathcal{T}_{p^{(k)}} \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N}$ - Update
\[p^{(k+1)} := \exp_{p^{(k)}}(d_p^{(k)})\]
and\[ξ_n^{(k+1)} := ξ_n^{(k)} + d_n^{(k)}\]
Furthermore you can exchange the exponential map, the logarithmic map, and the parallel transport by a retraction, an inverse retraction and a vector transport.
Finally you can also update the base points $m$ and $n$ during the iterations. This introduces a few additional vector transports. The same holds for the case that $Λ(m^{(k)})\neq n^{(k)}$ at some point. All these cases are covered in the algorithm.
Manopt.primal_dual_semismooth_Newton
— Functionprimal_dual_semismooth_Newton(M, N, cost, p, X, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_dual_G, linearized_operator, adjoint_linearized_operator)
Perform the Primal-Dual Riemannian semismooth Newton algorithm.
Given a cost
function $\mathcal E: \mathcal M → \overline{ℝ}$ of the form
\[\mathcal E(p) = F(p) + G( Λ(p) ),\]
where $F: \mathcal M → \overline{ℝ}$, $G: \mathcal N → \overline{ℝ}$, and $Λ: \mathcal M → \mathcal N$. The remaining input parameters are
p, X
: primal and dual start points $p∈\mathcal M$ and $X ∈ T_n\mathcal N$m,n
: base points on $\mathcal M$ and `\mathcal N
, respectively.linearized_forward_operator
: the linearization $DΛ(⋅)[⋅]$ of the operator $Λ(⋅)$.adjoint_linearized_operator
: the adjoint $DΛ^*$ of the linearized operator $DΛ(m): T_{m}\mathcal M → T_{Λ(m)}\mathcal N$prox_F, prox_G_Dual
: the proximal maps of $F$ and $G^\ast_n$diff_prox_F, diff_prox_dual_G
: the (Clarke Generalized) differentials of the proximal maps of $F$ and $G^\ast_n$
For more details on the algorithm, see [DL21].
Keyword arguments
dual_stepsize=1/sqrt(8)
: proximal parameter of the dual proxevaluation=
AllocatingEvaluation
()
: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation
) or whether they modify their input argument to return the result therein (InplaceEvaluation
). Since usually the first argument is the manifold, the modified argument is the second.inverse_retraction_method=
default_inverse_retraction_method
(M, typeof(p))
: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inversesΛ=missing
: the exact operator, that is required ifΛ(m)=n
does not hold;missing
indicates, that the forward operator is exact.primal_stepsize=1/sqrt(8)
: proximal parameter of the primal proxreg_param=1e-5
: regularisation parameter for the Newton matrix Note that this changes the arguments theforward_operator
is called.retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstopping_criterion=
StopAfterIteration
(50)
: a functor indicating that the stopping criterion is fulfilledupdate_primal_base=missing
: function to updatem
(identity by default/missing)update_dual_base=missing
: function to updaten
(identity by default/missing)vector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
Manopt.primal_dual_semismooth_Newton!
— Functionprimal_dual_semismooth_Newton(M, N, cost, p, X, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_dual_G, linearized_operator, adjoint_linearized_operator)
Perform the Primal-Dual Riemannian semismooth Newton algorithm.
Given a cost
function $\mathcal E: \mathcal M → \overline{ℝ}$ of the form
\[\mathcal E(p) = F(p) + G( Λ(p) ),\]
where $F: \mathcal M → \overline{ℝ}$, $G: \mathcal N → \overline{ℝ}$, and $Λ: \mathcal M → \mathcal N$. The remaining input parameters are
p, X
: primal and dual start points $p∈\mathcal M$ and $X ∈ T_n\mathcal N$m,n
: base points on $\mathcal M$ and `\mathcal N
, respectively.linearized_forward_operator
: the linearization $DΛ(⋅)[⋅]$ of the operator $Λ(⋅)$.adjoint_linearized_operator
: the adjoint $DΛ^*$ of the linearized operator $DΛ(m): T_{m}\mathcal M → T_{Λ(m)}\mathcal N$prox_F, prox_G_Dual
: the proximal maps of $F$ and $G^\ast_n$diff_prox_F, diff_prox_dual_G
: the (Clarke Generalized) differentials of the proximal maps of $F$ and $G^\ast_n$
For more details on the algorithm, see [DL21].
Keyword arguments
dual_stepsize=1/sqrt(8)
: proximal parameter of the dual proxevaluation=
AllocatingEvaluation
()
: specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation
) or whether they modify their input argument to return the result therein (InplaceEvaluation
). Since usually the first argument is the manifold, the modified argument is the second.inverse_retraction_method=
default_inverse_retraction_method
(M, typeof(p))
: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inversesΛ=missing
: the exact operator, that is required ifΛ(m)=n
does not hold;missing
indicates, that the forward operator is exact.primal_stepsize=1/sqrt(8)
: proximal parameter of the primal proxreg_param=1e-5
: regularisation parameter for the Newton matrix Note that this changes the arguments theforward_operator
is called.retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsstopping_criterion=
StopAfterIteration
(50)
: a functor indicating that the stopping criterion is fulfilledupdate_primal_base=missing
: function to updatem
(identity by default/missing)update_dual_base=missing
: function to updaten
(identity by default/missing)vector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
State
Manopt.PrimalDualSemismoothNewtonState
— TypePrimalDualSemismoothNewtonState <: AbstractPrimalDualSolverState
Fields
m::P
: a point on the manifold $\mathcal M$n::Q
: a point on the manifold $\mathcal N$p::P
: a point on the manifold $\mathcal M$storing the current iterateX::T
: a tangent vector at the point $p$ on the manifold $\mathcal M$primal_stepsize::Float64
: proximal parameter of the primal proxdual_stepsize::Float64
: proximal parameter of the dual proxreg_param::Float64
: regularisation parameter for the Newton matrixstop::StoppingCriterion
: a functor indicating that the stopping criterion is fulfilledupdate_primal_base
: function to update the primal baseupdate_dual_base
: function to update the dual baseinverse_retraction_method::AbstractInverseRetractionMethod
: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inversesretraction_method::AbstractRetractionMethod
: a retraction $\operatorname{retr}$ to use, see the section on retractionsvector_transport_method::AbstractVectorTransportMethodP
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
where for the update functions a AbstractManoptProblem
amp
, AbstractManoptSolverState
ams
and the current iterate i
are the arguments. If you activate these to be different from the default identity, you have to provide p.Λ
for the algorithm to work (which might be missing
).
Constructor
PrimalDualSemismoothNewtonState(M::AbstractManifold; kwargs...)
Generate a state for the primal_dual_semismooth_Newton
.
Keyword arguments
m=
rand
(M)
n=
rand
(N)
p=
rand
(M)
X=
zero_vector
(M, p)
primal_stepsize=1/sqrt(8)
dual_stepsize=1/sqrt(8)
reg_param=1e-5
update_primal_base=(amp, ams, k) -> o.m
update_dual_base=(amp, ams, k) -> o.n
retraction_method=
default_retraction_method
(M, typeof(p))
: a retraction $\operatorname{retr}$ to use, see the section on retractionsinverse_retraction_method=
default_inverse_retraction_method
(M, typeof(p))
: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inversesstopping_criterion=
[
StopAfterIteration](@ref)
(50)`: a functor indicating that the stopping criterion is fulfilledvector_transport_method=
default_vector_transport_method
(M, typeof(p))
: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Technical details
The primal_dual_semismooth_Newton
solver requires the following functions of a manifold to be available for both the manifold $\mathcal M$and $\mathcal N$
- A
retract!
(M, q, p, X)
; it is recommended to set thedefault_retraction_method
to a favourite retraction. If this default is set, aretraction_method=
does not have to be specified. - An
inverse_retract!
(M, X, p, q)
; it is recommended to set thedefault_inverse_retraction_method
to a favourite retraction. If this default is set, ainverse_retraction_method=
does not have to be specified. - A
vector_transport_to!
M, Y, p, X, q)
; it is recommended to set thedefault_vector_transport_method
to a favourite retraction. If this default is set, avector_transport_method=
does not have to be specified. - A
copyto!
(M, q, p)
andcopy
(M,p)
for points. - A
get_basis
for theDefaultOrthonormalBasis
on $\mathcal M$ exp
andlog
(on $\mathcal M$)- A
DiagonalizingOrthonormalBasis
to compute the differentials of the exponential and logarithmic map - Tangent vectors storing the social and cognitive vectors are initialized calling
zero_vector
(M,p)
.
Literature
- [DL21]
- W. Diepeveen and J. Lellmann. An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising. SIAM Journal on Imaging Sciences 14, 1565–1600 (2021), arXiv:2102.10309.