Primal-dual Riemannian semismooth Newton algorithm
The Primal-dual Riemannian semismooth Newton Algorithm is a second-order method derived from the ChambollePock.
The aim is to solve an optimization problem on a manifold with a cost function of the form
\[F(p) + G(Λ(p)),\]
where $F:\mathcal M → \overline{ℝ}$, $G:\mathcal N → \overline{ℝ}$, and $Λ:\mathcal M →\mathcal N$. If the manifolds $\mathcal M$ or $\mathcal N$ are not Hadamard, it has to be considered locally only, that is on geodesically convex sets $\mathcal C \subset \mathcal M$ and $\mathcal D \subset\mathcal N$ such that $Λ(\mathcal C) \subset \mathcal D$.
The algorithm comes down to applying the Riemannian semismooth Newton method to the rewritten primal-dual optimality conditions. Define the vector field $X: \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N} \rightarrow \mathcal{T} \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N}$ as
\[X\left(p, \xi_{n}\right):=\left(\begin{array}{c} -\log _{p} \operatorname{prox}_{\sigma F}\left(\exp _{p}\left(\mathcal{P}_{p \leftarrow m}\left(-\sigma\left(D_{m} \Lambda\right)^{*}\left[\mathcal{P}_{\Lambda(m) \leftarrow n} \xi_{n}\right]\right)^{\sharp}\right)\right) \\ \xi_{n}-\operatorname{prox}_{\tau G_{n}^{*}}\left(\xi_{n}+\tau\left(\mathcal{P}_{n \leftarrow \Lambda(m)} D_{m} \Lambda\left[\log _{m} p\right]\right)^{\flat}\right) \end{array}\right)\]
and solve for $X(p,ξ_{n})=0$.
Given base points $m∈\mathcal C$, $n=Λ(m)∈\mathcal D$, initial primal and dual values $p^{(0)} ∈\mathcal C$, $ξ_{n}^{(0)} ∈ \mathcal T_{n}^{*}\mathcal N$, and primal and dual step sizes $\sigma$, $\tau$.
The algorithms performs the steps $k=1,…,$ (until a StoppingCriterion is reached)
- Choose any element
\[V^{(k)} ∈ ∂_C X(p^{(k)},ξ_n^{(k)})\]
of the Clarke generalized covariant derivative - Solve
\[V^{(k)} [(d_p^{(k)}, d_n^{(k)})] = - X(p^{(k)},ξ_n^{(k)})\]
in the vector space $\mathcal{T}_{p^{(k)}} \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N}$ - Update
\[p^{(k+1)} := \exp_{p^{(k)}}(d_p^{(k)})\]
and\[ξ_n^{(k+1)} := ξ_n^{(k)} + d_n^{(k)}\]
Furthermore you can exchange the exponential map, the logarithmic map, and the parallel transport by a retraction, an inverse retraction and a vector transport.
Finally you can also update the base points $m$ and $n$ during the iterations. This introduces a few additional vector transports. The same holds for the case that $Λ(m^{(k)})\neq n^{(k)}$ at some point. All these cases are covered in the algorithm.
Manopt.primal_dual_semismooth_Newton — Functionprimal_dual_semismooth_Newton(M, N, cost, p, X, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_dual_G, linearized_operator, adjoint_linearized_operator)Perform the Primal-Dual Riemannian semismooth Newton algorithm.
Given a cost function $\mathcal E: \mathcal M → \overline{ℝ}$ of the form
\[\mathcal E(p) = F(p) + G( Λ(p) ),\]
where $F: \mathcal M → \overline{ℝ}$, $G: \mathcal N → \overline{ℝ}$, and $\Lambda: \mathcal M → \mathcal N$. The remaining input parameters are
p, X: primal and dual start points $p∈\mathcal M$ and $X ∈ T_n\mathcal N$m,n: base points on $\mathcal M$ and $\mathcal N$, respectively.linearized_forward_operator: the linearization $DΛ(⋅)[⋅]$ of the operator $Λ(⋅)$.adjoint_linearized_operator: the adjoint $DΛ^*$ of the linearized operator $DΛ(m): T_{m}\mathcal M → T_{Λ(m)}\mathcal N$prox_F, prox_G_Dual: the proximal maps of $F$ and $G^\ast_n$diff_prox_F, diff_prox_dual_G: the (Clarke Generalized) differentials of the proximal maps of $F$ and $G^\ast_n$
For more details on the algorithm, see [DL21].
Optional parameters
primal_stepsize: (1/sqrt(8)) proximal parameter of the primal proxΛ: (missing) the exact operator, that is required ifΛ(m)=ndoes not hold;missingindicates, that the forward operator is exact.dual_stepsize: (1/sqrt(8)) proximal parameter of the dual proxreg_param: (1e-5) regularisation parameter for the Newton matrix Note that this changes the arguments theforward_operatoris called.stopping_criterion: (stopAtIteration(50)) aStoppingCriterionupdate_primal_base: (missing) function to updatem(identity by default/missing)update_dual_base: (missing) function to updaten(identity by default/missing)retraction_method: (default_retraction_method(M, typeof(p))) the retraction to useinverse_retraction_method: (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.vector_transport_method: (default_vector_transport_method(M, typeof(p))) a vector transport to use
Output
the obtained (approximate) minimizer $p^*$, see get_solver_return for details
Manopt.primal_dual_semismooth_Newton! — Functionprimal_dual_semismooth_Newton(M, N, cost, x0, ξ0, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_G_dual, linearized_forward_operator, adjoint_linearized_operator)Perform the Riemannian Primal-dual Riemannian semismooth Newton algorithm in place of x, ξ, and potentially m, n if they are not fixed. See primal_dual_semismooth_Newton for details and optional parameters.
State
Manopt.PrimalDualSemismoothNewtonState — TypePrimalDualSemismoothNewtonState <: AbstractPrimalDualSolverStatem: base point on $\mathcal M$n: base point on $\mathcal N$x: an initial point on $x^{(0)} ∈ \mathcal M$ (and its previous iterate)ξ: an initial tangent vector $\xi^{(0)} ∈ T_{n}^*\mathcal N$ (and its previous iterate)primal_stepsize: (1/sqrt(8)) proximal parameter of the primal proxdual_stepsize: (1/sqrt(8)) proximal parameter of the dual proxreg_param: (1e-5) regularisation parameter for the Newton matrixstop: aStoppingCriterionupdate_primal_base: (( amp, ams, i) -> o.m) function to update the primal baseupdate_dual_base: ((amp, ams, i) -> o.n) function to update the dual baseretraction_method: (default_retraction_method(M, typeof(p))) the retraction to useinverse_retraction_method: (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.vector_transport_method: (default_vector_transport_method(M, typeof(p))) a vector transport to use
where for the update functions a AbstractManoptProblem amp, AbstractManoptSolverState ams and the current iterate i are the arguments. If you activate these to be different from the default identity, you have to provide p.Λ for the algorithm to work (which might be missing).
Constructor
PrimalDualSemismoothNewtonState(M::AbstractManifold,
m::P, n::Q, x::P, ξ::T, primal_stepsize::Float64, dual_stepsize::Float64, reg_param::Float64;
stopping_criterion::StoppingCriterion = StopAfterIteration(50),
update_primal_base::Union{Function,Missing} = missing,
update_dual_base::Union{Function,Missing} = missing,
retraction_method = default_retraction_method(M, typeof(p)),
inverse_retraction_method = default_inverse_retraction_method(M, typeof(p)),
vector_transport_method = default_vector_transport_method(M, typeof(p)),
)Technical details
The primal_dual_semismooth_Newton solver requires the following functions of a manifold to be available for both the manifold $\mathcal M$and $\mathcal N$
- A
retract!(M, q, p, X); it is recommended to set thedefault_retraction_methodto a favourite retraction. If this default is set, aretraction_method=does not have to be specified. - An
inverse_retract!(M, X, p, q); it is recommended to set thedefault_inverse_retraction_methodto a favourite retraction. If this default is set, ainverse_retraction_method=does not have to be specified. - A
vector_transport_to!M, Y, p, X, q); it is recommended to set thedefault_vector_transport_methodto a favourite retraction. If this default is set, avector_transport_method=does not have to be specified. - A `copyto!
(M, q, p)andcopy(M,p)for points. - A
get_basisfor theDefaultOrthonormalBasison $\mathcal M$ expandlog(on $\mathcal M$)- A
DiagonalizingOrthonormalBasisto compute the differentials of the exponential and logarithmic map - Tangent vectors storing the social and cognitive vectors are initialized calling
zero_vector(M,p).
Literature
- [DL21]
- W. Diepeveen and J. Lellmann. An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising. SIAM Journal on Imaging Sciences 14, 1565–1600 (2021), arXiv:2102.10309.