# Difference of convex

## Difference of convex algorithm

Manopt.difference_of_convex_algorithm – Function
difference_of_convex_algorithm(M, f, g, ∂h, p=rand(M); kwargs...)
difference_of_convex_algorithm(M, mdco, p; kwargs...)

Compute the difference of convex algorithm [BFSS23] to minimize

$$\operatorname*{arg\,min}_{p∈\mathcal M}\ g(p) - h(p)$$

where you need to provide $f(p) = g(p) - h(p)$, $g$ and the subdifferential $∂h$ of $h$.

This algorithm performs the following steps given a start point $p^{(0)}$. Then repeat for $k=0,1,\ldots$

1. Take $X^{(k)} ∈ ∂h(p^{(k)})$
2. Set the next iterate to the solution of the subproblem
$$p^{(k+1)} \in \operatorname*{argmin}_{q\in \mathcal M} g(q) - ⟨X^{(k)}, \log_{p^{(k)}}q⟩$$

until the stopping_criterion is fulfilled.
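As a minimal illustration of the two steps above (a plain-Julia sketch, not the Manopt.jl API), consider the Euclidean special case $\mathcal M = ℝ$, where $\log_p q = q - p$ and the subproblem even has a closed form solution. All names here are hypothetical.

```julia
# DCA sketch on M = ℝ: minimize f(x) = g(x) - h(x) with
#   g(x) = x^2 (convex) and h(x) = |x| (convex),
# so f(x) = x^2 - |x| has the minimizers x = ±1/2.
g(x) = x^2
∂h(x) = x == 0 ? 1.0 : sign(x)        # a deterministic subgradient choice

function dca(p; maxiter=100, tol=1e-9)
    for k in 1:maxiter
        X = ∂h(p)                      # step 1: take X⁽ᵏ⁾ ∈ ∂h(p⁽ᵏ⁾)
        # step 2: argmin_q g(q) - X*(q - p) = q^2 - X*q + X*p
        # has the closed-form solution q = X/2 in this toy example
        q = X / 2
        abs(q - p) < tol && return q   # stop once the iterates stagnate
        p = q
    end
    return p
end

dca(2.0)   # → 0.5
```

In general the subproblem has no closed form and is handed to a subsolver, which is what the keyword arguments below configure.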

Optional parameters

If you specify the ManifoldDifferenceOfConvexObjective mdco, additionally:

• g – (nothing) specify the function g. If specified, a subsolver is automatically set up.

While there are several parameters for a sub solver, the easiest is to provide the function grad_g=, such that together with the mandatory function g a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like

difference_of_convex_algorithm(M, f, g, grad_h, p; grad_g=grad_g)

Optional parameters for the sub problem

• sub_cost – (LinearizedDCCost(g, p, initial_vector)) a cost to be used within the default sub_problem. Use this if you have a more efficient version than the default that is built using g from above.
• sub_grad – (LinearizedDCGrad(grad_g, p, initial_vector; evaluation=evaluation)) gradient to be used within the default sub_problem. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.
• sub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver (see sub_state) needs.
• sub_kwargs – ([]) pass keyword arguments to the sub_state, in the form of a Dict(:kwname=>value), unless you set the sub_state directly.
• sub_objective – (a gradient or Hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user).
• sub_problem – (DefaultManoptProblem(M, sub_objective)) specify a Manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.
• sub_state – (TrustRegionsState by default, requires sub_hess to be provided; decorated with sub_kwargs) choose the solver by specifying a solver state to solve the sub_problem. If the sub_problem is a function (i.e. a closed form solution), this is set to evaluation and can be changed to the evaluation type of the closed form solution accordingly.
• sub_stopping_criterion – (StopAfterIteration(300) | StopWhenStepsizeLess(1e-9) | StopWhenGradientNormLess(1e-9)) a stopping criterion used within the default sub_state=.
• sub_stepsize – (ArmijoLinesearch(M)) specify a step size used within the sub_state.

...all others are passed on to decorate the inner DifferenceOfConvexState.

Output

the obtained (approximate) minimizer $p^*$, see get_solver_return for details

source
Manopt.difference_of_convex_algorithm! – Function
difference_of_convex_algorithm!(M, f, g, ∂h, p; kwargs...)
difference_of_convex_algorithm!(M, mdco, p; kwargs...)

Run the difference of convex algorithm and perform the steps in place of p. See difference_of_convex_algorithm for more details.

If you specify the ManifoldDifferenceOfConvexObjective mdco, then g is a keyword argument.

source

## Difference of convex proximal point

Manopt.difference_of_convex_proximal_point – Function
difference_of_convex_proximal_point(M, grad_h, p=rand(M); kwargs...)
difference_of_convex_proximal_point(M, mdcpo, p=rand(M); kwargs...)

Compute the difference of convex proximal point algorithm [SO15] to minimize

$$\operatorname*{arg\,min}_{p∈\mathcal M} g(p) - h(p)$$

where you have to provide the (sub) gradient $∂h$ of $h$ and either

• the proximal map $\operatorname{prox}_{\lambda g}$ of g as a function prox_g(M, λ, p) or prox_g(M, q, λ, p)
• the functions g and grad_g to compute the proximal map using a sub solver
• your own sub-solver, see optional keywords below

This algorithm performs the following steps given a start point $p^{(0)}$. Then repeat for $k=0,1,\ldots$

1. $X^{(k)} ∈ \operatorname{grad} h(p^{(k)})$
2. $q^{(k)} = \operatorname{retr}_{p^{(k)}}(λ_kX^{(k)})$
3. $r^{(k)} = \operatorname{prox}_{λ_kg}(q^{(k)})$
4. $X^{(k)} = \operatorname{retr}^{-1}_{p^{(k)}}(r^{(k)})$
5. Compute a stepsize $s_k$ and
6. set $p^{(k+1)} = \operatorname{retr}_{p^{(k)}}(s_kX^{(k)})$.

until the stopping_criterion is fulfilled. See [ACOO20] for more details on the modified variant, where steps 4 to 6 are slightly changed, since here one obtains the classical proximal point method for DC functions for $s_k = 1$ and can employ linesearches similar to other solvers.
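The six steps above can be sketched in plain Julia for the Euclidean special case $\mathcal M = ℝ$, where $\operatorname{retr}_p(X) = p + X$ and $\operatorname{retr}^{-1}_p(r) = r - p$. This is a hypothetical toy example with a closed-form prox, not the Manopt.jl API.

```julia
# DCPPA sketch on M = ℝ: minimize f(x) = g(x) - h(x) with
#   g(x) = x^2 and h(x) = x^2 / 2,
# so f(x) = x^2 / 2 with minimizer x = 0.
grad_h(x) = x
prox_g(λ, q) = q / (1 + 2λ)             # closed-form prox of g(x) = x^2

function dcppa(p; λ=k -> 1/2, maxiter=200, tol=1e-10)
    for k in 1:maxiter
        X = grad_h(p)                    # step 1: gradient of h
        q = p + λ(k) * X                 # step 2: retraction (gradient step)
        r = prox_g(λ(k), q)              # step 3: proximal step on g
        X = r - p                        # step 4: inverse retraction
        s = 1.0                          # step 5: classical choice s_k = 1
        p = p + s * X                    # step 6: update the iterate
        abs(X) < tol && break            # stop when the update stagnates
    end
    return p
end

dcppa(2.0)   # converges towards 0.0
```

With $s_k = 1$ this reduces to the classical DC proximal point iteration; a linesearch would replace the constant choice of s in step 5.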

Optional parameters

• λ – (i -> 1/2) a function returning the sequence of prox parameters $λ_i$.
• evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default, of the form gradF(M, x)) or as an InplaceEvaluation (in place, of the form gradF!(M, X, x)).
• cost – (nothing) provide the cost f, e.g. for debug reasons or analysis.
• gradient – (nothing) specify $\operatorname{grad} f$, for debug / analysis or enhancing the stopping_criterion.
• prox_g – (nothing) specify a proximal map for the sub problem or both of the following:
• g – (nothing) specify the function g.
• grad_g – (nothing) specify the gradient of g. If both g and grad_g are specified, a subsolver is automatically set up.
• inverse_retraction_method – (default_inverse_retraction_method(M)) an inverse retraction method to use (see step 4).
• retraction_method – (default_retraction_method(M)) a retraction to use (see step 2).
• stepsize – (ConstantStepsize(M)) specify a Stepsize functor to run the modified algorithm (experimental).
• stopping_criterion – (StopAfterIteration(200) | StopWhenChangeLess(1e-8)) a StoppingCriterion for the algorithm. This includes a StopWhenGradientNormLess(1e-8) when a gradient is provided.

While there are several parameters for a subsolver, the easiest is to provide the functions g and grad_g, such that a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like

difference_of_convex_proximal_point(M, grad_h, p0; g=g, grad_g=grad_g)

Optional parameters for the sub problem

• sub_cost – (ProximalDCCost(g, copy(M, p), λ(1))) cost to be used within the default sub_problem that is initialized as soon as g is provided.
• sub_grad – (ProximalDCGrad(grad_g, copy(M, p), λ(1); evaluation=evaluation)) gradient to be used within the default sub_problem, that is initialized as soon as grad_g is provided. You can specify your own by overwriting this keyword.
• sub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver (see sub_state) needs.
• sub_kwargs – ([]) pass keyword arguments to the sub_state, in the form of a Dict(:kwname=>value), unless you set the sub_state directly.
• sub_objective – (a gradient or Hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user).
• sub_problem – (DefaultManoptProblem(M, sub_objective)) specify a Manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.
• sub_state – (TrustRegionsState by default, requires sub_hess to be provided; decorated with sub_kwargs) choose the solver by specifying a solver state to solve the sub_problem.
• sub_stopping_criterion – (StopAfterIteration(300) | StopWhenStepsizeLess(1e-9) | StopWhenGradientNormLess(1e-9)) a stopping criterion used within the default sub_state=.

...all others are passed on to decorate the inner DifferenceOfConvexProximalState.

Output

the obtained (approximate) minimizer $p^*$, see get_solver_return for details

source
Manopt.difference_of_convex_proximal_point! – Function
difference_of_convex_proximal_point!(M, grad_h, p; cost=nothing, kwargs...)
difference_of_convex_proximal_point!(M, mdcpo, p; cost=nothing, kwargs...)
difference_of_convex_proximal_point!(M, mdcpo, prox_g, p; cost=nothing, kwargs...)

Compute the difference of convex proximal point algorithm to minimize

$$\operatorname*{arg\,min}_{p∈\mathcal M} g(p) - h(p)$$

where you have to provide the proximal map of g and the gradient of h.

The computation is done in place of p.

For all further details, especially the keyword arguments, see difference_of_convex_proximal_point.

source

## Solver states

Manopt.DifferenceOfConvexState – Type
DifferenceOfConvexState{Pr,St,P,T,SC<:StoppingCriterion} <:
AbstractManoptSolverState

A struct to store the current state of the difference_of_convex_algorithm. It comes in two forms, depending on the realisation of the subproblem.

Fields

• p – the current iterate, i.e. a point on the manifold
• X – the current subgradient, i.e. a tangent vector to p.
• sub_problem – problem for the subsolver
• sub_state – state of the subproblem
• stop – a functor inheriting from StoppingCriterion indicating when to stop.

For the sub task, we need a method to solve

$$\operatorname*{argmin}_{q∈\mathcal M}\ g(q) - ⟨X, \log_p q⟩$$

Besides a problem and options, one can also provide a function and an AbstractEvaluationType, respectively, to indicate a closed form solution for the sub task.

Constructors

DifferenceOfConvexState(M, p, sub_problem, sub_state; kwargs...)
DifferenceOfConvexState(M, p, sub_solver; evaluation=InplaceEvaluation(), kwargs...)

Generate the state either using a solver from Manopt, given by an AbstractManoptProblem sub_problem and an AbstractManoptSolverState sub_state, or a closed form solution sub_solver for the sub-problem, where by default its AbstractEvaluationType evaluation is in-place, i.e. the function is of the form (M, p, X) -> q or (M, q, p, X) -> q, such that the current iterate p and the subgradient X of h can be passed to that function and the result is q.

Further keyword Arguments

• initial_vector – (zero_vector(M, p)) how to initialize the inner gradient tangent vector.
• stopping_criterion – (StopAfterIteration(200)) a stopping criterion.
source
Manopt.DifferenceOfConvexProximalState – Type
DifferenceOfConvexProximalState{Type} <: Options

A struct to store the current state of the algorithm as well as the form. It comes in two forms, depending on the realisation of the subproblem.

Fields

• inverse_retraction_method – (default_inverse_retraction_method(M)) an inverse retraction method to use within the algorithm (see step 4).
• retraction_method – (default_retraction_method(M)) a type of retraction.
• p, q, r – the current iterate, the gradient step, and the prox, respectively; their type is set by initializing p.
• stepsize – (ConstantStepsize(1.0)) a Stepsize function to run the modified algorithm (experimental).
• stop – (StopWhenChangeLess(1e-8)) a StoppingCriterion.
• X, Y – (zero_vector(M,p)) the current gradient and descent direction, respectively; their common type is set by the keyword X.

Constructor

DifferenceOfConvexProximalState(M, p; kwargs...)

Keyword arguments

• X, retraction_method, inverse_retraction_method, stepsize for the fields above
• stopping_criterion for the StoppingCriterion
source

## The difference of convex objective

Manopt.ManifoldDifferenceOfConvexObjective – Type
ManifoldDifferenceOfConvexObjective{E} <: AbstractManifoldCostObjective{E}

Specify an objective for a difference_of_convex_algorithm.

The objective $f: \mathcal M \to ℝ$ is given as

$$f(p) = g(p) - h(p)$$

where both $g$ and $h$ are convex, lsc. and proper. Furthermore we assume that the subdifferential $∂h$ of $h$ is given.

Fields

• cost – an implementation of $f(p) = g(p)-h(p)$ as a function f(M,p).
• ∂h!! – a deterministic version of $∂h: \mathcal M → T\mathcal M$, i.e. calling ∂h(M, p) returns a subgradient of $h$ at p and if there is more than one, it returns a deterministic choice.

Note that the subdifferential might be given in two possible signatures, as an allocating function ∂h(M, p) or as an in-place function ∂h!(M, X, p).

source

as well as for the corresponding sub problem

Manopt.LinearizedDCCost – Type
LinearizedDCCost

A functor (M,q) → ℝ to represent the inner problem of a ManifoldDifferenceOfConvexObjective, i.e. a cost function of the form

$$F_{p_k,X_k}(p) = g(p) - ⟨X_k, \log_{p_k}p⟩$$

for a point p_k and a tangent vector X_k at p_k (e.g. outer iterates) that are stored within this functor as well.

Fields

• g a function
• pk a point on a manifold
• Xk a tangent vector at pk

Both interim values can be set using set_manopt_parameter!(::LinearizedDCCost, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCCost, ::Val{:X}, X), respectively.

Constructor

LinearizedDCCost(g, p, X)
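As a minimal illustration of this functor pattern (plain Julia, hypothetical names, not Manopt's implementation), the linearized cost in the Euclidean case $\mathcal M = ℝ^n$, where $\log_{p_k}p = p - p_k$, can be written as:

```julia
# Euclidean sketch of the linearized subproblem cost: for M = ℝⁿ,
# log_{pk}(p) = p - pk and ⟨·,·⟩ is the Euclidean inner product.
struct EuclideanLinearizedCost{F}      # hypothetical name
    g::F                               # the convex function g
    pk::Vector{Float64}                # outer iterate p_k
    Xk::Vector{Float64}                # subgradient X_k ∈ ∂h(p_k)
end
(c::EuclideanLinearizedCost)(p) = c.g(p) - sum(c.Xk .* (p .- c.pk))

g(p) = sum(abs2, p)
c = EuclideanLinearizedCost(g, [0.0, 0.0], [1.0, 1.0])
c([1.0, 2.0])   # → 5 - 3 = 2.0
```

Storing $p_k$ and $X_k$ in the functor is what lets the outer solver update the subproblem between iterations via the set_manopt_parameter! calls above.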
source
Manopt.LinearizedDCGrad – Type
LinearizedDCGrad

A functor (M, X, p) to represent the gradient of the inner problem of a ManifoldDifferenceOfConvexObjective, i.e. for a cost function of the form

$$F_{p_k,X_k}(p) = g(p) - ⟨X_k, \log_{p_k}p⟩$$

its gradient is obtained by writing the second term as $F_1(F_2(p))$, where $F_1(X) = ⟨X_k,X⟩$ and $F_2(p) = \log_{p_k}p$, and applying the chain rule together with the adjoint differential of the logarithmic map with respect to its argument for $D^*F_2(p)$:

$$\operatorname{grad} F(q) = \operatorname{grad} g(q) - D^*F_2(q)[X_k]$$

for a point pk and a tangent vector Xk at pk (the outer iterates) that are stored within this functor as well.

Fields

• grad_g!! the gradient of $g$ (see also LinearizedDCCost)
• pk a point on a manifold
• Xk a tangent vector at pk

Both interim values can be set using set_manopt_parameter!(::LinearizedDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCGrad, ::Val{:X}, X), respectively.

Constructor

LinearizedDCGrad(grad_g, p, X; evaluation=AllocatingEvaluation())

Here you specify whether grad_g is given as an AllocatingEvaluation or an InplaceEvaluation; the functor itself provides both signatures either way.

source
Manopt.ManifoldDifferenceOfConvexProximalObjective – Type
ManifoldDifferenceOfConvexProximalObjective{E} <: Problem

Specify an objective for the difference_of_convex_proximal_point algorithm. The problem is of the form

$$\operatorname*{argmin}_{p\in \mathcal M} g(p) - h(p)$$

where both $g$ and $h$ are convex, lsc. and proper.

Fields

• cost – (nothing) implementation of $f(p) = g(p)-h(p)$ (optional).
• gradient – the gradient of the cost.
• grad_h!! – a function $\operatorname{grad}h: \mathcal M → T\mathcal M$.

Note that both gradients might be given in two possible signatures, as allocating or in-place variants.

Constructor

ManifoldDifferenceOfConvexProximalObjective(gradh; cost=nothing, gradient=nothing)

Note that neither cost nor gradient is required for the algorithm; they are only used for eventual debug output or stopping criteria.

source

as well as for the corresponding sub problems

Manopt.ProximalDCCost – Type
ProximalDCCost

A functor (M, p) → ℝ to represent the inner cost function of a ManifoldDifferenceOfConvexProximalObjective, i.e. the cost function of the proximal map of g.

$$F_{p_k}(p) = \frac{1}{2λ}d_{\mathcal M}(p_k,p)^2 + g(p)$$

for a point pk and a proximal parameter $λ$.

Fields

• g - a function
• pk - a point on a manifold
• λ - the prox parameter

Both interim values can be set using set_manopt_parameter!(::ProximalDCCost, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCCost, ::Val{:λ}, λ), respectively.

Constructor

ProximalDCCost(g, p, λ)
source
Manopt.ProximalDCGrad – Type
ProximalDCGrad

A functor (M, X, p) to represent the gradient of the inner cost function of a ManifoldDifferenceOfConvexProximalObjective, i.e. the gradient of the cost of the proximal map of g,

$$F_{p_k}(p) = \frac{1}{2λ}d_{\mathcal M}(p_k,p)^2 + g(p)$$

whose gradient reads

$$\operatorname{grad} F_{p_k}(p) = \operatorname{grad} g(p) - \frac{1}{λ}\log_p p_k$$

for a point pk and a proximal parameter λ.
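The cost and gradient above can be checked against each other numerically in the Euclidean special case $\mathcal M = ℝ$, where $\log_p p_k = p_k - p$. This is a plain-Julia sanity check with hypothetical names, not part of Manopt:

```julia
# For M = ℝ the prox cost is F(p) = (pk - p)^2 / (2λ) + g(p) and, since
# log_p(pk) = pk - p, the gradient formula above becomes
#   grad F(p) = grad g(p) - (pk - p) / λ.
λ, pk = 0.5, 1.0
g(p)      = p^2
grad_g(p) = 2p
F(p)      = (pk - p)^2 / (2λ) + g(p)
gradF(p)  = grad_g(p) - (pk - p) / λ

# central finite difference as an independent check of the formula
p, ε = 0.3, 1e-6
fd = (F(p + ε) - F(p - ε)) / (2ε)
abs(fd - gradF(p))   # ≈ 0, up to discretization error
```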

Fields

• grad_g - a gradient function
• pk - a point on a manifold
• λ - the prox parameter

Both interim values can be set using set_manopt_parameter!(::ProximalDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCGrad, ::Val{:λ}, λ), respectively.

Constructor

ProximalDCGrad(grad_g, pk, λ; evaluation=AllocatingEvaluation())

Here you specify whether grad_g is given as an AllocatingEvaluation or an InplaceEvaluation; the functor itself always provides both signatures either way.

source

## Helper functions

Manopt.get_subtrahend_gradient – Function
X = get_subtrahend_gradient(amp, q)
get_subtrahend_gradient!(amp, X, q)

Evaluate the (sub)gradient of the subtrahend h from within a ManifoldDifferenceOfConvexObjective amp at the point q (in place of X).

The evaluation is done in place of X for the !-variant. For a T=AllocatingEvaluation objective, the problem might still allocate memory internally; when the non-mutating variant is called with a T=InplaceEvaluation objective, memory for the result is allocated.

source
X = get_subtrahend_gradient(M::AbstractManifold, dcpo::ManifoldDifferenceOfConvexProximalObjective, p)
get_subtrahend_gradient!(M::AbstractManifold, X, dcpo::ManifoldDifferenceOfConvexProximalObjective, p)

Evaluate the gradient of the subtrahend $h$ from within a ManifoldDifferenceOfConvexProximalObjective at the point p (in place of X).

source

## Technical details

The difference_of_convex_algorithm and difference_of_convex_proximal_point solvers require several functions of the manifold to be available, in particular the retractions, inverse retractions, and logarithmic map appearing in the steps above.

By default, one of the stopping criteria is StopWhenChangeLess, which requires either a distance or an inverse retraction on the manifold to measure the change between iterates.

## Literature

[ACOO20]
Y. T. Almeida, J. X. da Cruz Neto, P. R. Oliveira and J. C. Oliveira Souza. A modified proximal point method for DC functions on Hadamard manifolds. Computational Optimization and Applications 76, 649–673 (2020).
[BFSS23]
R. Bergmann, O. P. Ferreira, E. M. Santos and J. C. Souza. The difference of convex algorithm on Hadamard manifolds, arXiv preprint (2023).
[SO15]
J. C. Souza and P. R. Oliveira. A proximal point algorithm for DC functions on Hadamard manifolds. Journal of Global Optimization 63, 797–810 (2015).