Difference of convex

Difference of convex algorithm

Manopt.difference_of_convex_algorithmFunction
difference_of_convex_algorithm(M, f, g, ∂h, p=rand(M); kwargs...)
difference_of_convex_algorithm(M, mdco, p; kwargs...)

Compute the difference of convex algorithm [BFSS23] to minimize

\[ \operatorname*{arg\,min}_{p∈\mathcal M}\ g(p) - h(p)\]

where you need to provide $f(p) = g(p) - h(p)$, $g$ and the subdifferential $∂h$ of $h$.

This algorithm performs the following steps given a start point p= $p^{(0)}$. Then repeat for $k=0,1,\ldots$

  1. Take $X^{(k)} ∈ ∂h(p^{(k)})$
  2. Set the next iterate to the solution of the subproblem

\[ p^{(k+1)} ∈ \operatorname*{argmin}_{q ∈ \mathcal M} g(q) - ⟨X^{(k)}, \log_{p^{(k)}}q⟩\]

until the stopping_criterion is fulfilled.

Optional parameters

  • evaluation (AllocatingEvaluation) specify whether the gradient works by allocation (default) form grad_f(M, p) or InplaceEvaluation form grad_f!(M, X, x)
  • gradient (nothing) specify $\operatorname{grad} f$, for debug / analysis or enhancing stopping_criterion=
  • grad_g (nothing) specify the gradient of g. If specified, a subsolver is automatically set up.
  • initial_vector (zero_vector(M, p)) initialise the inner tangent vector to store the subgradient result.
  • stopping_criterion (StopAfterIteration(200) |StopWhenChangeLess(1e-8)) a StoppingCriterion for the algorithm. This includes a StopWhenGradientNormLess(1e-8), when a gradient is provided.

if you specify the ManifoldDifferenceOfConvexObjective mdco, additionally

  • g - (nothing) specify the function g If specified, a subsolver is automatically set up.

While there are several parameters for a sub solver, the easiest is to provide the function grad_g=, such that together with the mandatory function g a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like

difference_of_convex_algorithm(M, f, g, grad_h, p; grad_g=grad_g)

Optional parameters for the sub problem

  • sub_cost (LinearizedDCCost(g, p, initial_vector)) a cost to be used within the default sub_problem Use this if you have a more efficient version than the default that is built using g from before.
  • sub_grad (LinearizedDCGrad(grad_g, p, initial_vector; evaluation=evaluation) gradient to be used within the default sub_problem. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.
  • sub_hess (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver, see sub_state needs
  • sub_kwargs ((;)) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.
  • sub_objective (a gradient or Hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)
  • sub_problem (DefaultManoptProblem(M, sub_objective) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.
  • sub_state (TrustRegionsState by default, requires sub_hessian to be provided; decorated with sub_kwargs). Choose the solver by specifying a solver state to solve the sub_problem if the sub_problem if a function (a closed form solution), this is set to evaluation and can be changed to the evaluation type of the closed form solution accordingly.
  • sub_stopping_criterion (StopAfterIteration(300) |StopWhenStepsizeLess(1e-9) |StopWhenGradientNormLess(1e-9)) a stopping criterion used withing the default sub_state=
  • sub_stepsize (ArmijoLinesearch(M)) specify a step size used within the sub_state,

all others are passed on to decorate the inner DifferenceOfConvexState.

Output

the obtained (approximate) minimizer $p^*$, see get_solver_return for details

source

Difference of convex proximal point

Manopt.difference_of_convex_proximal_pointFunction
difference_of_convex_proximal_point(M, grad_h, p=rand(M); kwargs...)
difference_of_convex_proximal_point(M, mdcpo, p=rand(M); kwargs...)

Compute the difference of convex proximal point algorithm [SO15] to minimize

\[ \operatorname*{arg\,min}_{p∈\mathcal M} g(p) - h(p)\]

where you have to provide the (sub) gradient $∂h$ of $h$ and either

  • the proximal map $\operatorname{prox}_{\lambda g}$ of g as a function prox_g(M, λ, p) or prox_g(M, q, λ, p)
  • the functions g and grad_g to compute the proximal map using a sub solver
  • your own sub-solver, see optional keywords below

This algorithm performs the following steps given a start point p= $p^{(0)}$. Then repeat for $k=0,1,\ldots$

  1. $X^{(k)} ∈ \operatorname{grad} h(p^{(k)})$
  2. $q^{(k)} = \operatorname{retr}_{p^{(k)}}(λ_kX^{(k)})$
  3. $r^{(k)} = \operatorname{prox}_{λ_kg}(q^{(k)})$
  4. $X^{(k)} = \operatorname{retr}^{-1}_{p^{(k)}}(r^{(k)})$
  5. Compute a stepsize $s_k$ and
  6. set $p^{(k+1)} = \operatorname{retr}_{p^{(k)}}(s_kX^{(k)})$.

until the stopping_criterion is fulfilled. See [ACOO20] for more details on the modified variant, where steps 4-6 are slightly changed, since here the classical proximal point method for DC functions is obtained for $s_k = 1$ and one can hence employ usual line search method.

Optional parameters

  • λ: ( i -> 1/2 ) a function returning the sequence of prox parameters λi
  • evaluation: (AllocatingEvaluation) specify whether the gradient works by allocation (default) form gradF(M, x) or InplaceEvaluation in place of the form gradF!(M, X, x).
  • cost: (nothing) provide the cost f, for debug reasons / analysis the default sub_problem. Use this if you have a more efficient version than using g from before.
  • gradient: (nothing) specify $\operatorname{grad} f$, for debug / analysis or enhancing the stopping_criterion
  • prox_g: (nothing) specify a proximal map for the sub problem or both of the following
  • g: (nothing) specify the function g.
  • grad_g: (nothing) specify the gradient of g. If both gand grad_g are specified, a subsolver is automatically set up.
  • inverse_retraction_method: (default_inverse_retraction_method(M)) an inverse retraction method to use (see step 4).
  • retraction_method: (default_retraction_method(M)) a retraction to use (see step 2)
  • stepsize: (ConstantStepsize(M)) specify a Stepsize to run the modified algorithm (experimental.) functor.
  • stopping_criterion: (StopAfterIteration(200) |StopWhenChangeLess(1e-8)) a StoppingCriterion for the algorithm, also includes a StopWhenGradientNormLess(1e-8), when a gradient is provided.

While there are several parameters for a sub solver, the easiest is to provide the function g and grad_g, such that together with the mandatory function g a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like

difference_of_convex_proximal_point(M, grad_h, p0; g=g, grad_g=grad_g)

Optional parameters for the sub problem

  • sub_cost: (ProximalDCCost(g, copy(M, p), λ(1))) cost to be used within the default sub_problem that is initialized as soon as g is provided.
  • sub_grad: (ProximalDCGrad(grad_g, copy(M, p), λ(1); evaluation=evaluation) gradient to be used within the default sub_problem, that is initialized as soon as grad_g is provided. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.
  • sub_hess: (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver, see sub_state needs
  • sub_kwargs: ((;)) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.
  • sub_objective: (a gradient or Hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)
  • sub_problem: (DefaultManoptProblem(M, sub_objective) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.
  • sub_state: (TrustRegionsState). requires the sub_Hessian to be provided, decorated withsubkwargs) choose the solver by specifying a solver state to solve thesubproblem`
  • sub_stopping_criterion: (StopAfterIteration(300) |StopWhenStepsizeLess(1e-9) |StopWhenGradientNormLess(1e-9)) a stopping criterion used withing the default sub_state=

all others are passed on to decorate the inner DifferenceOfConvexProximalState.

Output

the obtained (approximate) minimizer $p^*$, see get_solver_return for details

source
Manopt.difference_of_convex_proximal_point!Function
difference_of_convex_proximal_point!(M, grad_h, p; cost=nothing, kwargs...)
difference_of_convex_proximal_point!(M, mdcpo, p; cost=nothing, kwargs...)
difference_of_convex_proximal_point!(M, mdcpo, prox_g, p; cost=nothing, kwargs...)

Compute the difference of convex algorithm to minimize

\[ \operatorname*{arg\,min}_{p∈\mathcal M} g(p) - h(p)\]

where you have to provide the proximal map of g and the gradient of h.

The computation is done in-place of p.

For all further details, especially the keyword arguments, see difference_of_convex_proximal_point.

source

Solver states

Manopt.DifferenceOfConvexStateType
DifferenceOfConvexState{Pr,St,P,T,SC<:StoppingCriterion} <:
           AbstractManoptSolverState

A struct to store the current state of the [difference_of_convex_algorithm])(@ref). It comes in two forms, depending on the realisation of the subproblem.

Fields

  • p the current iterate, a point on the manifold
  • X the current subgradient, a tangent vector to p.
  • sub_problem problem for the subsolver
  • sub_state state of the subproblem
  • stop a functor inheriting from StoppingCriterion indicating when to stop.

For the sub task, a method to solve

\[ \operatorname*{argmin}_{q∈\mathcal M}\ g(p) - ⟨X, \log_p q⟩\]

is needed. Besides a problem and options, one can also provide a function and an AbstractEvaluationType, respectively, to indicate a closed form solution for the sub task.

Constructors

DifferenceOfConvexState(M, p, sub_problem, sub_state; kwargs...)
DifferenceOfConvexState(M, p, sub_solver; evaluation=InplaceEvaluation(), kwargs...)

Generate the state either using a solver from Manopt, given by an AbstractManoptProblem sub_problem and an AbstractManoptSolverState sub_state, or a closed form solution sub_solver for the sub-problem the function expected to be of the form (M, p, X) -> q or (M, q, p, X) -> q, where by default its AbstractEvaluationType evaluation is in-place of q. Here the elements passed are the current iterate p and the subgradient X of h can be passed to that function.

further keyword arguments

  • initial_vector=zero_vector (zero_vectoir(M,p)) how to initialize the inner gradient tangent vector
  • stopping_criterion a StopAfterIteration(200) a stopping criterion
source
Manopt.DifferenceOfConvexProximalStateType
DifferenceOfConvexProximalState{Type} <: Options

A struct to store the current state of the algorithm as well as the form. It comes in two forms, depending on the realisation of the subproblem.

Fields

  • inverse_retraction_method: (default_inverse_retraction_method(M)) an inverse retraction method to use within Frank Wolfe.
  • retraction_method: (default_retraction_method(M)) a type of retraction
  • p, q, r: the current iterate, the gradient step and the prox, respectively their type is set by initializing p
  • stepsize: (ConstantStepsize(1.0)) a Stepsize function to run the modified algorithm (experimental)
  • stop: (StopWhenChangeLess(1e-8)) a StoppingCriterion
  • X, Y: (zero_vector(M,p)) the current gradient and descent direction, respectively their common type is set by the keyword X

Constructor

DifferenceOfConvexProximalState(M, p; kwargs...)

Keyword arguments

  • X, retraction_method, inverse_retraction_method, stepsize for the corresponding fields
  • stoppping_criterion for the StoppingCriterion
source

The difference of convex objective

Manopt.ManifoldDifferenceOfConvexObjectiveType
ManifoldDifferenceOfConvexObjective{E} <: AbstractManifoldCostObjective{E}

Specify an objective for a difference_of_convex_algorithm.

The objective $f: \mathcal M → ℝ$ is given as

\[ f(p) = g(p) - h(p)\]

where both $g$ and $h$ are convex, lower semicontinuous and proper. Furthermore the subdifferential $∂h$ of $h$ is required.

Fields

  • cost: an implementation of $f(p) = g(p)-h(p)$ as a function f(M,p).
  • ∂h!!: a deterministic version of $∂h: \mathcal M → T\mathcal M$, in the sense that calling ∂h(M, p) returns a subgradient of $h$ at p and if there is more than one, it returns a deterministic choice.

Note that the subdifferential might be given in two possible signatures

source

as well as for the corresponding sub problem

Manopt.LinearizedDCCostType
LinearizedDCCost

A functor (M,q) → ℝ to represent the inner problem of a ManifoldDifferenceOfConvexObjective. This is a cost function of the form

\[ F_{p_k,X_k}(p) = g(p) - ⟨X_k, \log_{p_k}p⟩\]

for a point p_k and a tangent vector X_k at p_k (for example outer iterates) that are stored within this functor as well.

Fields

  • g a function
  • pk a point on a manifold
  • Xk a tangent vector at pk

Both interim values can be set using set_manopt_parameter!(::LinearizedDCCost, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCCost, ::Val{:X}, X), respectively.

Constructor

LinearizedDCCost(g, p, X)
source
Manopt.LinearizedDCGradType
LinearizedDCGrad

A functor (M,X,p) → ℝ to represent the gradient of the inner problem of a ManifoldDifferenceOfConvexObjective. This is a gradient function of the form

\[ F_{p_k,X_k}(p) = g(p) - ⟨X_k, \log_{p_k}p⟩\]

its gradient is given by using $F=F_1(F_2(p))$, where $F_1(X) = ⟨X_k,X⟩$ and $F_2(p) = \log_{p_k}p$ and the chain rule as well as the adjoint differential of the logarithmic map with respect to its argument for $D^*F_2(p)$

\[ \operatorname{grad} F(q) = \operatorname{grad} f(q) - DF_2^*(q)[X]\]

for a point pk and a tangent vector Xk at pk (the outer iterates) that are stored within this functor as well

Fields

  • grad_g!! the gradient of $g$ (see also LinearizedDCCost)
  • pk a point on a manifold
  • Xk a tangent vector at pk

Both interim values can be set using set_manopt_parameter!(::LinearizedDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCGrad, ::Val{:X}, X), respectively.

Constructor

LinearizedDCGrad(grad_g, p, X; evaluation=AllocatingEvaluation())

Where you specify whether grad_g is AllocatingEvaluation or InplaceEvaluation, while this function still provides both signatures.

source
Manopt.ManifoldDifferenceOfConvexProximalObjectiveType
ManifoldDifferenceOfConvexProximalObjective{E} <: Problem

Specify an objective difference_of_convex_proximal_point algorithm. The problem is of the form

\[ \operatorname*{argmin}_{p∈\mathcal M} g(p) - h(p)\]

where both $g$ and $h$ are convex, lower semicontinuous and proper.

Fields

  • cost: (nothing) implementation of $f(p) = g(p)-h(p)$ (optional)
  • gradient: the gradient of the cost
  • grad_h!!: a function $\operatorname{grad}h: \mathcal M → T\mathcal M$,

Note that both the gradients might be given in two possible signatures as allocating or in-place.

Constructor

ManifoldDifferenceOfConvexProximalObjective(gradh; cost=nothing, gradient=nothing)

an note that neither cost nor gradient are required for the algorithm, just for eventual debug or stopping criteria.

source

as well as for the corresponding sub problems

Manopt.ProximalDCCostType
ProximalDCCost

A functor (M, p) → ℝ to represent the inner cost function of a ManifoldDifferenceOfConvexProximalObjective. This is the cost function of the proximal map of g.

\[ F_{p_k}(p) = \frac{1}{2λ}d_{\mathcal M}(p_k,p)^2 + g(p)\]

for a point pk and a proximal parameter $λ$.

Fields

  • g - a function
  • pk - a point on a manifold
  • λ - the prox parameter

Both interim values can be set using set_manopt_parameter!(::ProximalDCCost, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCCost, ::Val{:λ}, λ), respectively.

Constructor

ProximalDCCost(g, p, λ)
source
Manopt.ProximalDCGradType
ProximalDCGrad

A functor (M,X,p) → ℝ to represent the gradient of the inner cost function of a ManifoldDifferenceOfConvexProximalObjective. This is the gradient function of the proximal map cost function of g. Based on

\[ F_{p_k}(p) = \frac{1}{2λ}d_{\mathcal M}(p_k,p)^2 + g(p)\]

it reads

\[ \operatorname{grad} F_{p_k}(p) = \operatorname{grad} g(p) - \frac{1}{λ}\log_p p_k\]

for a point pk and a proximal parameter λ.

Fields

  • grad_g - a gradient function
  • pk - a point on a manifold
  • λ - the prox parameter

Both interim values can be set using set_manopt_parameter!(::ProximalDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCGrad, ::Val{:λ}, λ), respectively.

Constructor

ProximalDCGrad(grad_g, pk, λ; evaluation=AllocatingEvaluation())

Where you specify whether grad_g is AllocatingEvaluation or InplaceEvaluation, while this function still always provides both signatures.

source

Helper functions

Manopt.get_subtrahend_gradientFunction
X = get_subtrahend_gradient(amp, q)
get_subtrahend_gradient!(amp, X, q)

Evaluate the (sub)gradient of the subtrahend h from within a ManifoldDifferenceOfConvexObjective amp at the point q (in place of X).

The evaluation is done in place of X for the !-variant. The T=AllocatingEvaluation problem might still allocate memory within. When the non-mutating variant is called with a T=InplaceEvaluation memory for the result is allocated.

source
X = get_subtrahend_gradient(M::AbstractManifold, dcpo::ManifoldDifferenceOfConvexProximalObjective, p)
get_subtrahend_gradient!(M::AbstractManifold, X, dcpo::ManifoldDifferenceOfConvexProximalObjective, p)

Evaluate the gradient of the subtrahend $h$ from within a ManifoldDifferenceOfConvexProximalObjectivePat the pointp` (in place of X).

source

Technical details

The difference_of_convex_algorithm and difference_of_convex_proximal_point solver requires the following functions of a manifold to be available

  • A retract!(M, q, p, X); it is recommended to set the default_retraction_method to a favourite retraction. If this default is set, a retraction_method= or retraction_method_dual= (for $\mathcal N$) does not have to be specified.
  • An inverse_retract!(M, X, p, q); it is recommended to set the default_inverse_retraction_method to a favourite retraction. If this default is set, a inverse_retraction_method= or inverse_retraction_method_dual= (for $\mathcal N$) does not have to be specified.

By default, one of the stopping criteria is StopWhenChangeLess, which either requires

  • A retract!(M, q, p, X); it is recommended to set the default_retraction_method to a favourite retraction. If this default is set, a retraction_method= or retraction_method_dual= (for $\mathcal N$) does not have to be specified.
  • An inverse_retract!(M, X, p, q); it is recommended to set the default_inverse_retraction_method to a favourite retraction. If this default is set, a inverse_retraction_method= or inverse_retraction_method_dual= (for $\mathcal N$) does not have to be specified or the distance(M, p, q) for said default inverse retraction.
  • A `copyto!(M, q, p) and copy(M,p) for points.
  • By default the tangent vector storing the gradient is initialized calling zero_vector(M,p).
  • everything the subsolver requires, which by default is the trust_regions or if you do not provide a Hessian gradient_descent.

Literature

[ACOO20]
Y. T. Almeida, J. X. Cruz Neto, P. R. Oliveira and J. C. Oliveira Souza. A modified proximal point method for DC functions on Hadamard manifolds. Computational Optimization and Applications 76, 649–673 (2020).
[BFSS23]
R. Bergmann, O. P. Ferreira, E. M. Santos and J. C. Souza. The difference of convex algorithm on Hadamard manifolds, arXiv preprint (2023).
[SO15]
J. C. Souza and P. R. Oliveira. A proximal point algorithm for DC fuctions on Hadamard manifolds. Journal of Global Optimization 63, 797–810 (2015).