Proximal Gradient Method

Manopt.proximal_gradient_method - Function
proximal_gradient_method(M, f, g, grad_g, p=rand(M); prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method(M, mpgo::ManifoldProximalGradientObjective, p=rand(M); kwargs...)
proximal_gradient_method!(M, f, g, grad_g, p; prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method!(M, mpgo::ManifoldProximalGradientObjective, p; kwargs...)

Perform the proximal gradient method as introduced in [BJJP25].

Given the minimization problem

\[\operatorname*{arg\,min}_{p∈\mathcal M} f(p), \quad \text{ where } \quad f(p) = g(p) + h(p).\]

this method performs the (intrinsic) proximal gradient algorithm.

Let $λ_k ≥ 0$ be a sequence of (proximal) parameters, initialize $p^{(0)} = p$, and $k=0$.

Then perform as long as the stopping criterion is not fulfilled

\[p^{(k+1)} = \operatorname{prox}_{λ_k h}\Bigl( \operatorname{retr}_{a^{(k)}}\bigl(-λ_k \operatorname{grad} g(a^{(k)})\bigr) \Bigr),\]

where $a^{(k)}=p^{(k)}$ by default, but this allows one to introduce some acceleration before computing the gradient step.

Input

  • M::AbstractManifold: a Riemannian manifold $\mathcal M$
  • f: a cost function $f: \mathcal M→ ℝ$ implemented as (M, p) -> v, the total cost function f = g + h
  • g: the smooth part of the cost function
  • grad_g: a gradient (M,p) -> X or (M, X, p) -> X of the smooth part $g$ of the problem

Keyword Arguments

  • acceleration=(p, s, k) -> (copyto!(get_manifold(p), s.a, s.p); s): a function (problem, state, k) -> state to compute an acceleration that is performed before the gradient step. The default copies the current point to the acceleration point, that is, no acceleration is performed.
  • evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
  • prox_nonsmooth: a proximal map (M,λ,p) -> q or (M, q, λ, p) -> q for the (possibly) nonsmooth part $h$ of $f$
  • p: a point on the manifold $\mathcal M$
  • stepsize=default_stepsize(M, ProximalGradientMethodState): a functor inheriting from Stepsize to determine a step size; by default a ProximalGradientMethodBacktracking is used.
  • retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
  • stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
  • sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
  • sub_state=evaluation: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if sub_problem is nothing
  • X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ storing the gradient at the current iterate

All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.
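
For example, a minimal runnable sketch on the sphere: the nonsmooth part is chosen as zero, so its proximal map is the identity; all function definitions here are illustrative and not part of Manopt.jl.

using Manopt, Manifolds

M = Sphere(2)
q = [0.0, 0.0, 1.0]
g(M, p) = distance(M, p, q)^2       # smooth part: squared Riemannian distance to q
grad_g(M, p) = -2 * log(M, p, q)    # its Riemannian gradient
h(M, p) = 0.0                       # nonsmooth part: zero, so ...
prox_h(M, λ, p) = p                 # ... its proximal map is the identity
f(M, p) = g(M, p) + h(M, p)         # total cost f = g + h

p0 = [1.0, 0.0, 0.0]
p_star = proximal_gradient_method(M, f, g, grad_g, p0; prox_nonsmooth=prox_h)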

Output

The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.

source
Manopt.proximal_gradient_method! - Function

The in-place variant shares the signatures and documentation above; see proximal_gradient_method for the full description of input and keyword arguments.

source
Manopt.ProximalGradientMethodAcceleration - Type
ProximalGradientMethodAcceleration{P, T, F}

Compute an acceleration step

\[a^{(k)} = \operatorname{retr}_{p^{(k)}}\bigl( -β_k\operatorname{retr}^{-1}_{p^{(k)}}(p) \bigr)\]

where $p^{(k)}$ is the current iterate from the ProximalGradientMethodState's field p and the result is stored in state.a. The field p of this struct stores the last iterate.

The retraction and its inverse are taken from the state.

Fields

  • p - the last iterate
  • β - acceleration parameter function or value
  • inverse_retraction_method - method for inverse retraction
  • X - tangent vector for computations

Constructor

ProximalGradientMethodAcceleration(M::AbstractManifold; kwargs...)

Generate the acceleration functor for a given manifold M with initial iterate p.

Input

  • M::AbstractManifold: a Riemannian manifold $\mathcal M$

Keyword arguments

  • β = k -> (k-1)/(k+2) - acceleration parameter function or value
  • inverse_retraction_method - method for inverse retraction
  • p - initial point
  • X - initial tangent vector
source
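
A hedged sketch of using this acceleration with the solver; f, g, grad_g, and prox_h are assumed to be defined as in the usage example above.

using Manopt, Manifolds

M = Sphere(2)
p0 = [1.0, 0.0, 0.0]
# Nesterov-style extrapolation with the default parameter β_k = (k-1)/(k+2)
accel = ProximalGradientMethodAcceleration(M; p=copy(M, p0), β=k -> (k - 1) / (k + 2))
p_star = proximal_gradient_method(M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h, acceleration=accel)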

State

Manopt.ProximalGradientMethodState - Type
ProximalGradientMethodState <: AbstractManoptSolverState

State for the proximal_gradient_method solver.

Fields

  • inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
  • a - point after acceleration step
  • p::P: a point on the manifold $\mathcal M$ storing the current iterate
  • q - point for storing gradient step
  • retraction_method::AbstractRetractionMethod: a retraction $\operatorname{retr}$ to use, see the section on retractions
  • X - tangent vector for storing gradient
  • stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
  • acceleration - a function (problem, state, k) -> state to compute an acceleration before the gradient step
  • stepsize - a function or Stepsize object to compute the stepsize
  • last_stepsize - stores the last computed stepsize
  • sub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
  • sub_state::Union{AbstractManoptSolverState, AbstractEvaluationType}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if sub_problem is nothing

Constructor

ProximalGradientMethodState(M::AbstractManifold; kwargs...)

Generate the state for a given manifold M with initial iterate p.

Input

  • M::AbstractManifold: a Riemannian manifold $\mathcal M$

Keyword arguments

  • stepsize=default_stepsize(M, ProximalGradientMethodState)
  • inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
  • p=rand(M): a point on the manifold $\mathcal M$ to specify the initial value
  • retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
  • acceleration=(p, s, k) -> (copyto!(get_manifold(p), s.a, s.p); s): by default no acceleration is performed
  • stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
  • sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.
  • sub_state=AllocatingEvaluation(): a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.
  • X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ to specify the representation of a tangent vector
source
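
The state is usually created internally by proximal_gradient_method, but as a sketch it can also be constructed directly, for example to inspect defaults; M and p0 are as in the examples above.

pgs = ProximalGradientMethodState(M; p=copy(M, p0),
    stopping_criterion=StopAfterIteration(50))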

Helping functions

Manopt.ProximalGradientNonsmoothSubgradient - Type
ProximalGradientNonsmoothSubgradient{F, R, P}

Stores a subgradient of the nonsmooth part h of the proximal gradient objective f = g + h, as well as the stepsize parameter $λ ∈ ℝ$.

This struct is also a functor (M, p) -> X that computes the subgradient in an allocating fashion. It is primarily used to compute a subgradient of the cost function $h(q) + \frac{1}{2λ} d^2(q, p)$ that defines the proximal map in the proximal gradient method. This reads

\[ \partial h(q) - \frac{1}{λ} \operatorname{log}_q p\]

where $p$ is the proximity point where the proximal map is evaluated, that is, the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$.

Fields

  • X::F - the subgradient of the nonsmooth part of the total objective, i.e. the part of the objective whose proximal map is sought
  • λ::R - the stepsize parameter for the proximal map
  • proximity_point::P - point where the proximal map is evaluated, i.e. the argument of the proximal map that we want to solve for

Constructor

ProximalGradientNonsmoothSubgradient(cost, λ, proximity_point)
source
Manopt.ProximalGradientNonsmoothCost - Type
ProximalGradientNonsmoothCost{F, R, P}

Stores the nonsmooth part h of the proximal gradient objective f = g + h, as well as the stepsize parameter $λ ∈ ℝ$.

This struct is also a functor (M, q) -> v that can be used as a cost function within a solver, primarily for solving the proximal map subproblem formulation in the proximal gradient method, which reads

\[ \operatorname{prox}_{λ} h(p) = \operatorname{argmin}_{q \in \mathcal M} \left( h(q) + \frac{1}{2λ} d^2(q, p) \right)\]

Hence, the functor reads

\[ (M, q) \mapsto h(q) + \frac{1}{2λ} d^2(q, p)\]

where $p$ is the proximity point at which the proximal map is evaluated, that is, the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$.

Fields

  • cost::F - the nonsmooth part h of the proximal gradient objective, i.e. the part of the objective whose proximal map is sought
  • λ::R - the stepsize parameter for the proximal map
  • proximity_point::P - point where the proximal map is evaluated, i.e. the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$ that we want to solve for

Constructor

ProximalGradientNonsmoothCost(cost, λ, proximity_point)
source
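
As a hedged sketch of how these two helpers fit together: for a fixed proximity point and parameter λ they turn the proximal map into an ordinary optimization subproblem. The nonsmooth cost h and its subgradient ∂h below are illustrative assumptions, not part of Manopt.jl.

using Manopt, Manifolds

M = Sphere(2)
p = [1.0, 0.0, 0.0]                                 # proximity point
λ = 0.5
h(M, q) = abs(q[3])                                 # an illustrative nonsmooth cost
∂h(M, q) = project(M, q, [0.0, 0.0, sign(q[3])])    # a subgradient, projected onto T_q M

subcost = ProximalGradientNonsmoothCost(h, λ, p)
subgrad = ProximalGradientNonsmoothSubgradient(∂h, λ, p)

q2 = [0.0, 1.0, 0.0]
subcost(M, q2)    # h(q2) + d²(q2, p) / (2λ)
subgrad(M, q2)    # ∂h(q2) - log_{q2}(p) / λ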

Stopping criteria

Manopt.StopWhenGradientMappingNormLess - Type
StopWhenGradientMappingNormLess <: StoppingCriterion

A stopping criterion based on the gradient mapping norm for proximal gradient methods.

Fields

  • at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
  • last_change::Real: the last change recorded in this stopping criterion
  • threshold: the threshold for the change to check (run under to stop)

Constructor

StopWhenGradientMappingNormLess(ε)

Create a stopping criterion with threshold ε for the gradient mapping for the proximal_gradient_method. That is, this criterion indicates to stop when the gradient mapping has a norm less than ε. The gradient mapping is defined as $G_λ(p) = -\frac{1}{λ}\log_p\bigl(T_λ(p)\bigr)$, where $T_λ(p)$ is the proximal mapping $\operatorname{prox}_{λ h}\bigl(\exp_p(-λ\operatorname{grad} g(p))\bigr)$.
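
For example, it can be combined with other stopping criteria in the usual way (f, g, grad_g, prox_h, and p0 as in the usage example above):

sc = StopAfterIteration(500) | StopWhenGradientMappingNormLess(1e-7)
p_star = proximal_gradient_method(M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h, stopping_criterion=sc)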

source

Stepsize

Manopt.ProximalGradientMethodBacktracking - Function
ProximalGradientMethodBacktracking(; kwargs...)
ProximalGradientMethodBacktracking(M::AbstractManifold; kwargs...)

Compute a stepsize for the proximal gradient method using a backtracking line search.

For the nonconvex case, the condition is:

\[f(p) - f(T_{λ}(p)) ≥ γλ\lVert G_{λ}(p) \rVert_{}^2\]

where $G_{λ}(p) = \frac{1}{λ}\log_p\bigl(T_{λ}(p)\bigr)$ is the gradient mapping.

For the convex case, the condition is:

\[g(T_{λ}(p)) ≤ g(p) + ⟨\operatorname{grad} g(p), \log_p T_{λ}(p)⟩ + \frac{1}{2λ} \mathrm{d}^2(p, T_{λ}(p))\]

Returns a stepsize λ that satisfies the specified condition.
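
A hedged sketch of configuring this stepsize, assuming the factory forwards the keyword arguments of ProximalGradientMethodBacktrackingStepsize documented below; the remaining functions are as in the usage example above.

st = ProximalGradientMethodBacktracking(;
    initial_stepsize=2.0, strategy=:convex, contraction_factor=0.9)
p_star = proximal_gradient_method(M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h, stepsize=st)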

Info

This function generates a ManifoldDefaultsFactory for ProximalGradientMethodBacktrackingStepsize. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding AbstractManoptSolverState.

source
Manopt.ProximalGradientMethodBacktrackingStepsize - Type
ProximalGradientMethodBacktrackingStepsize <: Stepsize

A functor for backtracking line search in proximal gradient methods.

Fields

  • initial_stepsize::T - initial step size guess
  • sufficient_decrease::T - sufficient decrease parameter (default: 0.5)
  • contraction_factor::T - step size reduction factor (default: 0.5)
  • strategy::Symbol - :nonconvex or :convex (default: :nonconvex)
  • candidate_point::P - a working point used during backtracking
  • last_stepsize::T - the last computed stepsize

Constructor

ProximalGradientMethodBacktrackingStepsize(M::AbstractManifold; kwargs...)

Keyword arguments

  • initial_stepsize=1.0: initial stepsize to try
  • stop_when_stepsize_less=1e-8: the smallest stepsize to accept; backtracking stops once the stepsize falls below this value and the last one above it is taken
  • sufficient_decrease=0.5: sufficient decrease parameter
  • contraction_factor=0.5: step size reduction factor
  • strategy=:nonconvex: backtracking strategy, either :convex or :nonconvex
source

Internal functions

Manopt.get_cost_smooth - Function
get_cost_smooth(M::AbstractManifold, objective, p)

Helper function to evaluate the smooth part g of a proximal gradient objective at the point p.

source
Manopt.default_stepsize - Method
default_stepsize(M::AbstractManifold, ::Type{<:ProximalGradientMethodState})

Returns the default proximal stepsize, which is a nonconvex backtracking strategy.

source

Literature

[BJJP25]
R. Bergmann, H. Jasa, P. John and M. Pfeffer. The Intrinsic Riemannian Proximal Gradient Method for Nonconvex Optimization, preprint (2025), arXiv:2506.09775.