Proximal Gradient Method
Manopt.proximal_gradient_method
— Function
proximal_gradient_method(M, f, g, grad_g, p=rand(M); prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method(M, mpgo::ManifoldProximalGradientObjective, p=rand(M); kwargs...)
proximal_gradient_method!(M, f, g, grad_g, p; prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method!(M, mpgo::ManifoldProximalGradientObjective, p; kwargs...)
Perform the proximal gradient method as introduced in [BJJP25].
Given the minimization problem
\[\operatorname*{arg\,min}_{p∈\mathcal M} f(p), \quad \text{ where } \quad f(p) = g(p) + h(p).\]
This method performs the (intrinsic) proximal gradient algorithm.
Let $λ_k ≥ 0$ be a sequence of (proximal) parameters, initialize $p^{(0)} = p$, and $k=0$.
Then perform as long as the stopping criterion is not fulfilled
\[p^{(k+1)} = \operatorname{prox}_{λ_k h}\Bigl( \operatorname{retr}_{a^{(k)}}\bigl(-λ_k \operatorname{grad} g(a^{(k)})\bigr) \Bigr),\]
where $a^{(k)}=p^{(k)}$ by default, though this also allows one to introduce some acceleration before computing the gradient step.
Input
M::AbstractManifold: a Riemannian manifold $\mathcal M$
f: a cost function $f: \mathcal M → ℝ$ implemented as (M, p) -> v, the total cost function f = g + h
g: the smooth part of the cost function
grad_g: a gradient (M, p) -> X or (M, X, p) -> X of the smooth part $g$ of the problem
Keyword Arguments
acceleration=(p, s, k) -> (copyto!(get_manifold(M), s.a, s.p); s): a function (problem, state, k) -> state to compute an acceleration that is performed before the gradient step; the default is to copy the current point to the acceleration point, that is, no acceleration is performed
evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating their result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
prox_nonsmooth: a proximal map (M, λ, p) -> q or (M, q, λ, p) -> q for the (possibly) nonsmooth part $h$ of $f$
p: a point on the manifold $\mathcal M$
stepsize=default_stepsize(M, ProximalGradientMethodState): a functor inheriting from Stepsize to determine a step size that by default uses a ProximalGradientMethodBacktracking
retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
sub_state=evaluation: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if the sub_problem is Nothing.
X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ storing the gradient at the current iterate
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
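A minimal usage sketch (not part of the docstring above, and assuming Manifolds.jl provides the Euclidean manifold): the total cost is split into a smooth quadratic g and the nonsmooth ℓ1 norm h, whose closed-form proximal map (soft thresholding) is supplied by hand.

using Manopt, Manifolds, LinearAlgebra

M = Euclidean(2)
data = [1.0, -2.0]
g(M, p) = 0.5 * norm(p - data)^2        # smooth part
grad_g(M, p) = p - data                 # Euclidean gradient of g
h(M, p) = norm(p, 1)                    # nonsmooth part
f(M, p) = g(M, p) + h(M, p)             # total cost f = g + h
# closed-form proximal map of the ℓ1 norm (soft thresholding) in the Euclidean case
prox_h(M, λ, p) = sign.(p) .* max.(abs.(p) .- λ, 0)

q = proximal_gradient_method(M, f, g, grad_g, [2.0, 2.0]; prox_nonsmooth=prox_h)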
Manopt.proximal_gradient_method!
— Function
proximal_gradient_method(M, f, g, grad_g, p=rand(M); prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method(M, mpgo::ManifoldProximalGradientObjective, p=rand(M); kwargs...)
proximal_gradient_method!(M, f, g, grad_g, p; prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method!(M, mpgo::ManifoldProximalGradientObjective, p; kwargs...)
Perform the proximal gradient method as introduced in [BJJP25].
Given the minimization problem
\[\operatorname*{arg\,min}_{p∈\mathcal M} f(p), \quad \text{ where } \quad f(p) = g(p) + h(p).\]
This method performs the (intrinsic) proximal gradient algorithm.
Let $λ_k ≥ 0$ be a sequence of (proximal) parameters, initialize $p^{(0)} = p$, and $k=0$.
Then perform as long as the stopping criterion is not fulfilled
\[p^{(k+1)} = \operatorname{prox}_{λ_k h}\Bigl( \operatorname{retr}_{a^{(k)}}\bigl(-λ_k \operatorname{grad} g(a^{(k)})\bigr) \Bigr),\]
where $a^{(k)}=p^{(k)}$ by default, though this also allows one to introduce some acceleration before computing the gradient step.
Input
M::AbstractManifold: a Riemannian manifold $\mathcal M$
f: a cost function $f: \mathcal M → ℝ$ implemented as (M, p) -> v, the total cost function f = g + h
g: the smooth part of the cost function
grad_g: a gradient (M, p) -> X or (M, X, p) -> X of the smooth part $g$ of the problem
Keyword Arguments
acceleration=(p, s, k) -> (copyto!(get_manifold(M), s.a, s.p); s): a function (problem, state, k) -> state to compute an acceleration that is performed before the gradient step; the default is to copy the current point to the acceleration point, that is, no acceleration is performed
evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating their result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
prox_nonsmooth: a proximal map (M, λ, p) -> q or (M, q, λ, p) -> q for the (possibly) nonsmooth part $h$ of $f$
p: a point on the manifold $\mathcal M$
stepsize=default_stepsize(M, ProximalGradientMethodState): a functor inheriting from Stepsize to determine a step size that by default uses a ProximalGradientMethodBacktracking
retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
sub_state=evaluation: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if the sub_problem is Nothing.
X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ storing the gradient at the current iterate
All other keyword arguments are passed to decorate_state!
for state decorators or decorate_objective!
for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return
for details, especially the return_state=
keyword.
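Assuming the usual Manopt.jl convention that the ! variant works in place of the start point, a short sketch reusing the setup from the example above reads:

p0 = [2.0, 2.0]
proximal_gradient_method!(M, f, g, grad_g, p0; prox_nonsmooth=prox_h)  # p0 now holds the result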
Manopt.ProximalGradientMethodAcceleration
— Type
ProximalGradientMethodAcceleration{P, T, F}
Compute an acceleration step
\[a^{(k)} = \operatorname{retr}_{p^{(k)}}\bigl( -β_k\operatorname{retr}^{-1}_{p^{(k)}}(p) \bigr)\]
where $p^{(k)}$ is the current iterate taken from the ProximalGradientMethodState's field p, and the result is stored in state.a. The field p of this struct stores the last iterate.
The retraction and its inverse are taken from the state.
Fields
p: the last iterate
β: acceleration parameter function or value
inverse_retraction_method: method for the inverse retraction
X: tangent vector for computations
Constructor
ProximalGradientMethodAcceleration(M::AbstractManifold; kwargs...)
Generate the acceleration functor for a given manifold M with initial iterate p.
Input
M::AbstractManifold: a Riemannian manifold $\mathcal M$
Keyword arguments
β = k -> (k-1)/(k+2): acceleration parameter function or value
inverse_retraction_method: method for the inverse retraction
p: initial point
X: initial tangent vector
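As an illustration (a sketch assuming this functor can be passed directly via the acceleration= keyword of the solver, reusing the ℓ1 example from above):

acc = ProximalGradientMethodAcceleration(M; β = k -> (k - 1) / (k + 2))
q = proximal_gradient_method(M, f, g, grad_g, [2.0, 2.0];
    prox_nonsmooth=prox_h, acceleration=acc)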
State
Manopt.ProximalGradientMethodState
— Type
ProximalGradientMethodState <: AbstractManoptSolverState
State for the proximal_gradient_method
solver.
Fields
inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
a: point after the acceleration step
p::P: a point on the manifold $\mathcal M$ storing the current iterate
q: point for storing the gradient step
retraction_method::AbstractRetractionMethod: a retraction $\operatorname{retr}$ to use, see the section on retractions
X: tangent vector for storing the gradient
stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
acceleration: a function (problem, state, k) -> state to compute an acceleration before the gradient step
stepsize: a function or Stepsize object to compute the stepsize
last_stepsize: stores the last computed stepsize
sub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
sub_state::Union{AbstractManoptSolverState, F}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if the sub_problem is Nothing.
Constructor
ProximalGradientMethodState(M::AbstractManifold; kwargs...)
Generate the state for a given manifold M with initial iterate p.
Input
M::AbstractManifold: a Riemannian manifold $\mathcal M$
Keyword arguments
stepsize=default_stepsize(M, ProximalGradientMethodState)
inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
p=rand(M): a point on the manifold $\mathcal M$ to specify the initial value
retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
acceleration=(p, s, k) -> (copyto!(get_manifold(M), s.a, s.p); s): by default no acceleration is performed
stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place
sub_state=AllocatingEvaluation(): a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.
X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ to specify the representation of a tangent vector
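A short sketch of constructing the state directly with a few of the documented keywords (reusing M from the example above; get_iterate is the usual accessor for solver states):

st = ProximalGradientMethodState(M;
    p=[2.0, 2.0], stopping_criterion=StopAfterIteration(50))
get_iterate(st)  # returns the current iterate stored in the state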
Helping functions
Manopt.ProximalGradientNonsmoothSubgradient
— Type
ProximalGradientNonsmoothSubgradient{F, R, P}
Stores a subgradient of the nonsmooth part h
of the proximal gradient objective f = g + h
, as well as the stepsize parameter $λ ∈ ℝ$.
This struct is also a functor (M, p) -> X to compute the gradient in an allocating fashion. It is primarily used for computing a subgradient of the cost function $h(q) + \frac{1}{2λ} d^2(q, p)$ that defines the proximal map in the proximal gradient method. This reads
\[ \partial h(q) - \frac{1}{λ} \operatorname{log}_q p,\]
where $p$ is the proximity point where the proximal map is evaluated, that is, the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$.
Fields
X::F: the subgradient of the nonsmooth part of the total objective, that is, the part of the objective whose proximal map is sought
λ::R: the stepsize parameter for the proximal map
proximity_point::P: the point where the proximal map is evaluated, that is, the argument of the proximal map that is solved for
Constructor
ProximalGradientNonsmoothSubgradient(cost, λ, proximity_point)
Manopt.ProximalGradientNonsmoothCost
— Type
ProximalGradientNonsmoothCost{F, R, P}
Stores the nonsmooth part h
of the proximal gradient objective f = g + h
, as well as the stepsize parameter $λ ∈ ℝ$.
This struct is also a functor (M, q) -> v
that can be used as a cost function within a solver, primarily for solving the proximal map subproblem formulation in the proximal gradient method, which reads
\[ \operatorname{prox}_{λ} h(p) = \operatorname{argmin}_{q \in \mathcal M} \left( h(q) + \frac{1}{2λ} d^2(q, p) \right)\]
Hence, the functor reads
\[ (M, q) \mapsto h(q) + \frac{1}{2λ} d^2(q, p)\]
and $p$ is the proximity point where the proximal map is evaluated, that is, the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$.
Fields
cost::F: the nonsmooth part h of the proximal gradient objective, that is, the part of the objective whose proximal map is sought
λ::R: the stepsize parameter for the proximal map
proximity_point::P: the point where the proximal map is evaluated, that is, the argument p of the proximal map $\operatorname{prox}_{λ} h$ that is solved for
Constructor
ProximalGradientNonsmoothCost(cost, λ, proximity_point)
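For illustration, a small sketch (reusing M and the ℓ1 part h from the example above, assuming the stored functions follow the usual (M, p) calling convention; the subgradient function ∂h is a hypothetical choice for the ℓ1 norm):

λ = 0.1
pprox = [0.5, -0.5]                                  # proximity point of the prox subproblem
subcost = ProximalGradientNonsmoothCost(h, λ, pprox)
v = subcost(M, [0.0, 0.0])                           # h(q) + 1/(2λ) d²(q, pprox)
∂h(M, p) = sign.(p)                                  # one valid subgradient of the ℓ1 norm
subgrad = ProximalGradientNonsmoothSubgradient(∂h, λ, pprox)
X = subgrad(M, [0.5, 0.5])                           # ∂h(q) - (1/λ) log_q(pprox)

These two functors can then serve as cost and (sub)gradient of the proximal-map subproblem, for example within a subgradient-based sub solver.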
Stopping criteria
Manopt.StopWhenGradientMappingNormLess
— Type
StopWhenGradientMappingNormLess <: StoppingCriterion
A stopping criterion based on the gradient mapping norm for proximal gradient methods.
Fields
at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case.
last_change::Real: the last change recorded in this stopping criterion
threshold: the threshold for the change to check (stop once the value runs below it)
Constructor
StopWhenGradientMappingNormLess(ε)
Create a stopping criterion with threshold ε
for the gradient mapping for the proximal_gradient_method
. That is, this criterion indicates to stop when the gradient mapping has a norm less than ε
The gradient mapping $G_λ(p)$ is defined as $-\frac{1}{λ}\operatorname{log}_p\bigl(T_λ(p)\bigr)$, where $T_λ(p) = \operatorname{prox}_{λ h}\bigl(\operatorname{exp}_p(-λ \operatorname{grad} g(p))\bigr)$ is the proximal map applied after the gradient step.
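For instance (a sketch assuming stopping criteria combine with | as elsewhere in Manopt.jl, reusing the setup from the first example):

sc = StopAfterIteration(200) | StopWhenGradientMappingNormLess(1e-7)
q = proximal_gradient_method(M, f, g, grad_g, [2.0, 2.0];
    prox_nonsmooth=prox_h, stopping_criterion=sc)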
Stepsize
Manopt.ProximalGradientMethodBacktracking
— Function
ProximalGradientMethodBacktracking(; kwargs...)
ProximalGradientMethodBacktracking(M::AbstractManifold; kwargs...)
Compute a stepsize for the proximal gradient method using a backtracking line search.
For the nonconvex case, the condition is:
\[f(p) - f(T_{λ}(p)) ≥ γλ\lVert G_{λ}(p) \rVert^2,\]
where $G_{λ}(p) = \frac{1}{λ}\operatorname{log}_p(T_{λ}(p))$ is the gradient mapping.
For the convex case, the condition is:
\[g(T_{λ}(p)) ≤ g(p) + ⟨\operatorname{grad} g(p), \log_p T_{λ}(p)⟩ + \frac{1}{2λ} \mathrm{d}^2(p, T_{λ}(p))\]
Returns a stepsize λ
that satisfies the specified condition.
This function generates a ManifoldDefaultsFactory
for ProximalGradientMethodBacktrackingStepsize
. For default values that depend on the manifold, this factory postpones the construction until the manifold is available, for example from a corresponding AbstractManoptSolverState.
Manopt.ProximalGradientMethodBacktrackingStepsize
— Type
ProximalGradientMethodBacktrackingStepsize <: Stepsize
A functor for backtracking line search in proximal gradient methods.
Fields
initial_stepsize::T: initial step size guess
sufficient_decrease::T: sufficient decrease parameter (default: 0.5)
contraction_factor::T: step size reduction factor (default: 0.5)
strategy::Symbol: :nonconvex or :convex (default: :nonconvex)
candidate_point::P: a working point used during backtracking
last_stepsize::T: the last computed stepsize
Constructor
ProximalGradientMethodBacktrackingStepsize(M::AbstractManifold; kwargs...)
Keyword arguments
initial_stepsize=1.0: initial stepsize to try
stop_when_stepsize_less=1e-8: smallest stepsize at which to stop (the last one before is taken)
sufficient_decrease=0.5: sufficient decrease parameter
contraction_factor=0.5: step size reduction factor
strategy=:nonconvex: backtracking strategy, either :convex or :nonconvex
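A sketch of passing a convex-strategy backtracking to the solver (assuming the factory forwards these keywords, and reusing the setup from the first example):

bt = ProximalGradientMethodBacktracking(;
    initial_stepsize=1.0, contraction_factor=0.5,
    sufficient_decrease=0.5, strategy=:convex)
q = proximal_gradient_method(M, f, g, grad_g, [2.0, 2.0];
    prox_nonsmooth=prox_h, stepsize=bt)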
Internal functions
Manopt.get_cost_smooth
— Function
get_cost_smooth(M::AbstractManifold, objective, p)
Helper function to evaluate the smooth part g of a proximal gradient objective at the point p.
Manopt.default_stepsize
— Method
default_stepsize(M::AbstractManifold, ::Type{<:ProximalGradientMethodState})
Returns the default proximal stepsize, which is a nonconvex backtracking strategy.
Literature
- [BJJP25]
- R. Bergmann, H. Jasa, P. John and M. Pfeffer. The Intrinsic Riemannian Proximal Gradient Method for Nonconvex Optimization, preprint (2025), arXiv:2506.09775.