Gradient Sampling Algorithm

Manopt.gradient_sampling — Function

gradient_sampling(M, f, grad_f, p=rand(M); kwargs...)
gradient_sampling(M, gradient_objective, p=rand(M); kwargs...)
gradient_sampling!(M, f, grad_f, p; kwargs...)
gradient_sampling!(M, gradient_objective, p; kwargs...)

perform the gradient sampling algorithm as introduced in [HU17].

The algorithm samples a set of sampling_size =$m$ many points in a ball around the current iterate, evaluates the gradient at these points and transports these to the current iterate. It then builds a surrogate in the tangent space consisting of these $m$ tangent vectors and the gradient at the current iterate to determine a new descent direction in the convex hull of these. See gradient_sampling_subsolver for the actual sub problem that is solved. If this direction exceeds, the subgradient_norm_tolerance the step is rejected and both the ball radius and this tolerance are reduced.

Input

M::AbstractManifold: a Riemannian manifold $\mathcal{M}$
f: a cost function $f: \mathcal{M}→ ℝ$ implemented as (M, p) -> v
grad_f: the (Riemannian) gradient $\operatorname{grad}f: \mathcal{M} → T_{p}\mathcal{M}$ of f as a function (M, p) -> X or a function (M, X, p) -> X computing X in-place
p::P: a point on the manifold $\mathcal{M}$

Alternatively to f and grad_f you can provide the corresponding AbstractManifoldFirstOrderObjective gradient_objective directly.

Keyword arguments

callbacks::D = Dict{Symbol,Function}(): provided callback functions$A single function `(symbol, problem, state, k)` called in every hook, a (vector of) pairs `:hook => function` will be processed by [$ ,YOu can either pass an array of Pairsassumed to be stored in a dictionary like structure with symbols as keys
differential = nothing: specify a specific function to evaluate the differential. By default, $Df(p)[X] = ⟨\operatorname{grad}f(p),X⟩$. is used
evaluation::AbstractEvaluationType=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second. For example grad_f(M,p) allocates, but grad_f!(M, X, p) computes the result in-place of X.
retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
sample_size =manifold_dimension(M)+1 set the number of sampling points. If you initialise sampled_points, sampled_vectors, and convex_hull_coeffs directly this parameter has no effect.
sampling_radius = 0.5
sampling_radius_reduction = 0.5
sampling_radius_threshold = 1.0e-2 a threshold $ϵ_{\mathrm{opt}}$ to be used in the stopping criterion
stepsize::Stepsize=default_stepsize(M,GradientSamplingState; retraction_method=retraction_method): a functor inheriting from Stepsize to determine a step size
stopping_criterion::StoppingCriterion=StopAfterIteration(200)|(StopWhenGradientNormLess(subgradient_norm_threshold)&StopWhenSmallerOrEqual(:sampling_radius, sampling_radius_threshold)): a functor indicating that the stopping criterion is fulfilled
subgradient_norm_reduction = 0.5
subgradient_norm_tolerance = 0.1
subgradient_norm_threshold = 1.0e-3 a threshold $δ_{\mathrm{opt}}$ to be used in the stopping criterion
vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
X::T =zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal{M}$ storing the gradient at the current iterate

source

Manopt.gradient_sampling! — Function

gradient_sampling(M, f, grad_f, p=rand(M); kwargs...)
gradient_sampling(M, gradient_objective, p=rand(M); kwargs...)
gradient_sampling!(M, f, grad_f, p; kwargs...)
gradient_sampling!(M, gradient_objective, p; kwargs...)

perform the gradient sampling algorithm as introduced in [HU17].

Input

M::AbstractManifold: a Riemannian manifold $\mathcal{M}$
f: a cost function $f: \mathcal{M}→ ℝ$ implemented as (M, p) -> v
grad_f: the (Riemannian) gradient $\operatorname{grad}f: \mathcal{M} → T_{p}\mathcal{M}$ of f as a function (M, p) -> X or a function (M, X, p) -> X computing X in-place
p::P: a point on the manifold $\mathcal{M}$

Alternatively to f and grad_f you can provide the corresponding AbstractManifoldFirstOrderObjective gradient_objective directly.

Keyword arguments

callbacks::D = Dict{Symbol,Function}(): provided callback functions$A single function `(symbol, problem, state, k)` called in every hook, a (vector of) pairs `:hook => function` will be processed by [$ ,YOu can either pass an array of Pairsassumed to be stored in a dictionary like structure with symbols as keys
differential = nothing: specify a specific function to evaluate the differential. By default, $Df(p)[X] = ⟨\operatorname{grad}f(p),X⟩$. is used
evaluation::AbstractEvaluationType=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second. For example grad_f(M,p) allocates, but grad_f!(M, X, p) computes the result in-place of X.
retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
sample_size =manifold_dimension(M)+1 set the number of sampling points. If you initialise sampled_points, sampled_vectors, and convex_hull_coeffs directly this parameter has no effect.
sampling_radius = 0.5
sampling_radius_reduction = 0.5
sampling_radius_threshold = 1.0e-2 a threshold $ϵ_{\mathrm{opt}}$ to be used in the stopping criterion
stepsize::Stepsize=default_stepsize(M,GradientSamplingState; retraction_method=retraction_method): a functor inheriting from Stepsize to determine a step size
stopping_criterion::StoppingCriterion=StopAfterIteration(200)|(StopWhenGradientNormLess(subgradient_norm_threshold)&StopWhenSmallerOrEqual(:sampling_radius, sampling_radius_threshold)): a functor indicating that the stopping criterion is fulfilled
subgradient_norm_reduction = 0.5
subgradient_norm_tolerance = 0.1
subgradient_norm_threshold = 1.0e-3 a threshold $δ_{\mathrm{opt}}$ to be used in the stopping criterion
vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
X::T =zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal{M}$ storing the gradient at the current iterate

source

State

Manopt.GradientSamplingState — Type

GradientSamplingState

A state for the gradient sampling algorithm. The mathematical symbols are adapted from [HU17]

Fields

callbacks::D: provided callback functions`` provided as a dictionary symbols as keys
convex_hull_coefficients<:AbstractVector{R} store the solution vector of the sub problem, i.e. the coefficients of the result in the convex hull
p::P: a point on the manifold $\mathcal{M}$ storing the current iterate
sampled_points<:AbstractVector{P} memory to store the vector of sampled points
sampled_vectors<:AbstractVector{T} memory to store the vector of (transported) gradients
sampling_radius radius $ϵ_k$ of the ball around the iterate to sample from
sampling_radius_reduction factor $θ_ϵ$ to reduce the sampling radius when rejecting a step
subgradient_norm_reduction factor $θ_δ$ to reduce the subgradient norm tolerance when rejecting a step
subgradient_norm_tolerance bound $δ_k$ to reject too small gradient vector solutions from the sub problem
stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
stepsize::Stepsize: a functor inheriting from Stepsize to determine a step size
sub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.
sub_state::Union{AbstractManoptProblem, F}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.
retraction_method::AbstractRetractionMethod: a retraction $\operatorname{retr}$ to use, see the section on retractions
vector_transport_method::AbstractVectorTransportMethod: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
X::T: a tangent vector at the point $p$ on the manifold $\mathcal{M}$ storing the gradient at the current iterate
Y::T: a tangent vector at the point $p$ on the manifold $\mathcal{M}$ a tangent vector to assemble the solution in.

Constructor

GradientSamplingState(
    M::AbstractManifold, sub_problem = gradient_sampling_subsolver!, sub_state = InplaceEvaluation(); kwargs...
)

Create a gradient sampling solver state

Input

M::AbstractManifold: a Riemannian manifold $\mathcal{M}$
sub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.
sub_state::Union{AbstractManoptProblem, F}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.

Keyword arguments

callbacks::D = Dict{Symbol,Function}(): provided callback functions`` provided as a dictionary symbols as keys
p::P =rand(M): a point on the manifold $\mathcal{M}$ to specify the initial value
retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
sample_size = 5 set the number of sampling points. If you initialise sampled_points, sampled_vectors, and convex_hull_coeffs directly this parameter has no effect.
sampling_radius = 0.5
sampling_radius_reduction = 0.5
sampling_radius_threshold = 1.0e-2 a threshold $ϵ_{\mathrm{opt}}$ to be used in the stopping criterion
stepsize::Stepsize=default_stepsize(M,GradientSamplingState; retraction_method=retraction_method): a functor inheriting from Stepsize to determine a step size
stopping_criterion::StoppingCriterion=StopAfterIteration(200)|(StopWhenGradientNormLess(subgradient_norm_threshold)&StopWhenSmallerOrEqual(:sampling_radius, sampling_radius_threshold)): a functor indicating that the stopping criterion is fulfilled
subgradient_norm_reduction = 0.5
subgradient_norm_tolerance = 0.1
subgradient_norm_threshold = 1.0e-3 a threshold $δ_{\mathrm{opt}}$ to be used in the stopping criterion
vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
X::T =zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal{M}$ to specify the representation of a tangent vector

source

Helpers and internal functions

Manopt.gradient_sampling_subsolver — Function

λ = gradient_sampling_subsolver(M, p, sampled_gradients)
gradient_sampling_subsolver!(M, λ, p, sampled_gradients)

solver for the subproblem of the gradient_sampling algorithm.

Let $Y_j$, $j=0,…m$ denote the sampled_gradients already provided transported to the tangent space at p

The subproblem then reads

\[\begin{align*} \operatorname*{arg\,min}_{λ ∈ ℝ^{m+1}} & \Bigr\lVert \sum_{j = 0}^{m} λ_j Y_{j}\Bigr\rVert^2 \\ \text{s. t.} \quad & \sum_{j = 0}^{m} λ_j = 1, \quad λ_j ≥ 0 \quad \text{for all } j =0,…,m. \end{align*}\]

Tip

A default subsolver based on RipQP.jl and QuadraticModels is available if these two packages are loaded.

source

Technical details

The gradient_sampling solver requires the following functions of a manifold to be available

A retract!(M, q, p, X); it is recommended to set the default_retraction_method to a favourite retraction. If this default is set, a retraction_method= does not have to be specified.
By default gradient sampling uses ArmijoLinesearch which requires max_stepsize(M) to be set and an implementation of inner(M, p, X).
By default the stopping criterion uses the norm as well, to stop when the norm of the gradient is small, but if you implemented inner(M, p, X), the norm is provided already.
By default the tangent vector storing the gradient is initialized calling zero_vector(M,p).
A vector_transport_to!M, Y, p, X, q); it is recommended to set the default_vector_transport_method to a favourite retraction. If this default is set, a vector_transport_method= does not have to be specified.

Literature

[HU17]: S. Hosseini and A. Uschmajew. A Riemannian Gradient Sampling Algorithm for Nonsmooth Optimization on Manifolds. SIAM Journal on Optimization 27, 173–189 (2017).