Gradient Sampling Algorithm
Manopt.gradient_sampling — Function
gradient_sampling(M, f, grad_f, p=rand(M); kwargs...)
gradient_sampling(M, gradient_objective, p=rand(M); kwargs...)
gradient_sampling!(M, f, grad_f, p; kwargs...)
gradient_sampling!(M, gradient_objective, p; kwargs...)perform the gradient sampling algorithm as introduced in [HU17].
The algorithm samples a set of sampling_size =$m$ many points in a ball around the current iterate, evaluates the gradient at these points and transports these to the current iterate. It then builds a surrogate in the tangent space consisting of these $m$ tangent vectors and the gradient at the current iterate to determine a new descent direction in the convex hull of these. See gradient_sampling_subsolver for the actual sub problem that is solved. If this direction exceeds, the subgradient_norm_tolerance the step is rejected and both the ball radius and this tolerance are reduced.
Input
M::AbstractManifold: a Riemannian manifold $\mathcal{M}$f: a cost function $f: \mathcal{M}→ ℝ$ implemented as(M, p) -> vgrad_f: the (Riemannian) gradient $\operatorname{grad}f: \mathcal{M} → T_{p}\mathcal{M}$ of f as a function(M, p) -> Xor a function(M, X, p) -> XcomputingXin-placep::P: a point on the manifold $\mathcal{M}$
Alternatively to f and grad_f you can provide the corresponding AbstractManifoldFirstOrderObjective gradient_objective directly.
Keyword arguments
differential = nothing: specify a specific function to evaluate the differential. By default, $Df(p)[X] = ⟨\operatorname{grad}f(p),X⟩$. is usedevaluation::AbstractEvaluationType=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second. For examplegrad_f(M,p)allocates, butgrad_f!(M, X, p)computes the result in-place ofX.retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractionssample_size =manifold_dimension(M)+1set the number of sampling points. If you initialisesampled_points,sampled_vectors, andconvex_hull_coeffsdirectly this parameter has no effect.sampling_radius = 0.5sampling_radius_reduction = 0.5sampling_radius_threshold = 1.0e-2a threshold $ϵ_{\mathrm{opt}}$ to be used in the stopping criterionstepsize::Stepsize=default_stepsize(M,GradientSamplingState; retraction_method=retraction_method): a functor inheriting fromStepsizeto determine a step sizestopping_criterion::StoppingCriterion=StopAfterIteration(200)|(StopWhenGradientNormLess(subgradient_norm_threshold)&StopWhenSmallerOrEqual(:sampling_radius, sampling_radius_threshold)): a functor indicating that the stopping criterion is fulfilledsubgradient_norm_reduction = 0.5subgradient_norm_tolerance = 0.1subgradient_norm_threshold = 1.0e-3a threshold $δ_{\mathrm{opt}}$ to be used in the stopping criterionvector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transportsX::T =zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal{M}$ storing the gradient at the current iterate
Manopt.gradient_sampling! — Function
gradient_sampling(M, f, grad_f, p=rand(M); kwargs...)
gradient_sampling(M, gradient_objective, p=rand(M); kwargs...)
gradient_sampling!(M, f, grad_f, p; kwargs...)
gradient_sampling!(M, gradient_objective, p; kwargs...)perform the gradient sampling algorithm as introduced in [HU17].
The algorithm samples a set of sampling_size =$m$ many points in a ball around the current iterate, evaluates the gradient at these points and transports these to the current iterate. It then builds a surrogate in the tangent space consisting of these $m$ tangent vectors and the gradient at the current iterate to determine a new descent direction in the convex hull of these. See gradient_sampling_subsolver for the actual sub problem that is solved. If this direction exceeds, the subgradient_norm_tolerance the step is rejected and both the ball radius and this tolerance are reduced.
Input
M::AbstractManifold: a Riemannian manifold $\mathcal{M}$f: a cost function $f: \mathcal{M}→ ℝ$ implemented as(M, p) -> vgrad_f: the (Riemannian) gradient $\operatorname{grad}f: \mathcal{M} → T_{p}\mathcal{M}$ of f as a function(M, p) -> Xor a function(M, X, p) -> XcomputingXin-placep::P: a point on the manifold $\mathcal{M}$
Alternatively to f and grad_f you can provide the corresponding AbstractManifoldFirstOrderObjective gradient_objective directly.
Keyword arguments
differential = nothing: specify a specific function to evaluate the differential. By default, $Df(p)[X] = ⟨\operatorname{grad}f(p),X⟩$. is usedevaluation::AbstractEvaluationType=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second. For examplegrad_f(M,p)allocates, butgrad_f!(M, X, p)computes the result in-place ofX.retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractionssample_size =manifold_dimension(M)+1set the number of sampling points. If you initialisesampled_points,sampled_vectors, andconvex_hull_coeffsdirectly this parameter has no effect.sampling_radius = 0.5sampling_radius_reduction = 0.5sampling_radius_threshold = 1.0e-2a threshold $ϵ_{\mathrm{opt}}$ to be used in the stopping criterionstepsize::Stepsize=default_stepsize(M,GradientSamplingState; retraction_method=retraction_method): a functor inheriting fromStepsizeto determine a step sizestopping_criterion::StoppingCriterion=StopAfterIteration(200)|(StopWhenGradientNormLess(subgradient_norm_threshold)&StopWhenSmallerOrEqual(:sampling_radius, sampling_radius_threshold)): a functor indicating that the stopping criterion is fulfilledsubgradient_norm_reduction = 0.5subgradient_norm_tolerance = 0.1subgradient_norm_threshold = 1.0e-3a threshold $δ_{\mathrm{opt}}$ to be used in the stopping criterionvector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transportsX::T =zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal{M}$ storing the gradient at the current iterate
State
Manopt.GradientSamplingState — Type
GradientSamplingStateA state for the gradient sampling algorithm. The mathematical symbols are adapted from [HU17]
Fields
convex_hull_coefficients<:AbstractVector{R}store the solution vector of the sub problem, i.e. the coefficients of the result in the convex hullp::P: a point on the manifold $\mathcal{M}$ storing the current iteratesampled_points<:AbstractVector{P}memory to store the vector of sampled pointssampled_vectors<:AbstractVector{T}memory to store the vector of (transported) gradientssampling_radiusradius $ϵ_k$ of the ball around the iterate to sample fromsampling_radius_reductionfactor $θ_ϵ$ to reduce the sampling radius when rejecting a stepsubgradient_norm_reductionfactor $θ_δ$ to reduce the subgradient norm tolerance when rejecting a stepsubgradient_norm_tolerancebound $δ_k$ to reject too small gradient vector solutions from the sub problemstop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilledstepsize::Stepsize: a functor inheriting fromStepsizeto determine a step sizesub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state::Union{AbstractManoptProblem, F}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.retraction_method::AbstractRetractionMethod: a retraction $\operatorname{retr}$ to use, see the section on retractionsvector_transport_method::AbstractVectorTransportMethod: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transportsX::T: a tangent vector at the point $p$ on the manifold $\mathcal{M}$ storing the gradient at the current iterateY::T: a tangent vector at the point $p$ on the manifold $\mathcal{M}$ a tangent vector to assemble the solution in.
Constructor
GradientSamplingState(
M::AbstractManifold, sub_problem = gradient_sampling_subsolver!, sub_state = InplaceEvaluation(); kwargs...
)Create a gradient sampling solver state
Input
M::AbstractManifold: a Riemannian manifold $\mathcal{M}$sub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.sub_state::Union{AbstractManoptProblem, F}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.
Keyword arguments
p::P =rand(M): a point on the manifold $\mathcal{M}$ to specify the initial valueretraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractionssample_size = 5set the number of sampling points. If you initialisesampled_points,sampled_vectors, andconvex_hull_coeffsdirectly this parameter has no effect.sampling_radius = 0.5sampling_radius_reduction = 0.5sampling_radius_threshold = 1.0e-2a threshold $ϵ_{\mathrm{opt}}$ to be used in the stopping criterionstepsize::Stepsize=default_stepsize(M,GradientSamplingState; retraction_method=retraction_method): a functor inheriting fromStepsizeto determine a step sizestopping_criterion::StoppingCriterion=StopAfterIteration(200)|(StopWhenGradientNormLess(subgradient_norm_threshold)&StopWhenSmallerOrEqual(:sampling_radius, sampling_radius_threshold)): a functor indicating that the stopping criterion is fulfilledsubgradient_norm_reduction = 0.5subgradient_norm_tolerance = 0.1subgradient_norm_threshold = 1.0e-3a threshold $δ_{\mathrm{opt}}$ to be used in the stopping criterionvector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transportsX::T =zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal{M}$ to specify the representation of a tangent vector
Helpers and internal functions
Manopt.gradient_sampling_subsolver — Function
λ = gradient_sampling_subsolver(M, p, sampled_gradients)
gradient_sampling_subsolver!(M, λ, p, sampled_gradients)solver for the subproblem of the gradient_sampling algorithm.
Let $Y_j$, $j=0,…m$ denote the sampled_gradients already provided transported to the tangent space at p
The subproblem then reads
\[\begin{align*} \operatorname*{arg\,min}_{λ ∈ ℝ^{m+1}} & \Bigr\lVert \sum_{j = 0}^{m} λ_j Y_{j}\Bigr\rVert^2 \\ \text{s. t.} \quad & \sum_{j = 0}^{m} λ_j = 1, \quad λ_j ≥ 0 \quad \text{for all } j =0,…,m. \end{align*}\]
A default subsolver based on RipQP.jl and QuadraticModels is available if these two packages are loaded.
Technical details
The gradient_sampling solver requires the following functions of a manifold to be available
- A
retract!(M, q, p, X); it is recommended to set thedefault_retraction_methodto a favourite retraction. If this default is set, aretraction_method=does not have to be specified. - By default gradient sampling uses
ArmijoLinesearchwhich requiresmax_stepsize(M)to be set and an implementation ofinner(M, p, X). - By default the stopping criterion uses the
normas well, to stop when the norm of the gradient is small, but if you implementedinner(M, p, X), the norm is provided already. - By default the tangent vector storing the gradient is initialized calling
zero_vector(M,p). - A
vector_transport_to!M, Y, p, X, q); it is recommended to set thedefault_vector_transport_methodto a favourite retraction. If this default is set, avector_transport_method=does not have to be specified.
Literature
- [HU17]
- S. Hosseini and A. Uschmajew. A Riemannian Gradient Sampling Algorithm for Nonsmooth Optimization on Manifolds. SIAM Journal on Optimization 27, 173–189 (2017).