Alternating gradient descent

alternating_gradient_descent(M::ProductManifold, f, grad_f, p=rand(M))
alternating_gradient_descent(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, p)

Perform an alternating gradient descent. The arguments are:


  • M: the product manifold $\mathcal M = \mathcal M_1 × \mathcal M_2 × ⋯ × \mathcal M_n$
  • f: the objective function (cost) defined on M.
  • grad_f: the gradient, given in one of two forms:
    • a single function returning an ArrayPartition, or
    • a vector of functions, each returning one component of the whole gradient
  • p: an initial value $p_0 ∈ \mathcal M$


The following keyword arguments are optional; defaults are given in parentheses:

  • evaluation: (AllocatingEvaluation) specify whether the gradients work by allocation (default), of the form grad_f(M, p), or in place (InplaceEvaluation), of the form grad_f!(M, X, p) (elementwise).
  • evaluation_order: (:Linear) whether to use a randomly permuted sequence that is then kept fixed (:FixedRandom), a sequence that is permuted anew in every cycle (:Random), or the default :Linear one.
  • inner_iterations: (5) how many gradient steps to take in a component before alternating to the next
  • stopping_criterion: (StopAfterIteration(1000)) a StoppingCriterion
  • stepsize: (ArmijoLinesearch()) a Stepsize
  • order: ([1:n]) the initial permutation, where n is the number of gradients in grad_f.
  • retraction_method: (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.
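The three values of evaluation_order can be illustrated with a small plain-Julia sketch. It is based only on the descriptions in the list above and is not taken from the Manopt.jl implementation; the helper names cycle_order and orders_over_cycles are hypothetical.

```julia
using Random

# Hypothetical helper (not part of Manopt.jl): the component order used in one cycle.
function cycle_order(order, evaluation_order, first_cycle; rng=Random.default_rng())
    if evaluation_order === :Linear
        return order                                      # keep the given order
    elseif evaluation_order === :FixedRandom
        return first_cycle ? shuffle(rng, order) : order  # permute once, then keep it
    elseif evaluation_order === :Random
        return shuffle(rng, order)                        # new permutation in every cycle
    else
        throw(ArgumentError("unknown evaluation order: $evaluation_order"))
    end
end

# Record the order used in each of the first `cycles` cycles for n components.
function orders_over_cycles(n, evaluation_order, cycles; rng=Random.default_rng())
    order = collect(1:n)
    history = Vector{Vector{Int}}()
    for c in 1:cycles
        order = cycle_order(order, evaluation_order, c == 1; rng=rng)
        push!(history, copy(order))
    end
    return history
end

linear_orders = orders_over_cycles(3, :Linear, 4)
fixed_orders = orders_over_cycles(3, :FixedRandom, 4; rng=MersenneTwister(42))
random_orders = orders_over_cycles(3, :Random, 4; rng=MersenneTwister(42))
```

In this sketch, every entry of linear_orders is [1, 2, 3], all entries of fixed_orders agree with the first (randomly chosen) one, and the entries of random_orders are permutations that may differ from cycle to cycle.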


The return value is usually the obtained (approximate) minimizer; see get_solver_return for details.


The input of each of the (component) gradients is still the whole point p; all components other than the ith are assumed to be fixed, and only the gradient with respect to the ith component is computed and returned.
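A minimal usage sketch with a vector of component gradients might look as follows. It assumes that Manopt.jl, Manifolds.jl, and RecursiveArrayTools.jl are installed and loaded; the cost, the data points, and the start point are made up for illustration, and the snippet has not been checked against a specific Manopt.jl version.

```julia
using Manopt, Manifolds, RecursiveArrayTools

M = Sphere(2)
N = M × M                                   # product manifold of two spheres
data = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # one (made-up) target point per component

# The cost is defined on the whole product point p; p[N, i] accesses the ith component.
f(N, p) = 0.5 * (distance(N[1], p[N, 1], data[1])^2 + distance(N[2], p[N, 2], data[2])^2)

# Each component gradient receives the whole point p,
# but returns only the gradient with respect to its own component.
grad_f1(N, p) = -log(N[1], p[N, 1], data[1])
grad_f2(N, p) = -log(N[2], p[N, 2], data[2])

p0 = ArrayPartition([0.0, 0.0, 1.0], [0.0, 0.0, 1.0])
q = alternating_gradient_descent(
    N, f, [grad_f1, grad_f2], p0;
    inner_iterations=5,
    stopping_criterion=StopAfterIteration(100),
)
```

Here the two component problems are independent, so plain gradient descent on each sphere would do equally well; the example only shows how the component gradients are passed.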

alternating_gradient_descent!(M::ProductManifold, f, grad_f, p)
alternating_gradient_descent!(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, p)

Perform an alternating gradient descent in place of p.


  • M: a product manifold $\mathcal M$
  • f: the objective function (cost)
  • grad_f: the gradient, either a single function returning all component gradients or a vector of component gradient functions
  • p: an initial value $p_0 ∈ \mathcal M$

You can also pass a ManifoldAlternatingGradientObjective ago containing f and grad_f instead.

For all optional parameters, see alternating_gradient_descent.



AlternatingGradientDescentState <: AbstractGradientDescentSolverState

Store the fields for an alternating gradient descent algorithm, see also alternating_gradient_descent.


  • direction: (AlternatingGradient(zero_vector(M, p))) a DirectionUpdateRule
  • evaluation_order: (:Linear) whether to use a randomly permuted sequence (:FixedRandom), a per cycle newly permuted sequence (:Random) or the default :Linear evaluation order.
  • inner_iterations: (5) how many gradient steps to take in a component before alternating to the next
  • order: the current permutation
  • retraction_method: (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.
  • stepsize: (ConstantStepsize(M)) a Stepsize
  • stopping_criterion: (StopAfterIteration(1000)) a StoppingCriterion
  • p: the current iterate
  • X: (zero_vector(M, p)) the current gradient tangent vector
  • k, i: internal counters for the outer and inner iterations, respectively.


AlternatingGradientDescentState(M, p; kwargs...)

Generate the state for the point p, where inner_iterations, order_type, order, retraction_method, stopping_criterion, and stepsize are keyword arguments.


Additionally, the state holds a DirectionUpdateRule, which chooses the current component, so it can be decorated further; the innermost rule should always be the following one, though.

AlternatingGradient <: DirectionUpdateRule

The default gradient processor, which just evaluates the (alternating) gradient on one of the components.
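The idea of evaluating the gradient on one component at a time can be sketched in plain Julia on the Euclidean product ℝ² × ℝ², where the retraction is simply p + tX. This is an illustration under simplifying assumptions (constant step size, fixed number of cycles, linear order), not the Manopt.jl implementation; all names below are hypothetical.

```julia
# Plain-Julia illustration on ℝ² × ℝ²; nothing here is part of Manopt.jl.
targets = ([1.0, 2.0], [-3.0, 0.5])
cost(p) = 0.5 * sum(sum(abs2, p[i] .- targets[i]) for i in 1:2)

# Component gradients: each takes the whole point p and returns only its own part.
component_grads = [p -> p[1] .- targets[1], p -> p[2] .- targets[2]]

function alternating_sketch(p0, grads; order=collect(1:length(grads)),
                            inner_iterations=5, cycles=20, stepsize=0.5)
    p = deepcopy(p0)                       # leave the start point untouched
    for cycle in 1:cycles                  # one cycle visits every component once
        for k in order                     # outer loop: the current component
            for i in 1:inner_iterations    # inner loop: steps within component k
                X = grads[k](p)            # gradient with respect to component k only
                p[k] .-= stepsize .* X     # Euclidean step; other components stay fixed
            end
        end
    end
    return p
end

p0 = [zeros(2), zeros(2)]
q = alternating_sketch(p0, component_grads)
```

With this quadratic cost, q[1] and q[2] approach targets[1] and targets[2], while p0 is left unchanged.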


Technical details

The alternating_gradient_descent solver requires the following functions of a manifold to be available:

  • The problem has to be phrased on a ProductManifold, to be able to alternate between parts of the input.
  • A retract!(M, q, p, X); it is recommended to set the default_retraction_method to a favourite retraction. If this default is set, a retraction_method= does not have to be specified.
  • By default, alternating gradient descent uses ArmijoLinesearch, which requires max_stepsize(M) to be set and an implementation of inner(M, p, X, Y).
  • By default, the tangent vector storing the gradient is initialized by calling zero_vector(M, p).