Frank-Wolfe method
Manopt.Frank_Wolfe_method
— Function
Frank_Wolfe_method(M, f, grad_f, p)
Frank_Wolfe_method(M, gradient_objective, p; kwargs...)
Perform the Frank-Wolfe algorithm to compute for $\mathcal C \subset \mathcal M$
\[ \operatorname*{arg\,min}_{p∈\mathcal C} f(p),\]
where the main step within the algorithm is a constrained optimisation, namely the sub problem (oracle)
\[ q_k = \operatorname*{arg\,min}_{q ∈ \mathcal C} ⟨\operatorname{grad} f(p_k), \log_{p_k}q⟩\]
for every iterate $p_k$, together with a stepsize $s_k≤1$, by default $s_k = \frac{2}{k+2}$. This algorithm is inspired by, but slightly more general than, Weber and Sra [WS22].
The next iterate is then given by $p_{k+1} = γ_{p_k,q_k}(s_k)$, where by default $γ$ is the shortest geodesic between the two points, but this can also be changed to use a retraction and its inverse.
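To make the update concrete, here is a minimal sketch of one iteration with the default geodesic update and step size; `oracle` is a hypothetical stand-in for the sub problem solver and not part of the package API.

```julia
using Manifolds

# One conceptual Frank-Wolfe iteration (a sketch, not the library internals);
# `oracle` is a hypothetical function solving the linearized sub problem over 𝒞.
function frank_wolfe_step(M, grad_f, oracle, p, k)
    X = grad_f(M, p)           # Riemannian gradient at the current iterate
    q = oracle(M, p, X)        # q_k = arg min_{q ∈ 𝒞} ⟨X, log_{p_k} q⟩
    s = 2 / (k + 2)            # default Frank-Wolfe step size s_k
    return shortest_geodesic(M, p, q, s)  # p_{k+1} = γ_{p_k,q_k}(s_k)
end
```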
Input
- M – a manifold $\mathcal M$
- f – a cost function $f: \mathcal M→ℝ$ to find a minimizer $p^*$ for
- grad_f – the gradient $\operatorname{grad}f: \mathcal M → T\mathcal M$ of f, as a function (M, p) -> X or a function (M, X, p) -> X working in place of X
- p – an initial value $p ∈ \mathcal C$; note that it really has to be a feasible point

Alternatively to f and grad_f you can provide the AbstractManifoldGradientObjective gradient_objective directly.
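For illustration, a minimal call sketch on the sphere; the cost, gradient, and start point are toy placeholders, and for simplicity the feasible set $\mathcal C$ is taken to be the whole manifold here.

```julia
using Manopt, Manifolds

M = Sphere(2)
data = [0.0, 0.0, 1.0]
f(M, p) = distance(M, p, data)^2 / 2   # toy cost
grad_f(M, p) = -log(M, p, data)        # its Riemannian gradient
p0 = [1.0, 0.0, 0.0]                   # start point; must be feasible, p0 ∈ 𝒞

p_star = Frank_Wolfe_method(M, f, grad_f, p0)
```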
Keyword Arguments
- evaluation – (AllocatingEvaluation()) whether grad_f is an in-place or allocating (default) function
- initial_vector – (zero_vector(M, p)) how to initialize the inner gradient tangent vector
- stopping_criterion – (StopAfterIteration(500) | StopWhenGradientNormLess(1.0e-6)) a stopping criterion
- retraction_method – (default_retraction_method(M, typeof(p))) a type of retraction
- stepsize – (DecreasingStepsize(; length=2.0, shift=2)) a Stepsize to use; it always has to be less than 1. The default is the one proposed by Frank & Wolfe: $s_k = \frac{2}{k+2}$.
- sub_cost – (FrankWolfeCost(p, initial_vector)) the cost of the Frank-Wolfe sub problem, which by default uses the current iterate and (sub)gradient of the current iteration to define a default cost; this is used to define the default sub_objective. It is ignored if you set the sub_objective or the sub_problem directly.
- sub_grad – (FrankWolfeGradient(p, initial_vector)) the gradient of the Frank-Wolfe sub problem, which by default uses the current iterate and (sub)gradient of the current iteration to define a default gradient; this is used to define the default sub_objective. It is ignored if you set the sub_objective or the sub_problem directly.
- sub_objective – (ManifoldGradientObjective(sub_cost, sub_gradient)) the objective for the Frank-Wolfe sub problem; this is used to define the default sub_problem. It is ignored if you set the sub_problem manually.
- sub_problem – (DefaultManoptProblem(M, sub_objective)) the Frank-Wolfe sub problem to solve. This can be given in three forms:
  - as an AbstractManoptProblem; then the sub_state specifies the solver to use
  - as a closed form solution, e.g. a function, evaluating with new allocations, that is a function (M, p, X) -> q that solves the sub problem on M given the current iterate p and (sub)gradient X
  - as a closed form solution, e.g. a function, evaluating in place, that is a function (M, q, p, X) -> q working in place of q, with the parameters as in the previous point. For the two closed form solutions, sub_state has to be set to the corresponding AbstractEvaluationType, AllocatingEvaluation or InplaceEvaluation, respectively; see the sketch after this list.
- sub_state – (evaluation if sub_problem is a function, a decorated GradientDescentState otherwise) for a function, the evaluation is inherited from the Frank-Wolfe evaluation keyword
- sub_kwargs – ([]) keyword arguments to decorate the sub_state default state in case the sub_problem is not a function
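As a sketch of the two closed form variants: if the feasible set is the whole sphere from the example above, the linearized cost $⟨X, \log_p q⟩$ is minimal at the point farthest along $-X$, so a (hypothetical) closed form oracle can be passed in either evaluation style.

```julia
# Allocating closed form solution (M, p, X) -> q, a sketch for 𝒞 = Sphere(2);
# assumes X ≠ 0, since the gradient direction is normalized.
closed_form(M, p, X) = exp(M, p, (-π / norm(M, p, X)) * X)

p_star = Frank_Wolfe_method(M, f, grad_f, p0; sub_problem=closed_form)

# In-place closed form solution (M, q, p, X) -> q, marked via sub_state
function closed_form!(M, q, p, X)
    exp!(M, q, p, (-π / norm(M, p, X)) * X)
    return q
end
p_star = Frank_Wolfe_method(
    M, f, grad_f, p0;
    sub_problem=closed_form!, sub_state=InplaceEvaluation(),
)
```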
All other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. If you provide the ManifoldGradientObjective directly, these decorations can still be specified.
Output
the obtained (approximate) minimizer $p^*$, see get_solver_return for details.
Manopt.Frank_Wolfe_method!
— Function
Frank_Wolfe_method!(M, f, grad_f, p; kwargs...)
Frank_Wolfe_method!(M, gradient_objective, p; kwargs...)
Perform the Frank-Wolfe method in place of p.
For all options and keyword arguments, see Frank_Wolfe_method.
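A short usage sketch, reusing the toy problem from above; the start point is overwritten with the result.

```julia
p = copy(M, p0)                       # will be overwritten with the minimizer
Frank_Wolfe_method!(M, f, grad_f, p)  # p now holds the (approximate) result
```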
State
Manopt.FrankWolfeState
— Type
FrankWolfeState <: AbstractManoptSolverState
A struct to store the current state of the Frank_Wolfe_method. It comes in two forms, depending on the realisation of the sub problem.
Fields
- p – the current iterate, i.e. a point on the manifold
- X – the current gradient $\operatorname{grad} f(p)$, i.e. a tangent vector to p
- inverse_retraction_method – (default_inverse_retraction_method(M, typeof(p))) an inverse retraction method to use within Frank-Wolfe
- sub_problem – an AbstractManoptProblem problem or a function (M, p, X) -> q or (M, q, p, X) -> q for a closed form solution of the sub problem
- sub_state – an AbstractManoptSolverState for the subsolver or an AbstractEvaluationType in case the sub problem is provided as a function
- stop – (StopAfterIteration(200) | StopWhenGradientNormLess(1.0e-6)) a StoppingCriterion
- stepsize – (DecreasingStepsize(; length=2.0, shift=2)) $s_k$, which by default is set to $s_k = \frac{2}{k+2}$
- retraction_method – (default_retraction_method(M, typeof(p))) a retraction to use within Frank-Wolfe
For the subtask, we need a method to solve
\[ \operatorname*{arg\,min}_{q∈\mathcal M} ⟨X, \log_p q⟩,\qquad \text{where } X=\operatorname{grad} f(p).\]
Constructor
FrankWolfeState(M, p, X, sub_problem, sub_state)
where the remaining fields from above are keyword arguments with their defaults already given in brackets.
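For instance, with the hypothetical closed form oracle from the sketch above, a state could be set up as follows; the evaluation type takes the role of sub_state.

```julia
X0 = zero_vector(M, p0)   # inner gradient tangent vector
fws = FrankWolfeState(M, p0, X0, closed_form, AllocatingEvaluation())
```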
Helpers
For the inner sub problem you can easily create the corresponding cost and gradient using the two functors below; a combined sketch follows after their descriptions.
Manopt.FrankWolfeCost
— Type
FrankWolfeCost{P,T}
A structure to represent the oracle sub problem in the Frank_Wolfe_method. The cost function reads
\[F(q) = ⟨X, \log_p q⟩\]
The values p and X are stored within this functor and should be references to the iterate and gradient from within FrankWolfeState.
Manopt.FrankWolfeGradient
— Type
FrankWolfeGradient{P,T}
A structure to represent the gradient of the oracle sub problem in the Frank_Wolfe_method, that is, for a given point p and a tangent vector X we have
\[F(q) = ⟨X, \log_p q⟩\]
Its gradient can be computed easily using adjoint_differential_log_argument.
The values p and X are stored within this functor and should be references to the iterate and gradient from within FrankWolfeState.
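Putting both functors together, a hand-built default sub problem might look like the following sketch, reusing M and p0 from the earlier examples.

```julia
X0 = zero_vector(M, p0)
sub_cost = FrankWolfeCost(p0, X0)        # q -> ⟨X0, log_{p0} q⟩
sub_grad = FrankWolfeGradient(p0, X0)    # gradient of that cost
sub_objective = ManifoldGradientObjective(sub_cost, sub_grad)
sub_problem = DefaultManoptProblem(M, sub_objective)
```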
- [WS22] M. Weber and S. Sra. Riemannian Optimization via Frank-Wolfe Methods. Mathematical Programming 199, 525–556 (2022).