Second order objectives
Manopt.AbstractManifoldHessianObjective — Type
AbstractManifoldHessianObjective{E<:AbstractEvaluationType,F, G, H} <: AbstractManifoldFirstOrderObjective{E,Tuple{F,G}}
An abstract type for all objectives that provide a (full) Hessian, where E is an AbstractEvaluationType for the gradient and Hessian functions.
Manopt.ManifoldHessianObjective — Type
ManifoldHessianObjective{T<:AbstractEvaluationType,C,G,H,Pre} <: AbstractManifoldHessianObjective{T,C,G,H}
Specify a problem for Hessian-based algorithms.
Fields
- cost: a function $f:\mathcal{M}→ℝ$ to minimize
- gradient: the gradient $\operatorname{grad}f:\mathcal{M} → T\mathcal{M}$ of the cost function $f$
- hessian: the Hessian $\operatorname{Hess}f(x)[⋅]: T_{x}\mathcal{M} → T_{x}\mathcal{M}$ of the cost function $f$
- preconditioner: the symmetric, positive definite preconditioner as an approximation of the inverse of the Hessian of $f$, a map with the same input variables as the hessian, used to numerically stabilize iterations when the Hessian is ill-conditioned
Depending on the AbstractEvaluationType T, the gradient and the Hessian can have two forms
- as functions (M, p) -> X and (M, p, X) -> Y, resp., for an AllocatingEvaluation
- as functions (M, X, p) -> X and (M, Y, p, X) -> Y, resp., for an InplaceEvaluation
Constructor
ManifoldHessianObjective(f, grad_f, Hess_f, preconditioner = (M, p, X) -> X;
evaluation=AllocatingEvaluation())
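The two function forms described above can be illustrated with a minimal plain-Julia sketch. It assumes the Rayleigh quotient $f(p) = p^{\mathrm{T}}Ap$ on the unit sphere as an example cost; the matrix A, the placeholder M = nothing and the function names are choices made for this illustration and are not part of Manopt. The sketch does not load Manopt; it only shows the argument order of the allocating and in-place variants.

```julia
using LinearAlgebra

# Illustrative data: a symmetric matrix defining f(p) = p'Ap
A = Symmetric([2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0])

# Allocating forms: (M, p) -> X and (M, p, X) -> Y.
# The manifold argument M is unused here and only mirrors the signatures.
f(M, p) = dot(p, A * p)
grad_f(M, p) = 2 .* (A * p .- dot(p, A * p) .* p)
function Hess_f(M, p, X)
    return 2 .* (A * X .- dot(p, A * X) .* p .- dot(p, A * p) .* X)
end

# In-place forms: (M, X, p) -> X and (M, Y, p, X) -> Y.
# The result is written into the second argument.
function grad_f!(M, X, p)
    X .= 2 .* (A * p .- dot(p, A * p) .* p)
    return X
end
function Hess_f!(M, Y, p, X)
    Y .= 2 .* (A * X .- dot(p, A * X) .* p .- dot(p, A * p) .* X)
    return Y
end

M = nothing                      # placeholder for a manifold object
p = normalize([1.0, 1.0, 1.0])   # a point on the unit sphere
X = [1.0, -1.0, 0.0]             # a tangent vector at p, since dot(p, X) == 0
Y = Hess_f!(M, similar(X), p, X) # same values as Hess_f(M, p, X)
```

With Manopt loaded and M an actual manifold, functions of the first kind would be passed as ManifoldHessianObjective(f, grad_f, Hess_f), and functions of the second kind together with evaluation=InplaceEvaluation().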
Access functions
Manopt.get_hessian — Function
Y = get_hessian(amp::AbstractManoptProblem{T}, p, X)
get_hessian!(amp::AbstractManoptProblem{T}, Y, p, X)
Evaluate the Hessian of an AbstractManoptProblem amp at p applied to a tangent vector X, computing $\operatorname{Hess}f(p)[X]$, which can also happen in-place of Y.
get_hessian(M::AbstractManifold, vgf::VectorHessianFunction, p, X, i)
get_hessian(M::AbstractManifold, vgf::VectorHessianFunction, p, X, i, range)
get_hessian!(M::AbstractManifold, Y, vgf::VectorHessianFunction, p, X, i)
get_hessian!(M::AbstractManifold, Y, vgf::VectorHessianFunction, p, X, i, range)
Evaluate the Hessians of the vector function vgf on the manifold M at p in direction X for the components selected by i. The range specifies the representation of the returned Hessians.
Since i is assumed to be a linear index, you can provide
- a single integer
- a UnitRange to specify a range to be returned like 1:3
- a BitVector specifying a selection
- an AbstractVector{<:Integer} to specify indices
- : to return the vector of all Hessian evaluations
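The index forms listed above behave like ordinary Julia linear indexing. The following sketch uses a plain vector of labels as a stand-in for the component Hessians; the vector H and its entries are hypothetical and only serve to show which entries each index form selects.

```julia
# Stand-in for the component Hessians of a vector function with four components
H = ["Hess f_1", "Hess f_2", "Hess f_3", "Hess f_4"]

single   = H[2]                                      # a single integer
ranged   = H[1:3]                                    # a UnitRange
masked   = H[BitVector([true, false, true, false])]  # a BitVector selection
indexed  = H[[4, 1]]                                 # an AbstractVector{<:Integer}
all_of_H = H[:]                                      # the colon selects all entries
```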
get_hessian(TpM::TangentSpace, slso::SymmetricLinearSystemObjective, X, V)
get_hessian!(TpM::TangentSpace, W, slso::SymmetricLinearSystemObjective, X, V)
Evaluate the Hessian of
\[f(X) = \frac{1}{2} \lVert \mathcal{A}[X] + b \rVert_{p}^2,\qquad X ∈ T_{p}\mathcal{M},\]
which is $\operatorname{Hess} f(X)[V] = \mathcal{A}[V]$. This can be computed in-place of W. Internally this (just) calls the get_linear_operator function.
get_hessian(TpM, trmo::TrustRegionModelObjective, X)
Evaluate the Hessian of the TrustRegionModelObjective
\[\operatorname{Hess} m(X)[Y] = \operatorname{Hess} f(p)[Y].\]
get_hessian(M::AbstractManifold, lmsco::LevenbergMarquardtLinearSurrogateObjective, p, X, Y)
get_hessian!(M::AbstractManifold, Z, lmsco::LevenbergMarquardtLinearSurrogateObjective, p, X, Y)
Compute the Hessian of the LevenbergMarquardtLinearSurrogateObjective, which is given by
\[\begin{aligned} \operatorname{Hess} μ_p(X)[Y] &= \sum_{i=1}^{m} \mathcal{L}_i^*\bigl(\mathcal{L}_i(Y)\bigr) + λY\\ &= \sum_{i=1}^{m} J_{F_i}^*(p)\Bigl[ ρ_i' \bigl(I- b F_i(p)F_i(p)^{\mathrm{T}}\bigr)^2 J_{F_i}(p)[Y] \Bigr] + λY \end{aligned} \]
where $ρ_i' = ρ_i'(\lVert F_i(p) \rVert_2^2)$ and $ρ_i'' = ρ_i''(\lVert F_i(p) \rVert_2^2)$ are the values of the first and second derivative of the AbstractRobustifierFunction $ρ_i$, respectively, and $b$ is the value returned by get_LevenbergMarquardt_scaling, which scales the operator. See also get_jacobian and get_adjoint_jacobian.
This can be computed in-place of Z.
get_hessian(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, X)
get_hessian!(M::AbstractManifold, Y, emo::EmbeddedManifoldObjective, p, X)
Evaluate the Hessian of an objective defined in the embedding, that is, embed p and X before calling the Hessian function stored in the EmbeddedManifoldObjective.
The returned Hessian is then converted to a Riemannian Hessian calling riemannian_Hessian.
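As an illustration of such a conversion, consider the unit sphere embedded in $ℝ^n$. This example and its notation are assumptions made here and are not taken from the docstring: $\tilde f$ denotes the cost in the embedding, $\nabla \tilde f(p)$ and $\nabla^2 \tilde f(p)$ its Euclidean gradient and Hessian, and $\operatorname{proj}_p(Z) = Z - \langle p, Z\rangle p$ the orthogonal projection onto $T_{p}\mathcal{M}$. In this case the Riemannian Hessian is commonly written as

\[\operatorname{Hess} f(p)[X] = \operatorname{proj}_p\bigl( \nabla^2 \tilde f(p)[X] \bigr) - \langle p, \nabla \tilde f(p) \rangle X, \qquad X ∈ T_{p}\mathcal{M}.\]

The first term projects the Euclidean Hessian onto the tangent space, and the second term accounts for the curvature of the sphere.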
get_hessian(M::AbstractManifold, scaled_objective::ScaledManifoldObjective, p, X)
get_hessian!(M::AbstractManifold, Y, scaled_objective::ScaledManifoldObjective, p, X)
Evaluate the scaled Hessian $s\operatorname{Hess}f(p)[X]$.
Manopt.get_hessian! — Function
get_hessian(M::AbstractManifold, lmsco::LevenbergMarquardtLinearSurrogateObjective, p, X, Y)
get_hessian!(M::AbstractManifold, Z, lmsco::LevenbergMarquardtLinearSurrogateObjective, p, X, Y)
Compute the Hessian of the LevenbergMarquardtLinearSurrogateObjective, which is given by
\[\begin{aligned} \operatorname{Hess} μ_p(X)[Y] &= \sum_{i=1}^{m} \mathcal{L}_i^*\bigl(\mathcal{L}_i(Y)\bigr) + λY\\ &= \sum_{i=1}^{m} J_{F_i}^*(p)\Bigl[ ρ_i' \bigl(I- b F_i(p)F_i(p)^{\mathrm{T}}\bigr)^2 J_{F_i}(p)[Y] \Bigr] + λY \end{aligned} \]
where $ρ_i' = ρ_i'(\lVert F_i(p) \rVert_2^2)$ and $ρ_i'' = ρ_i''(\lVert F_i(p) \rVert_2^2)$ are the values of the first and second derivative of the AbstractRobustifierFunction $ρ_i$, respectively, and $b$ is the value returned by get_LevenbergMarquardt_scaling, which scales the operator. See also get_jacobian and get_adjoint_jacobian.
This can be computed in-place of Z.
Manopt.get_preconditioner — Function
get_preconditioner(amp::AbstractManoptProblem, p, X)
Evaluate the symmetric, positive definite preconditioner (an approximation of the inverse of the Hessian of the cost function f) of the objective of an AbstractManoptProblem amp at the point p, applied to a tangent vector X.
get_preconditioner(M::AbstractManifold, mho::ManifoldHessianObjective, p, X)
Evaluate the symmetric, positive definite preconditioner (an approximation of the inverse of the Hessian of the cost function f) of a ManifoldHessianObjective mho at the point p, applied to a tangent vector X.
and internally
Manopt.get_hessian_function — Function
get_hessian_function(amgo::ManifoldHessianObjective{E<:AbstractEvaluationType})
Return the function to evaluate (just) the Hessian $\operatorname{Hess} f(p)$. Depending on the AbstractEvaluationType E this is a function
- (M, p, X) -> Y for the AllocatingEvaluation case
- (M, Y, p, X) -> Y for the InplaceEvaluation, working in-place of Y.
Approximation of the Hessian
Several different methods to approximate the Hessian are available.
Manopt.ApproxHessianFiniteDifference — Type
ApproxHessianFiniteDifference{E, P, T, G, RTR, VTR, R <: Real} <: AbstractApproxHessian
A functor to approximate the Hessian by a finite difference of gradient evaluations.
Given a point p, a direction X, and the gradient $\operatorname{grad} f(p)$ of a function $f$, the Hessian is approximated as follows: let $c$ be a stepsize, $X ∈ T_{p}\mathcal{M}$ a tangent vector, and $q = \operatorname{retr}_p(\frac{c}{\lVert X \rVert_p}X)$ a step in direction $X$ of length $c$ following a retraction. Then the Hessian is approximated by the finite difference of the gradients, where $\mathcal T_{⋅←⋅}$ is a vector transport that moves $\operatorname{grad}f(q)$ from $T_{q}\mathcal{M}$ to $T_{p}\mathcal{M}$.
\[\operatorname{Hess}f(p)[X] ≈ \frac{\lVert X \rVert_p}{c}\Bigl( \mathcal T_{p←q}\bigl( \operatorname{grad}f(q)\bigr) - \operatorname{grad}f(p) \Bigr)\]
Fields
- gradient!!: the gradient function (either allocating or mutating, see evaluation parameter)
- step_length: a step length for the finite difference
- retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Internal temporary fields
- grad_tmp: a temporary storage for the gradient at the current p
- grad_dir_tmp: a temporary storage for the gradient at the current p_dir
- p_dir::P: a temporary storage for the point reached in the forward direction (the $q$ in the formula)
Constructor
ApproxHessianFiniteDifference(M, p, grad_f; kwargs...)
Keyword arguments
- evaluation::AbstractEvaluationType=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating their result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- steplength=2^{-14}: step length $c$ used in the finite difference of the gradient evaluations
- retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
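The finite difference approximation described in this entry can be sketched in plain Julia on the unit sphere. This is a sketch under stated assumptions and not Manopt's implementation: the cost $f(p) = p^{\mathrm{T}}Ap$, the projection retraction, the projection-based vector transport, and the names retract, transport_to and approx_hessian are all choices made for this illustration.

```julia
using LinearAlgebra

A = Symmetric([2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0])

# Riemannian gradient of f(p) = p'Ap on the unit sphere
grad_f(p) = 2 .* (A * p .- dot(p, A * p) .* p)

# Assumed choices for this sketch: projection retraction and
# transport by orthogonal projection onto the tangent space at p
retract(p, X) = normalize(p .+ X)
transport_to(p, Y) = Y .- dot(p, Y) .* p

# Finite difference approximation of Hess f(p)[X] with step length c
function approx_hessian(p, X; c=2.0^(-14))
    nX = norm(X)
    nX == 0 && return zero(X)
    q = retract(p, (c / nX) .* X)   # step of length c in direction X
    return (nX / c) .* (transport_to(p, grad_f(q)) .- grad_f(p))
end

p = normalize([1.0, 1.0, 1.0])
X = [1.0, -1.0, 0.0]                # tangent at p, since dot(p, X) == 0

# Exact Riemannian Hessian of f on the sphere, for comparison
exact = 2 .* (A * X .- dot(p, A * X) .* p .- dot(p, A * p) .* X)
err = norm(approx_hessian(p, X) .- exact)
```

Since this is a forward difference, the error err is expected to shrink roughly linearly with the step length c until rounding errors dominate.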
Manopt.ApproxHessianSymmetricRankOne — Type
ApproxHessianSymmetricRankOne{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian
A functor to approximate the Hessian by the symmetric rank one update.
Fields
- gradient!!: the gradient function (either allocating or mutating, see evaluation parameter).
- ν: a small real number to ensure that the denominator in the update does not become too small and thus the method does not break down.
- vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports.
Internal temporary fields
- p_tmp: a temporary storage for the current point p.
- grad_tmp: a temporary storage for the gradient at the current p.
- matrix: a temporary storage for the matrix representation of the approximating operator.
- basis: a temporary storage for an orthonormal basis at the current p.
Constructor
ApproxHessianSymmetricRankOne(M, p, gradF; kwargs...)
Keyword arguments
- initial_operator=Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M)): the matrix representation of the initial approximating operator.
- basis=DefaultOrthonormalBasis: an orthonormal basis in the tangent space of the initial iterate p.
- nu=-1
- evaluation::AbstractEvaluationType=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating their result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
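For orientation, the textbook symmetric rank one update in Euclidean space can be sketched as follows. This is not Manopt's implementation: the function sr1_update, the threshold rule, and its default value are assumptions for this illustration. On a manifold, the vectors would be coordinates with respect to the stored basis, and quantities from the previous iterate would first need a vector transport.

```julia
using LinearAlgebra

# One symmetric rank one update of the approximation B, given a step s and
# the corresponding gradient difference y. The update is skipped when the
# denominator is small relative to the vectors involved (threshold ν).
function sr1_update(B, s, y; ν=1e-8)
    r = y - B * s
    denom = dot(r, s)
    abs(denom) <= ν * norm(r) * norm(s) && return B
    return B + (r * r') / denom
end

A = [4.0 1.0; 1.0 3.0]          # Hessian of an illustrative quadratic
B0 = Matrix{Float64}(I, 2, 2)   # initial operator
s = [1.0, 2.0]                  # a step
y = A * s                       # gradient difference along s for the quadratic
B1 = sr1_update(B0, s, y)       # satisfies the secant equation B1 * s ≈ y
```

The skip rule plays the role described for ν above: it avoids dividing by a denominator that is too small.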
Manopt.ApproxHessianBFGS — Type
ApproxHessianBFGS{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian
A functor to approximate the Hessian by the BFGS update.
Fields
- gradient!!: the gradient function (either allocating or mutating, see evaluation parameter).
- scale
- vector_transport_method::AbstractVectorTransportMethod: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Internal temporary fields
- p_tmp: a temporary storage for the current point p.
- grad_tmp: a temporary storage for the gradient at the current p.
- matrix: a temporary storage for the matrix representation of the approximating operator.
- basis: a temporary storage for an orthonormal basis at the current p.
Constructor
ApproxHessianBFGS(M, p, gradF; kwargs...)
Keyword arguments
- initial_operator=Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M)): the matrix representation of the initial approximating operator.
- basis=DefaultOrthonormalBasis: an orthonormal basis in the tangent space of the initial iterate p.
- nu=-1
- evaluation::AbstractEvaluationType=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating their result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
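For orientation, the textbook BFGS update of a Hessian approximation in Euclidean space can be sketched as follows. This is not Manopt's implementation: the function bfgs_update and the rule to keep B when the curvature condition fails are assumptions for this illustration. On a manifold, the vectors would be coordinates with respect to the stored basis, and quantities from the previous iterate would first need a vector transport.

```julia
using LinearAlgebra

# One BFGS update of the Hessian approximation B, given a step s and the
# corresponding gradient difference y.
function bfgs_update(B, s, y)
    sy = dot(s, y)
    sy <= 0 && return B   # curvature condition violated: keep B (a choice of this sketch)
    Bs = B * s
    return B - (Bs * Bs') / dot(s, Bs) + (y * y') / sy
end

A = [4.0 1.0; 1.0 3.0]          # Hessian of an illustrative quadratic
B0 = Matrix{Float64}(I, 2, 2)   # initial operator
s = [1.0, 2.0]                  # a step
y = A * s                       # gradient difference along s for the quadratic
B1 = bfgs_update(B0, s, y)      # satisfies the secant equation B1 * s ≈ y
```

If B is symmetric positive definite and dot(s, y) > 0, the updated matrix remains symmetric positive definite, which is why the sketch skips the update otherwise.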