# Optimize on your own manifold

Ronny Bergmann

When you have used a few solvers from `Manopt.jl`

for example like in the opening tutorial 🏔️ Get started: optimize! and also familiarized yourself with how to work with manifolds in general at 🚀 Get Started with `Manifolds.jl`

, you might come across the point that you want to implementing a manifold yourself and use it within `Manopt.jl`

. A challenge might be, which functions are necessary, since the overall interface of `ManifoldsBase.jl`

is maybe not completely necessary.

This tutorial aims to help you through these steps to implement necessary parts of a manifold to get started with the solver you have in mind.

## An example problem

We get started by loading the packages we need.

```
using LinearAlgebra, Manifolds, ManifoldsBase, Random
using Manopt
Random.seed!(42)
```

We also define the same manifold as in the implementing a manifold tutorial.

```
"""
ScaledSphere <: AbstractManifold{ℝ}
Define a sphere of fixed radius
# Fields
* `dimension` dimension of the sphere
* `radius` the radius of the sphere
# Constructor
ScaledSphere(dimension,radius)
Initialize the manifold to a certain `dimension` and `radius`,
which by default is set to `1.0`
"""
struct ScaledSphere <: AbstractManifold{ℝ}
dimension::Int
radius::Float64
end
```

We would like to compute a mean and/or median similar to 🏔️ Get started: optimize!. For given a set of points $q_1,\ldots,q_n$ we want to compute [Kar77]

\[ \operatorname*{arg\,min}_{p∈\mathcal M} \frac{1}{2n} \sum_{i=1}^n d_{\mathcal M}^2(p, q_i)\]

On the `ScaledSphere`

we just defined above. We define a few parameters first

```
d = 5 # dimension of the sphere - embedded in R^{d+1}
r = 2.0 # radius of the sphere
N = 100 # data set size
M = ScaledSphere(d,r)
```

`ScaledSphere(5, 2.0)`

If we generate a few points

```
# generate 100 points around the north pole
pts = [ [zeros(d)..., M.radius] .+ 0.5.*([rand(d)...,0.5] .- 0.5) for _=1:N]
# project them onto the r-sphere
pts = [ r/norm(p) .* p for p in pts]
```

Then, before starting with optimization, we need the distance on the manifold, to define the cost function, as well as the logarithmic map to defined the gradient. For both, we here use the “lazy” approach of using the Sphere as a fallback. Finally, we have to provide information about how points and tangent vectors are stored on the manifold by implementing their `representation_size`

function, which is often required when allocating memory. While we could

```
import ManifoldsBase: distance, log, representation_size
function distance(M::ScaledSphere, p, q)
return M.radius * distance(Sphere(M.dimension), p ./ M.radius, q ./ M.radius)
end
function log(M::ScaledSphere, p, q)
return M.radius * log(Sphere(M.dimension), p ./ M.radius, q ./ M.radius)
end
representation_size(M::ScaledSphere) = (M.dimension+1,)
```

## Define the cost and gradient

```
f(M, q) = sum(distance(M, q, p)^2 for p in pts)
grad_f(M,q) = sum( - log(M, q, p) for p in pts)
```

## Defining the necessary functions to run a solver

The documentation usually lists the necessary functions in a section “Technical Details” close to the end of the documentation of a solver, for our case that is The gradient descent’s Technical Details,

They list all details, but we can start even step by step here if we are a bit careful.

### A retraction

We first implement a retraction. Informally, given a current point and a direction to “walk into” we need a function that performs that walk. Since we take an easy one that just projects onto the sphere, we use the `ProjectionRetraction`

type. To be precise, we have to implement the in-place variant `retract_project!`

```
import ManifoldsBase: retract_project!
function retract_project!(M::ScaledSphere, q, p, X, t::Number)
q .= p .+ t .* X
q .*= M.radius / norm(q)
return q
end
```

`retract_project! (generic function with 18 methods)`

The other two technical remarks refer to the step size and the stopping criterion, so if we set these to something simpler, we should already be able to do a first run.

We have to specify

- that we want to use the new retraction,
- a simple step size and stopping criterion

We start with a certain point of cost

```
p0 = [zeros(d)...,1.0]
f(M,p0)
```

`444.60374551157634`

Then we can run our first solver, where we have to overwrite a few defaults, which would use functions we do not (yet) have. We will discuss these in the next steps.

```
q1 = gradient_descent(M, f, grad_f, p0;
retraction_method = ProjectionRetraction(), # state, that we use the retraction from above
stepsize = DecreasingStepsize(M; length=1.0), # A simple step size
stopping_criterion = StopAfterIteration(10), # A simple stopping crtierion
X = zeros(d+1), # how we define/represent tangent vectors
)
f(M,q1)
```

`162.4000287847332`

We at least see, that the function value decreased.

### Norm and maximal step size.

To use more advanced stopping criteria and step sizes we first need an`inner`

`(M, p, X)`

. We also need a `max_stepsize`

`(M)`

, to avoid having too large steps on positively curved manifolds like our scaled sphere in this example

```
import ManifoldsBase: inner
import Manopt: max_stepsize
inner(M::ScaledSphere, p, X,Y) = dot(X,Y) # inherited from the embedding
# set the maximal allowed stepsize to injectivity radius.
Manopt.max_stepsize(M::ScaledSphere) = M.radius*π
```

Then we can use the default step size (`ArmijoLinesearch`

) and the default stopping criterion, which checks for a small gradient Norm

```
q2 = gradient_descent(M, f, grad_f, p0;
retraction_method = ProjectionRetraction(), # as before
X = zeros(d+1), # as before
)
f(M, q2)
```

`9.772830131357024`

### Making life easier: default retraction and zero vector

To initialize tangent vector memory, the function `zero_vector`

`(M,p)`

is called. Similarly, the most-used retraction is returned by `default_retraction_method`

We can use both here, to make subsequent calls to the solver less verbose. We define

```
import ManifoldsBase: zero_vector, default_retraction_method
zero_vector(M::ScaledSphere, p) = zeros(M.dimension+1)
default_retraction_method(M::ScaledSphere) = ProjectionRetraction()
```

`default_retraction_method (generic function with 20 methods)`

and now we can even just call

```
q3 = gradient_descent(M, f, grad_f, p0)
f(M, q3)
```

`9.772830131357024`

But we for example automatically also get the possibility to obtain debug information like

`gradient_descent(M, f, grad_f, p0; debug = [:Iteration, :Cost, :Stepsize, 25, :GradientNorm, :Stop, "\n"]);`

```
Initial f(x): 444.603746
# 25 f(x): 9.772836s:0.01851892669284175|grad f(p)|:0.027709997662907597
# 50 f(x): 9.772830s:0.01851892669284175|grad f(p)|:0.00035050862606229735
# 75 f(x): 9.772830s:0.01671333134028968|grad f(p)|:3.5186467891885363e-6
The algorithm reached approximately critical point after 81 iterations; the gradient norm (8.510619620990863e-9) is less than 1.0e-8.
```

see How to Print Debug Output for more details.

## Literature

- [Kar77]
- H. Karcher.
*Riemannian center of mass and mollifier smoothing*. Communications on Pure and Applied Mathematics**30**, 509–541 (1977).