This is all literature mentioned / referenced in the Manopt.jl documenation. Usually you will find a small reference section at the end of every documentation page that contains references.

P.-A. Absil, C. Baker and K. Gallivan. Trust-Region Methods on Riemannian Manifolds. Foundations of Computational Mathematics 7, 303–330 (2006).
P.-A. Absil, R. Mahony and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press (2008). [open access](
S. Adachi, T. Okuno and A. Takeda. Riemannian Levenberg-Marquardt Method with Global and Local Convergence Properties. ArXiv Preprint (2022).
N. Agarwal, N. Boumal, B. Bullins and C. Cartis. Adaptive regularization with cubics on manifolds. Mathematical Programming (2020).
Y. T. Almeida, J. X. Cruz Neto, P. R. Oliveira and J. C. Oliveira Souza. A modified proximal point method for DC functions on Hadamard manifolds. Computational Optimization and Applications 76, 649–673 (2020).
M. Bačák. Computing medians and means in Hadamard spaces. SIAM Journal on Optimization 24, 1542–1566 (2014), arXiv: [1210.2145](
M. Bačák, R. Bergmann, G. Steidl and A. Weinmann. A second order non-smooth variational model for restoring manifold-valued images. SIAM Journal on Scientific Computing 38, A567–A597 (2016), arxiv: [1506.02409](
E. M. Beale. A derivation of conjugate gradients. In: Numerical methods for nonlinear optimization, 39–43, London, Academic Press, London (1972).
R. Bergmann, O. P. Ferreira, E. M. Santos and J. C. Souza. The difference of convex algorithm on Hadamard manifolds, arXiv preprint (2023).
R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. Frontiers in Applied Mathematics and Statistics 4 (2018), arXiv: [1807.10090](
R. Bergmann and R. Herzog. Intrinsic formulation of KKT conditions and constraint qualifications on smooth manifolds. SIAM Journal on Optimization 29, 2423–2444 (2019), arXiv: [1804.06214](
R. Bergmann, R. Herzog, M. Silva Louzeiro, D. Tenbrinck and J. Vidal-Núñez. Fenchel duality theory and a primal-dual algorithm on Riemannian manifolds. Foundations of Computational Mathematics 21, 1465–1504 (2021), arXiv: [1908.02022](
R. Bergmann, F. Laus, G. Steidl and A. Weinmann. Second order differences of cyclic data and applications in variational denoising. SIAM Journal on Imaging Sciences 7, 2916–2953 (2014), arxiv: [1405.5349](
R. Bergmann, J. Persch and G. Steidl. A parallel Douglas Rachford algorithm for minimizing ROF-like functionals on images with values in symmetric Hadamard manifolds. SIAM Journal on Imaging Sciences 9, 901–937 (2016), arxiv: [1512.02814](
P. B. Borckmans, M. Ishteva and P.-A. Absil. A Modified Particle Swarm Optimization Algorithm for the Best Low Multilinear Rank Approximation of Higher-Order Tensors. In: 7th International Conference on Swarm INtelligence, editors, 13–23. Springer Berlin Heidelberg (2010).
N. Boumal. An Introduction to Optimization on Smooth Manifolds. Cambridge University Press (2023).
M. P. do Carmo. Riemannian Geometry. Birkhäuser Boston, Inc., Boston, MA (1992).
P. de Casteljau. Outillage methodes calcul. Enveloppe Soleau 40.040, Institute National de la Propriété Industrielle, Paris. (1959).
P. de Casteljau. Courbes et surfaces à pôles. Microfiche P 4147-1, Institute National de la Propriété Industrielle, Paris. (1963).
A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision 40, 120–145 (2011).
A. R. Conn, N. I. Gould and P. L. Toint. Trust Region Methods. Society for Industrial and Applied Mathematics (2000).
Y. H. Dai and Y. Yuan. A Nonlinear Conjugate Gradient Method with a Strong Global Convergence Property. SIAM Journal on Optimization 10, 177–182 (1999).
W. Diepeveen and J. Lellmann. An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising. SIAM Journal on Imaging Sciences 14, 1565–1600 (2021), arXiv: [2102.10309](
J. Duran, M. Moeller, C. Sbert and D. Cremers. Collaborative Total Variation: A General Framework for Vectorial TV Models. SIAM Journal on Imaging Sciences 9, 116-151 (2016), arxiv: [1508.01308](
P. T. Fletcher. Geodesic regression and the theory of least squares on Riemannian manifolds. International Journal of Computer Vision 105, 171–185 (2013).
R. Fletcher. Practical Methods of Optimization. John Wiley & Sons Ltd. (1987).
R. Fletcher and C. M. Reeves. Function minimization by conjugate gradients. The Computer Journal 7, 149–154 (1964).
G. N. Grapiglia and G. F. Stella. An Adaptive Riemannian Gradient Method Without Function Evaluations. Journal of Optimization Theory and Applications 197, 1140–1160 (2023), preprint: [](
W. W. Hager and H. Zhang. A survey of nonlinear conjugate gradient methods. Pacific Journal of Optimization 2, 35–58 (2006).
W. W. Hager and H. Zhang. A New Conjugate Gradient Method with Guaranteed Descent and an Efficient Line Search. SIAM Journal on Optimization 16, 170–192 (2005).
M. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards 49, 409 (1952).
W. Huang. Optimization algorithms on Riemannian manifolds with applications. Phd thesis, Flordia State University (2014).
W. Huang, P.-A. Absil and K. A. Gallivan. A Riemannian BFGS method without differentiated retraction for nonconvex optimization problems. SIAM Journal on Optimization 28, 470–495 (2018).
W. Huang, K. A. Gallivan and P.-A. Absil. A Broyden class of quasi-Newton methods for Riemannian optimization. SIAM Journal on Optimization 25, 1660–1685 (2015).
B. Iannazzo and M. Porcelli. The Riemannian Barzilai{\textendash}Borwein method with nonmonotone line search and the matrix geometric mean computation. IMA Journal of Numerical Analysis 38, 495–517 (2017).
H. Karcher. Riemannian center of mass and mollifier smoothing. Communications on Pure and Applied Mathematics 30, 509–541 (1977).
F. Laus, M. Nikolova, J. Persch and G. Steidl. A nonlocal denoising algorithm for manifold-valued images using second order statistics. SIAM Journal on Imaging Sciences 10, 416–448 (2017).
Y. Liu and C. Storey. Efficient generalized conjugate gradient algorithms, part 1: Theory. Journal of Optimization Theory and Applications 69, 129–137 (1991).
D. Nguyen. Operator-Valued Formulas for Riemannian Gradient and Hessian and Families of Tractable Metrics in Riemannian Optimization. Journal of Optimization Theory and Applications 198, 135–164 (2023), arXiv:2009.10159.
J. Nocedal and S. J. Wright. Numerical Optimization. Springer, New York (2006).
R. Peeters. On a Riemannian version of the Levenberg-Marquardt algorithm. Technical Report 0011, VU University Amsterdam, Faculty of Economics, Business Administration and Econometrics (1993).
E. Polak and G. Ribière. Note sur la convergence de méthodes de directions conjuguées. Revue française d’informatique et de recherche opérationnelle 3, 35–43 (1969).
B. T. Polyak. The conjugate gradient method in extremal problems. USSR Computational Mathematics and Mathematical Physics 9, 94–112 (1969).
T. Popiel and L. Noakes. Bézier curves and $C^2$ interpolation in Riemannian manifolds. Journal of Approximation Theory 148, 111–127 (2007).
M. J. Powell. Restart procedures for the conjugate gradient method. Mathematical Programming 12, 241–254 (1977).
J. C. Souza and P. R. Oliveira. A proximal point algorithm for DC fuctions on Hadamard manifolds. Journal of Global Optimization 63, 797–810 (2015).
M. Weber and S. Sra. Riemannian Optimization via Frank-Wolfe Methods. Mathematical Programming 199, 525–556 (2022).
H. Zhang and S. Sra. Towards Riemannian accelerated gradient methods, arXiv Preprint, 1806.02812 (2018).