To install click the Add extension button. That's it.

The source code for the WIKI 2 extension is being checked by specialists of the Mozilla Foundation, Google, and Apple. You could also do it yourself at any point in time.

4,5
Kelly Slayton
Congratulations on this excellent venture… what a great idea!
Alexander Grigorievskiy
I use WIKI 2 every day and almost forgot how the original Wikipedia looks like.
Live Statistics
English Articles
Improved in 24 Hours
Added in 24 Hours
Languages
Recent
Show all languages
What we do. Every page goes through several hundred of perfecting techniques; in live mode. Quite the same Wikipedia. Just better.
.
Leo
Newton
Brights
Milds

Derivation of the conjugate gradient method

From Wikipedia, the free encyclopedia

In numerical linear algebra, the conjugate gradient method is an iterative method for numerically solving the linear system

where is symmetric positive-definite. The conjugate gradient method can be derived from several different perspectives, including specialization of the conjugate direction method[1] for optimization, and variation of the Arnoldi/Lanczos iteration for eigenvalue problems.

The intent of this article is to document the important steps in these derivations.

YouTube Encyclopedic

  • 1/5
    Views:
    9 244
    2 737
    55 198
    34 012
    103 341
  • Mod-01 Lec-33 Conjugate Gradient Method, Matrix Conditioning and Solutions
  • Mod-01 Lec-32 Optimization Based Methods for Solving Linear Algebraic Equations: Gradient Method
  • Linear regression (2): Gradient descent
  • Mod-06 Lec-13 Steepest Descent Method
  • Why the gradient is the direction of steepest ascent

Transcription

Derivation from the conjugate direction method

The conjugate gradient method can be seen as a special case of the conjugate direction method applied to minimization of the quadratic function

The conjugate direction method

In the conjugate direction method for minimizing

one starts with an initial guess and the corresponding residual , and computes the iterate and residual by the formulae

where are a series of mutually conjugate directions, i.e.,

for any .

The conjugate direction method is imprecise in the sense that no formulae are given for selection of the directions . Specific choices lead to various methods including the conjugate gradient method and Gaussian elimination.

Derivation from the Arnoldi/Lanczos iteration

The conjugate gradient method can also be seen as a variant of the Arnoldi/Lanczos iteration applied to solving linear systems.

The general Arnoldi method

In the Arnoldi iteration, one starts with a vector and gradually builds an orthonormal basis of the Krylov subspace

by defining where

In other words, for , is found by Gram-Schmidt orthogonalizing against followed by normalization.

Put in matrix form, the iteration is captured by the equation

where

with

When applying the Arnoldi iteration to solving linear systems, one starts with , the residual corresponding to an initial guess . After each step of iteration, one computes and the new iterate .

The direct Lanczos method

For the rest of discussion, we assume that is symmetric positive-definite. With symmetry of , the upper Hessenberg matrix becomes symmetric and thus tridiagonal. It then can be more clearly denoted by

This enables a short three-term recurrence for in the iteration, and the Arnoldi iteration is reduced to the Lanczos iteration.

Since is symmetric positive-definite, so is . Hence, can be LU factorized without partial pivoting into

with convenient recurrences for and :

Rewrite as

with

It is now important to observe that

In fact, there are short recurrences for and as well:

With this formulation, we arrive at a simple recurrence for :

The relations above straightforwardly lead to the direct Lanczos method, which turns out to be slightly more complex.

The conjugate gradient method from imposing orthogonality and conjugacy

If we allow to scale and compensate for the scaling in the constant factor, we potentially can have simpler recurrences of the form:

As premises for the simplification, we now derive the orthogonality of and conjugacy of , i.e., for ,

The residuals are mutually orthogonal because is essentially a multiple of since for , , for ,

To see the conjugacy of , it suffices to show that is diagonal:

is symmetric and lower triangular simultaneously and thus must be diagonal.

Now we can derive the constant factors and with respect to the scaled by solely imposing the orthogonality of and conjugacy of .

Due to the orthogonality of , it is necessary that . As a result,

Similarly, due to the conjugacy of , it is necessary that . As a result,

This completes the derivation.

References

  1. ^ Conjugate Direction Methods http://user.it.uu.se/~matsh/opt/f8/node5.html
  1. Hestenes, M. R.; Stiefel, E. (December 1952). "Methods of conjugate gradients for solving linear systems" (PDF). Journal of Research of the National Bureau of Standards. 49 (6): 409. doi:10.6028/jres.049.044.
  2. Saad, Y. (2003). "Chapter 6: Krylov Subspace Methods, Part I". Iterative methods for sparse linear systems (2nd ed.). SIAM. ISBN 978-0-89871-534-7.
This page was last edited on 19 March 2022, at 07:20
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.