Divided differences
In mathematics, divided differences is an algorithm, historically used for computing tables of logarithms and trigonometric functions.Template:Citation needed Charles Babbage's difference engine, an early mechanical calculator, was designed to use this algorithm in its operation.<ref name="Isaacson2014">Template:Cite book</ref>
Divided differences is a recursive division process. Given a sequence of data points <math>(x_0, y_0), \ldots, (x_{n}, y_{n})</math>, the method calculates the coefficients of the interpolation polynomial of these points in the Newton form.
It is sometimes denoted by a delta with a bar: <math>\text{△} \!\!\!|\,\,</math> or <math>\text{◿} \! \text{◺}</math>.
Definition
Given n + 1 data points <math display="block">(x_0, y_0),\ldots,(x_{n}, y_{n})</math> where the <math>x_k</math> are assumed to be pairwise distinct, the forward divided differences are defined as: <math display="block">\begin{align} \mathopen[y_k] &:= y_k, && k \in \{ 0,\ldots,n\} \\ \mathopen[y_k,\ldots,y_j] &:= \frac{[y_{k+1},\ldots , y_j] - [y_{k},\ldots , y_{j-1}]}{x_j-x_k}, && k\in\{0,\ldots,n-1\},\ j\in\{k+1,\ldots,n\}. \end{align}</math>
To make the recursive process of computation clearer, the divided differences can be put in tabular form, where the columns correspond to the value of j above, and each entry in the table is computed from the difference of the entries to its immediate lower left and to its immediate upper left, divided by a difference of corresponding x-values: <math display="block"> \begin{matrix} x_0 & y_0 = [y_0] & & & \\
& & [y_0,y_1] & & \\
x_1 & y_1 = [y_1] & & [y_0,y_1,y_2] & \\
& & [y_1,y_2] & & [y_0,y_1,y_2,y_3]\\
x_2 & y_2 = [y_2] & & [y_1,y_2,y_3] & \\
& & [y_2,y_3] & & \\
x_3 & y_3 = [y_3] & & & \\ \end{matrix} </math>
Notation
Note that the divided difference <math>[y_k,\ldots,y_{k+j}]</math> depends on the values <math>x_k,\ldots,x_{k+j}</math> and <math>y_k,\ldots,y_{k+j}</math>, but the notation hides the dependency on the x-values. If the data points are given by a function f, <math display="block">(x_0, y_0), \ldots, (x_{k}, y_n) =(x_0, f(x_0)), \ldots, (x_n, f(x_n))</math> one sometimes writes the divided difference in the notation <math display="block">f[x_k,\ldots,x_{k+j}] \ \stackrel{\text{def}}= \ [f(x_k),\ldots,f(x_{k+j})] = [y_k,\ldots,y_{k+j}].</math>Other notations for the divided difference of the function ƒ on the nodes x0, ..., xn are: <math display="block">f[x_k,\ldots,x_{k+j}]=\mathopen[x_0,\ldots,x_n]f= \mathopen[x_0,\ldots,x_n;f]= D[x_0,\ldots,x_n]f.</math>
Example
Divided differences for <math>k=0</math> and the first few values of <math>j</math>: <math display="block"> \begin{align}
\mathopen[y_0] &= y_0 \\
\mathopen[y_0,y_1] &= \frac{y_1-y_0}{x_1-x_0} \\
\mathopen[y_0,y_1,y_2]
&= \frac{\mathopen[y_1,y_2]-\mathopen[y_0,y_1]}{x_2-x_0}
= \frac{\frac{y_2-y_1}{x_2-x_1}-\frac{y_1-y_0}{x_1-x_0}}{x_2-x_0}
= \frac{y_2-y_1}{(x_2-x_1)(x_2-x_0)}-\frac{y_1-y_0}{(x_1-x_0)(x_2-x_0)}
\\
\mathopen[y_0,y_1,y_2,y_3] &= \frac{\mathopen[y_1,y_2,y_3]-\mathopen[y_0,y_1,y_2]}{x_3-x_0}
\end{align} </math>
Thus, the table corresponding to these terms upto two columns has the following form: <math display="block">\begin{matrix}
x_0 & y_{0} & & \\
& & {y_{1}-y_{0}\over x_1 - x_0} & \\
x_1 & y_{1} & & {{y_{2}-y_{1}\over x_2 - x_1}-{y_{1}-y_{0}\over x_1 - x_0} \over x_2 - x_0} \\
& & {y_{2}-y_{1}\over x_2 - x_1} & \\
x_2 & y_{2} & & \vdots \\
& & \vdots & \\
\vdots & & & \vdots \\
& & \vdots & \\
x_n & y_{n} & & \\
\end{matrix} </math>
Properties
- Linearity <math display="block">\begin{align}
(f+g)[x_0,\dots,x_n] &= f[x_0,\dots,x_n] + g[x_0,\dots,x_n] \\ (\lambda\cdot f)[x_0,\dots,x_n] &= \lambda\cdot f[x_0,\dots,x_n] \end{align}</math>
- Leibniz rule <math display="block">(f\cdot g)[x_0,\dots,x_n] = f[x_0]\cdot g[x_0,\dots,x_n] + f[x_0,x_1]\cdot g[x_1,\dots,x_n] + \dots + f[x_0,\dots,x_n]\cdot g[x_n] = \sum_{r=0}^n f[x_0,\ldots,x_r]\cdot g[x_r,\ldots,x_n]</math>
- Divided differences are symmetric: If <math>\sigma : \{0, \dots, n\} \to \{0, \dots, n\}</math> is a permutation then <math display="block">f[x_0, \dots, x_n] = f[x_{\sigma(0)}, \dots, x_{\sigma(n)}]</math>
- Polynomial interpolation in the Newton form: if <math>P</math> is a polynomial function of degree <math>\leq n</math>, and <math>p[x_0, \dots , x_n]</math> is the divided difference, then <math display="block">P_{n-1}(x) = p[x_0] + p[x_0,x_1](x-x_0) + p[x_0,x_1,x_2](x-x_0)(x-x_1) + \cdots + p[x_0,\ldots,x_n] (x-x_0)(x-x_1)\cdots(x-x_{n-1})</math>
- If <math>p</math> is a polynomial function of degree <math><n</math>, then <math display="block">p[x_0, \dots, x_n] = 0.</math>
- Mean value theorem for divided differences: if <math>f</math> is n times differentiable, then <math display="block">f[x_0,\dots,x_n] = \frac{f^{(n)}(\xi)}{n!}</math> for a number <math>\xi</math> in the open interval determined by the smallest and largest of the <math>x_k</math>'s.
Matrix form
The divided difference scheme can be put into an upper triangular matrix: <math display="block">T_f(x_0,\dots,x_n)= \begin{pmatrix} f[x_0] & f[x_0,x_1] & f[x_0,x_1,x_2] & \ldots & f[x_0,\dots,x_n] \\ 0 & f[x_1] & f[x_1,x_2] & \ldots & f[x_1,\dots,x_n] \\ 0 & 0 & f[x_2] & \ldots & f[x_2,\dots,x_n] \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & f[x_n] \end{pmatrix}.</math>
Then it holds
- <math>T_{f+g}(x) = T_f(x) + T_g(x)</math>
- <math>T_{\lambda f}(x) = \lambda T_f(x)</math> if <math>\lambda</math> is a scalar
- <math>T_{f\cdot g}(x) = T_f(x) \cdot T_g(x)</math> Template:Pb This follows from the Leibniz rule. It means that multiplication of such matrices is commutative. Summarised, the matrices of divided difference schemes with respect to the same set of nodes x form a commutative ring.
- Since <math> T_f(x) </math> is a triangular matrix, its eigenvalues are obviously <math> f(x_0), \dots, f(x_n) </math>.
- Let <math>\delta_\xi</math> be a Kronecker delta-like function, that is <math display="block">\delta_\xi(t) = \begin{cases}
1 &: t=\xi , \\ 0 &: \mbox{else}. \end{cases}</math> Obviously <math>f\cdot \delta_\xi = f(\xi)\cdot \delta_\xi</math>, thus <math>\delta_\xi</math> is an eigenfunction of the pointwise function multiplication. That is <math>T_{\delta_{x_i}}(x)</math> is somehow an "eigenmatrix" of <math>T_f(x)</math>: <math> T_f(x) \cdot T_{\delta_{x_i}} (x) = f(x_i) \cdot T_{\delta_{x_i}}(x) </math>. However, all columns of <math>T_{\delta_{x_i}}(x)</math> are multiples of each other, the matrix rank of <math>T_{\delta_{x_i}}(x)</math> is 1. So you can compose the matrix of all eigenvectors of <math>T_f(x)</math> from the <math>i</math>-th column of each <math>T_{\delta_{x_i}}(x)</math>. Denote the matrix of eigenvectors with <math>U(x)</math>. Example <math display="block"> U(x_0,x_1,x_2,x_3) = \begin{pmatrix} 1 & \frac{1}{(x_1-x_0)} & \frac{1}{(x_2-x_0) (x_2-x_1)} & \frac{1}{(x_3-x_0) (x_3-x_1) (x_3-x_2)} \\ 0 & 1 & \frac{1}{(x_2-x_1)} & \frac{1}{(x_3-x_1) (x_3-x_2)} \\ 0 & 0 & 1 & \frac{1}{(x_3-x_2)} \\ 0 & 0 & 0 & 1 \end{pmatrix} </math> The diagonalization of <math>T_f(x)</math> can be written as <math display="block"> U(x) \cdot \operatorname{diag}(f(x_0),\dots,f(x_n)) = T_f(x) \cdot U(x) .</math>
Polynomials and power series
The matrix <math display="block"> J = \begin{pmatrix} x_0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & x_1 & 1 & 0 & \cdots & 0 \\ 0 & 0 & x_2 & 1 & & 0 \\ \vdots & \vdots & & \ddots & \ddots & \\ 0 & 0 & 0 & 0 & \; \ddots & 1\\ 0 & 0 & 0 & 0 & & x_n \end{pmatrix} </math> contains the divided difference scheme for the identity function with respect to the nodes <math>x_0,\dots,x_n</math>, thus <math>J^m</math> contains the divided differences for the power function with exponent <math>m</math>. Consequently, you can obtain the divided differences for a polynomial function <math>p</math> by applying <math>p</math> to the matrix <math>J</math>: If <math display="block">p(\xi) = a_0 + a_1 \cdot \xi + \dots + a_m \cdot \xi^m</math> and <math display="block">p(J) = a_0 + a_1\cdot J + \dots + a_m\cdot J^m</math> then <math display="block">T_p(x) = p(J).</math> This is known as Opitz' formula.<ref>de Boor, Carl, Divided Differences, Surv. Approx. Theory 1 (2005), 46–69, [1]</ref><ref>Opitz, G. Steigungsmatrizen, Z. Angew. Math. Mech. (1964), 44, T52–T54</ref>
Now consider increasing the degree of <math>p</math> to infinity, i.e. turn the Taylor polynomial into a Taylor series. Let <math>f</math> be a function which corresponds to a power series. You can compute the divided difference scheme for <math>f</math> by applying the corresponding matrix series to <math>J</math>: If <math display="block">f(\xi) = \sum_{k=0}^\infty a_k \xi^k</math> and <math display="block">f(J)=\sum_{k=0}^\infty a_k J^k</math> then <math display="block">T_f(x)=f(J).</math>
Alternative characterizations
Expanded form
<math display="block"> \begin{align} f[x_0] &= f(x_0) \\ f[x_0,x_1] &= \frac{f(x_0)}{(x_0-x_1)} + \frac{f(x_1)}{(x_1-x_0)} \\ f[x_0,x_1,x_2] &= \frac{f(x_0)}{(x_0-x_1)\cdot(x_0-x_2)} + \frac{f(x_1)}{(x_1-x_0)\cdot(x_1-x_2)} + \frac{f(x_2)}{(x_2-x_0)\cdot(x_2-x_1)} \\ f[x_0,x_1,x_2,x_3] &= \frac{f(x_0)}{(x_0-x_1)\cdot(x_0-x_2)\cdot(x_0-x_3)} + \frac{f(x_1)}{(x_1-x_0)\cdot(x_1-x_2)\cdot(x_1-x_3)} +\\ &\quad\quad \frac{f(x_2)}{(x_2-x_0)\cdot(x_2-x_1)\cdot(x_2-x_3)} + \frac{f(x_3)}{(x_3-x_0)\cdot(x_3-x_1)\cdot(x_3-x_2)} \\ f[x_0,\dots,x_n] &= \sum_{j=0}^{n} \frac{f(x_j)}{\prod_{k\in\{0,\dots,n\}\setminus\{j\}} (x_j-x_k)} \end{align} </math>
With the help of the polynomial function <math>\omega(\xi) = (\xi-x_0) \cdots (\xi-x_n)</math> this can be written as <math display="block"> f[x_0,\dots,x_n] = \sum_{j=0}^{n} \frac{f(x_j)}{\omega'(x_j)}. </math>
Peano form
If <math>x_0<x_1<\cdots<x_n</math> and <math>n\geq 1</math>, the divided differences can be expressed as<ref>Template:Cite book</ref> <math display="block">f[x_0,\ldots,x_n] = \frac{1}{(n-1)!} \int_{x_0}^{x_n} f^{(n)}(t)\;B_{n-1}(t) \, dt</math> where <math>f^{(n)}</math> is the <math>n</math>-th derivative of the function <math>f</math> and <math>B_{n-1}</math> is a certain B-spline of degree <math>n-1</math> for the data points <math>x_0,\dots,x_n</math>, given by the formula <math display="block">B_{n-1}(t) = \sum_{k=0}^n \frac{(\max(0,x_k-t))^{n-1}}{\omega'(x_k)}</math>
This is a consequence of the Peano kernel theorem; it is called the Peano form of the divided differences and <math>B_{n-1}</math> is the Peano kernel for the divided differences, all named after Giuseppe Peano.
Forward and backward differences
When the data points are equidistantly distributed we get the special case called forward differences. They are easier to calculate than the more general divided differences.
Given n+1 data points <math display="block">(x_0, y_0), \ldots, (x_n, y_n)</math> with <math display="block">x_{k} = x_0 + k h,\ \text{ for } \ k=0,\ldots,n \text{ and fixed } h>0</math> the forward differences are defined as <math display="block">\begin{align} \Delta^{(0)} y_k &:= y_k,\qquad k=0,\ldots,n \\ \Delta^{(j)}y_k &:= \Delta^{(j-1)}y_{k+1} - \Delta^{(j-1)}y_k,\qquad k=0,\ldots,n-j,\ j=1,\dots,n. \end{align}</math>whereas the backward differences are defined as: <math display="block">\begin{align} \nabla^{(0)} y_k &:= y_k,\qquad k=0,\ldots,n \\ \nabla^{(j)}y_k &:= \nabla^{(j-1)}y_{k} - \nabla^{(j-1)}y_{k-1},\qquad k=0,\ldots,n-j,\ j=1,\dots,n. \end{align}</math> Thus the forward difference table is written as:<math display="block"> \begin{matrix} y_0 & & & \\
& \Delta y_0 & & \\
y_1 & & \Delta^2 y_0 & \\
& \Delta y_1 & & \Delta^3 y_0\\
y_2 & & \Delta^2 y_1 & \\
& \Delta y_2 & & \\
y_3 & & & \\ \end{matrix} </math>whereas the backwards difference table is written as:<math display="block"> \begin{matrix} y_0 & & & \\
& \nabla y_1 & & \\
y_1 & & \nabla^2 y_2 & \\
& \nabla y_2 & & \nabla^3 y_3\\
y_2 & & \nabla^2 y_3 & \\
& \nabla y_3 & & \\
y_3 & & & \\ \end{matrix} </math>
The relationship between divided differences and forward differences is<ref>Template:Cite book</ref> <math display="block">[y_j, y_{j+1}, \ldots , y_{j+k}] = \frac{1}{k!h^k}\Delta^{(k)}y_j, </math>whereas for backward differences:Template:Citation needed<math display="block">[{y}_{j}, y_{j-1},\ldots,{y}_{j-k}] = \frac{1}{k!h^k}\nabla^{(k)}y_j. </math>
Explicit formula
When the data points are equispaced we can also derive an explicit formula for <math>[y_j,\ldots,y_{j+k}]</math>. For any fixed <math>k \leq n</math> and <math>j</math> such that <math>j + k \leq n</math>, <math display="block"> [y_j,\ldots,y_{j+ k}] = \frac{(-1)^{j+k}}{k!h^k} \sum_{i=j}^{j+k} (-1)^i y_i \binom{k}{i-j}. </math>
Proof. We prove this by induction on <math>k</math>:
Base case: Let <math>k = 0</math> and <math>j = 0 \ldots n</math>. Then by definition <math>[y_j] = y_j</math>, and <math display="block"> \frac{(-1)^{j}}{0!h^0} \sum_{i=j}^{j} (-1)^i y_i \binom{0}{i-j} = (-1)^{2j} y_j = y_j. </math>
Induction step: Assume the above formula holds until <math>k</math> and consider <math>j</math> such that <math>j+k+1 \leq n</math>. By the recursive definition we have
<math display="block"> [y_j,\ldots,y_{j+k+1}] = \frac{1}{h(k+1)} \left ( [y_{j+1},\ldots,y_{j+k+1}] - [y_j,\ldots,y_{j+k}] \right ). </math>
We can use our inductive hypothesis on both members of the left side, and obtain
<math display="block"> \frac{1}{h(k+1)} \left ( \frac{(-1)^{j+k+1}}{k!h^k} \sum_{i=j+1}^{j+k+1} (-1)^i y_i \binom{k}{i-j} - \frac{(-1)^{j+k}}{k!h^k} \sum_{i=j}^{j+k} (-1)^i y_i \binom{k}{i-j} \right ). </math>
This can be rearranged as
<math display="block"> \frac{(-1)^{j+k+1}}{(k+1)!h^{k+1}} \sum_{i=j}^{j+k+1} (-1)^i y_i \left ( \binom{k}{i-j} + \binom{k}{i-j-1} \right ) </math>
where <math> \binom{k}{s} = 0 </math> when <math>s < k</math>. We obtain our thesis for <math> k+1 </math> by substituting the identity <math display="block"> \binom{k}{i - j} + \binom{k}{i - j - 1} = \binom{k + 1}{i - j}. </math>
See also
- Difference quotient
- Neville's algorithm
- Polynomial interpolation
- Mean value theorem for divided differences
- Nörlund–Rice integral
- Pascal's triangle
- Lagrange polynomial
References
External links
- An implementation in Haskell.
de:Polynominterpolation#Bestimmung der Koeffizienten: Schema der dividierten Differenzen