next up previous contents
Next: 12.3 Recently Developed Analysis Up: 12.2.2 Empirical linear interpolation Previous: 12.2.2.2 Barnes analysis

12.2.2.3 OPTIMAL INTERPOLATION

The last of the older techniques for four-dimensional data assimilation is a statistical approach called optimum interpolation. Lev Gandin (1963) developed and popularized this least squares technique in meteorology, though Eliassen (1954) was first to suggest it. Optimal interpolation, or variations of it, is still used at major numerical prediction centers around the world.

To follow the computations, several definitions are required. Let:

A prime denotes an error quantity.

With the above definitions, the analysis equation can be written as

\begin{displaymath}a = b + \sum_i^n A_i(o_i - b_i) \end{displaymath} (12.16)
 

where Ai is an $l \times m$ matrix of weights. The background, b (defined on a grid) is interpolated to the position of the observation. This interpolation is not explicitly accounted for in the mathematics. If one subtracts the true value from both sides of the above equation, the error in the analysis is then defined as

\begin{displaymath}a' = b' + \sum_i^n A^i(o'^i - b'^i) \end{displaymath} (12.17)
 

In optimal interpolation, the weights are chosen so as to minimize the mean square error of the analysis or the diagonal elements of

E = <(a' - <a'>)(a' - <a'>)T (12.18)
 

with

\begin{displaymath}<a'> = <b'> + \sum_i^n A^i(<o'^i> - <b'^i>) \end{displaymath} (12.19)
 

where T means transposed, and the angle brackets denote an average.

By substituting for the definitions of a' and <a'>, E can be rewritten as

\begin{displaymath}E = F + \sum_j^n G^jA^{jT} + \sum_i^n A^iG^{iT} + \sum_i^n\sum_j^n A^iC^{ij}A^{jT} \end{displaymath} (12.20)
 

where

F = <(b'-<b'>)(b'-<b'>)T (12.21)
 

is a gridpoint covariance matrix,

Gj = <(b'-<b'>)[(o'j-<o'j>)-(b'j-<b'j>)]T (12.22)
 

is a gridpoint-to-observation covariance matrix,

GiT = <[(o'i-<o'i>)-(b'i-<b'i>)](b'-<b'>)T (12.23)
 

is the observation-to-gridpoint covariance matrix, and

Cij = <[(o'i-<o'i>)-(b'i-<b'i>)][(o'j-<o'j>) -(b'j-<b'j>)]T (12.24)
 

is the observation-to-observation covariance matrix. E can be minimized by taking partial derivatives with respect to elements of Ai and setting the results equal to zero. The solution is

\begin{displaymath}\pmatrix{A^1&A^2&\ldots&A^n} =-\pmatrix{G^1&G^2&\ldots&G^n}......\vdots&\ddots&\vdots\crC^{n1}&C^{n2}&\ldots&C^{nn}\cr}}^{-1} \end{displaymath} (12.25)
 

or, more compactly,

A = -GC-1  (12.26)
 

Given the weights A, E can be written as

E = F - GC-1GT  (12.27)
 

or

E = F + AGT  (12.28)
 

The covariance matrices C and G can be examined further with the help of the definitions given above.

 Cij = <(o'i-<o'i>)(o'j-<o'j>)T>    
            - <(o'i-<o'i>)(b'j-<b'j>)T>    
            - <(b'i-<b'i>)(o'j-<o'j>)T>   (12.29)
            + <(b'i-<b'i>)(b'j-<b'j>)T>    
 

The first term on the right-hand side is the covariance of the observed error, Oij. The next two terms, which are usually neglected, are the covariances between the observations and background errors. Finally, the last term is the covariance of background errors, Bij. Thus Cij can be approximated as

\begin{displaymath}C^{ij} \approx O^{ij} + B^{ij} \end{displaymath} (12.30)
 

Note that Oij is a diagonal matrix whenever different variables are measured by independent means or the same variables are measured independently at different locations. This is not always true, however, as in the case of meteorological satellites, where a single radiometer measures the upwelling radiation at many different locations.

Each Gj matrix (see again its definition, above) is the sum of two terms, a cross term (usually neglected) and a covariance of the background error between gridpoints and observation points j. Thus, Gj can be approximated as

\begin{displaymath}G^j \approx -B^{gj} \end{displaymath} (12.31)
 

where g reminds us that one point of the pair is a gridpoint.

Given the above results, we can rewrite the analysis equation as

a = b + Ar  (12.32)
 

where A is the $l\times mn$ matrix of weights and r is the column vector of residuals (i.e., o1-b1, o2-b2,...,on-bn). Substituting with the new definition of A, we get

a = b - GC-1r  (12.33)
 

or

a = b + Bg(B+O)-1r  (12.34)
 

and E can be rewritten as

E = F - ABgT  (12.35)
 

Without burdening ourselves with the math involved, we can illustrate the facility of optimal interpolation by considering several simple examples. In each example, we assume that 1) only one variable is analyzed, 2) observations of this variable are made independently at different locations, and 3) the background error (error in the model forecast) is spatially uniform.

First, consider a single observation made at the location of the grid point. Optimal interpolation says that the best analyzed value is a linear combination of observed and predicted (background) values, with the weights proportional to the inverse of the respective error variances. That is, the smaller the error variance for each value the greater the weight for that value.

The second example is taken from Daley (1991, p. 133), who discusses optimal interpolation in considerable detail. It involves two observations and a gridpoint lying along a straight line, as in Fig. 12.3:


Fig. 12.3:  Illustrating the geometry of a gridpoint and two observations, one fixed and one moving, all along a straight line.

Nondimensional distance along the line is measured by x/L, where L is a length scale on the order of hundreds of kilometers. The position of the first observation is fixed at x/L=-2; the gridpoint is at x/L=0. The second observation is free to move. The spatial correlation of background error is modeled by

\begin{displaymath}\rho(x)=\Bigl(1+{\vert x\vert\over L}\Bigr)exp\Bigl(1+{\vert x\vert\overL}\Bigr) \end{displaymath} (12.36)
 

The ratio of observation error variance to background error variance is set to 0.25, meaning that the accuracy of the observations is substantially greater than that of the forecast which provides the first guess. Given these conditions, Fig. 12.4 shows what happens as the second observation moves in the interval $-4\leq x/L \leq 4$. The curve marked W1 gives the weight applied to the fixed observation; the curve marked W2 gives the weight applied to the moving observation. The third curve ( $\varepsilon^2_A$) gives the normalized variance of analysis error.


Fig. 12.4 The analysis weights W1 and W2 for observation 1 (fixed at x/L=-2.0) and observation 2 (moving) as a function of the position of observation 2. The gridpoint lies at x/L=0.0. $\varepsilon^2_A$is the normalized error variance of the analysis.


Note that the observation closer to the gridpoint always has the larger weight. When the two observations coincide (at x/L=-2), their weights are equal. When they are equidistant from the gridpoint but on opposite sides (at $x/L=\pm 2$), their weights are again equal, but greater than when they coincide. The reason is that coincident observations give redundant information whereas observations on opposite sides of the gridpoint give independent information and should be more heavily weighted. When both observations are on the same side of the gridpoint, the one more distant from the gridpoint can take a negative weight. This results in extrapolation of the gradient implied by the two observations in the direction of the gridpoint.

The analysis error variance reaches a minimum when the moving observation is coincident with the gridpoint. Perhaps less intuitive is that the error variance curve is not symmetric about x/L=0. Close inspection reveals that ${\varepsilon^2_A(-{x\over L})} >{\varepsilon^2_A(+{x\over L})}$. Thus, two observations at fixed distances from the gridpoint are more valuable when they are in opposite directions from the gridpoint than when they are in the same direction.

The third example is a case of three perfect observations on a circle as shown in Fig. 12.5. Observations 2 and 3 are equidistant from the equator. By plugging reasonable correlation values into the equation for the weights, one finds that the weight for z(1) is larger than for z(2) or z(3). This occurs because z(1) carries more independent information than either z(2) or z(3), which are closer together. As the angle $\alpha$ increases toward $60^\circ$, the three weights approach equality.


 
Fig. 12.5:  Three observations on a circle with a gridpoint at the center. Observations 2 and 3 are both at angular distance $\alpha$from the equator. L is the radius.


The advantages of optimal interpolation include several shared by other techniques. For example: The following advantages are specific to optimal interpolation: The disadvantages of this method are:
next up previous contents
Next: 12.3 Recently Developed Analysis Up: 12.2.2 Empirical linear interpolation Previous: 12.2.2.2 Barnes analysis 

The NCAR Advanced Study Program
http://www.asp.ucar.edu