Next: 6.3 Linear regression with Up: 6. Linear Regression Analysis Previous: 6.1 Simple linear regression

6.2 Effects of measurement errors

In the preceding section, we assumed that the correlation between variables was the result of a physical relationship, and ignored the possible effects of measurement uncertainties. However, measurement errors will tend to obscure the true correlation, especially if there are correlations among the measurement errors. If the measurement uncertainty is large compared to the true range of variation in a variable, it may be difficult to determine the true correlation coefficient.

In most cases the measurement errors are not correlated with fluctuations in the values being measured. In this case, the observed covariance matrix is just the sum of the true covariance matrix and the covariance matrix describing the measurement errors:

\begin{displaymath}{\bf H}^{-1}_{observed} = {\bf H}_{natural}^{-1} + {\bf H}_{measurement}^{-1} ~ . \end{displaymath} (6.30)
 

To show this, let x* and y* be observed values and let x and y be the true values, so that the respective measurement errors in x and y are

u = x*-x  (6.31)
  
v = y*-y (6.32)
 

Then the observed covariance has the expectation value

\begin{displaymath}V_{x^*y^*} = \langle(x^*-\overline{x^*})(y^*-\overline{y^*})\rangle \end{displaymath} (6.33)
  
\begin{displaymath}~~~~~ = \langle(x+u-\overline{x}-\overline{u})(y+v-\overline{y}-\overline{v})\rangle \end{displaymath} (6.34)
 

 

\begin{displaymath}~~~~~ = \langle(x-\overline{x})(y-\overline{y})\rangle +\langle(u-\overline{u})(v-\overline{v})\rangle = V_{xy} + V_{uv}\end{displaymath} (6.35)
 

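because the cross terms in (6.34), such as $\langle(x-\overline{x})(v-\overline{v})\rangle$ and $\langle(u-\overline{u})(y-\overline{y})\rangle$, have expectation values of zero if the errors are uncorrelated with the true values. The other elements of the covariance matrix are similarly related to the individual contributions.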
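The additivity of the covariances can be checked numerically. The following is a minimal sketch, not part of the original notes: the covariance values, the sample size, and the assumption of Gaussian-distributed values and errors are all illustrative choices. It draws true values and independent measurement errors, then compares the covariance matrix of the observed values with the sum of the two contributions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # illustrative sample size

# Assumed covariance matrix of the true values (x, y).
cov_true = np.array([[4.0, 1.5],
                     [1.5, 1.0]])
xy = rng.multivariate_normal([0.0, 0.0], cov_true, size=n)

# Assumed covariance matrix of the measurement errors (u, v),
# drawn independently of the true values.
cov_err = np.array([[1.0, 0.3],
                    [0.3, 0.5]])
uv = rng.multivariate_normal([0.0, 0.0], cov_err, size=n)

# Observed values: x* = x + u, y* = y + v.
observed = xy + uv

cov_observed = np.cov(observed, rowvar=False)
cov_sum = np.cov(xy, rowvar=False) + np.cov(uv, rowvar=False)

# The difference consists of the sample cross-covariances between values
# and errors, which tend to zero as n grows.
max_diff = np.abs(cov_observed - cov_sum).max()
print(cov_observed)
print(cov_sum)
print(max_diff)
```

For a finite sample the two matrices agree only approximately, because the sample cross terms are small but not exactly zero.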

Because x* and y* are the measured quantities, the estimator of the correlation coefficient that is obtained from them is

\begin{displaymath}r_{x^*y^*} = {{V_{x^*y^*}} \over {\sqrt{V_{x^*x^*}V_{y^*y^*}}}} = {{V_{xy}+V_{uv}} \over {\sqrt{(V_{xx}+V_{uu})(V_{yy}+V_{vv})}}}\end{displaymath} (6.36)
  
\begin{displaymath}~~~~~~ \ne \biggl({{V_{xy}}\over{\sqrt{V_{xx}V_{yy}}}} = \rho_{xy}\biggr) .\end{displaymath} (6.37)
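To see the size of this effect, the expression for the observed correlation coefficient can be evaluated directly. The sketch below is illustrative only: the function names and the numerical values are assumptions chosen to give round numbers. It shows that uncorrelated errors (V_uv = 0) reduce the observed correlation, while strongly correlated errors can raise it above the true value.

```python
import math


def true_correlation(v_xx, v_yy, v_xy):
    """Correlation coefficient of the true values, V_xy / sqrt(V_xx V_yy)."""
    return v_xy / math.sqrt(v_xx * v_yy)


def observed_correlation(v_xx, v_yy, v_xy, v_uu, v_vv, v_uv=0.0):
    """Expected observed correlation when the errors are uncorrelated
    with the true values, so that the covariances simply add."""
    return (v_xy + v_uv) / math.sqrt((v_xx + v_uu) * (v_yy + v_vv))


# Illustrative (assumed) variances and covariance of the true values.
v_xx, v_yy, v_xy = 4.0, 1.0, 1.6
rho = true_correlation(v_xx, v_yy, v_xy)                      # about 0.80

# Errors with one quarter of the true variance in each variable.
r_uncorrelated = observed_correlation(v_xx, v_yy, v_xy, 1.0, 0.25)       # about 0.64
r_correlated = observed_correlation(v_xx, v_yy, v_xy, 1.0, 0.25, 0.5)    # about 0.84

print(rho, r_uncorrelated, r_correlated)
```

In the second case the errors in x and y are perfectly correlated with each other, which is why the observed coefficient exceeds the true one.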
 

Similarly, the estimated regression slope is

\begin{displaymath}b_{y^*} = {{V_{x^*y^*}}\over{V_{x^*x^*}}} = {{V_{xy}+V_{uv}}\over{V_{xx}+V_{uu}}} \end{displaymath} (6.38)
  
\begin{displaymath}~~~~~~ \ne \biggl({{V_{xy}}\over{V_{xx}}} = b_y\biggr) . \end{displaymath} (6.39)
 

Thus measurement errors can introduce biases in the estimated slope and correlation coefficient from a regression analysis.
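The slope bias can be illustrated with simulated data. This is a sketch under stated assumptions rather than a prescribed procedure: the true slope, the noise levels, and the Gaussian distributions are invented for the example, the errors u and v are taken to be mutually uncorrelated (V_uv = 0), and the error variance V_uu is assumed to be known. Under those assumptions the relation above for the observed slope can be inverted by subtracting V_uu from the observed variance of x.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000        # illustrative sample size
true_slope = 2.0   # assumed true regression slope b_y

# True values: V_xx = 1, with some natural scatter in y about the line.
x = rng.normal(0.0, 1.0, n)
y = true_slope * x + rng.normal(0.0, 0.5, n)

# Independent measurement errors (assumed): V_uu = 0.25, V_vv = 0.09, V_uv = 0.
sigma_u = 0.5
sigma_v = 0.3
x_obs = x + rng.normal(0.0, sigma_u, n)
y_obs = y + rng.normal(0.0, sigma_v, n)

cov_obs = np.cov(x_obs, y_obs)

# Slope estimated directly from the observed values.
slope_observed = cov_obs[0, 1] / cov_obs[0, 0]

# Value expected from V_xy / (V_xx + V_uu) = 2.0 / 1.25.
slope_expected = true_slope * 1.0 / (1.0 + sigma_u**2)

# Correction that is valid only if V_uu is known and V_uv = 0.
slope_corrected = cov_obs[0, 1] / (cov_obs[0, 0] - sigma_u**2)

print(slope_observed, slope_expected, slope_corrected)
```

Note that the error in y does not bias the slope in this setup; it only adds scatter. The bias comes from the error in x, which inflates the denominator.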
 


Exercise 6.1: A set of 25 corresponding measurements of $\{x\}$ and $\{y\}$ gives a correlation coefficient of 0.7. The estimated measurement uncertainty is 50% of the measured standard deviation for both x and y. What is the best estimate of the true correlation coefficient between x and y, and what are the one-standard-deviation error limits on this estimate?




NCAR Advanced Study Program
http://www.asp.ucar.edu