
5.6 Fitting to minimize the distance of points from a line

This section is for reference only, and should be considered optional. The fit procedure described here is not readily available in the usual textbooks and references, but is often useful, so it is presented here in detail despite the complexity required for good justification of the method.

The least-squares procedure discussed in Section 5.2 was based on the assumption that all the uncertainty could be assigned to one variable, the "dependent" variable. If the dependent and independent variables are interchanged, the resulting fit to a set of measurements will be different in most cases. This distinction becomes particularly important for the "regression" analyses to be discussed in the next chapter. However, it is often desirable to treat the two variables on an equal basis, e.g., when comparing two similar instruments without assuming that either is the standard.

The following fit procedure determines the best-fit line that minimizes, in a least-squares sense, the perpendicular distances of a set of measurements from the line. Figure 5.1 shows the distance to be minimized. The following assumes for simplicity that the uncertainties are the same for the two instruments; if that is not the case, these formulas can still be used after rescaling the variables so that the uncertainties are equal.


 
Figure 5.1: Illustration of the case where two instruments with equal measurement uncertainty make a set of corresponding measurements. The distance of the points from the best-fit line is shown.

If the line is described by $y=y_0+bx$, the distance from point $(x_i,y_i)$ to the line is

\begin{displaymath}d={{(x_i b-y_i+y_0)}\over{\sqrt{b^2+1}}} . \end{displaymath} (5.58)
 

Proof: See Fig. 5.2. Let A be the vector with components $(x_i, y_i-y_0)$. If the slope of the line is b, the unit vector B perpendicular to the line has components ($b/\sqrt{b^2+1}$, $-1/\sqrt{b^2+1}$), as can be verified by checking that the dot product of this vector with a vector (1,b) along the line is zero. The distance d of a point $(x_i,y_i)$ from the line is then A$\cdot$B, giving (5.58).


 
Figure 5.2: Vectors used to determine the distance of a point from a line.
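As a quick numerical check of (5.58), the following Python sketch evaluates the distance for one line and two points chosen purely for illustration. The function name `signed_distance` is not from the text; the formula gives a signed value, negative for points above the line.

```python
import math

def signed_distance(x, y, y0, b):
    """Signed perpendicular distance from (x, y) to the line y = y0 + b*x, per (5.58)."""
    return (x * b - y + y0) / math.sqrt(b * b + 1.0)

# Illustrative line y = 1 + x. The point (0, 3) lies above it; the nearest
# point on the line is (1, 2), so the magnitude should be sqrt(2).
d_above = signed_distance(0.0, 3.0, y0=1.0, b=1.0)

# The point (2, 3) lies on the line, so its distance should vanish.
d_on_line = signed_distance(2.0, 3.0, y0=1.0, b=1.0)

print(d_above, d_on_line)
```

Only the square of d enters the fit, so the sign convention does not affect the result.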

The appropriate chi-square function, with $\sigma$ the common measurement uncertainty, then is

\begin{displaymath}\chi^2(y_0,b) = {{1}\over{\sigma^2}} \sum_i{{(x_i b-y_i+y_0)^2}\over{b^2+1}} \end{displaymath} (5.59)
 

or, if $x_i^\prime=x_i-\overline{x}$ and $y_i^\prime=y_i-\overline{y}$,

\begin{displaymath}\chi^2 = {{1}\over{\sigma^2}} \sum_i{{(x_i^\prime b-y_i^\prime+y_0^\prime)^2}\over{b^2+1}} \end{displaymath} (5.60)
 

where $y_0^\prime=y_0+\overline{x}b-\overline{y}$. The least-squares fit then satisfies the requirement that

\begin{displaymath}{{\partial\chi^2}\over{\partial y_0^\prime}} ={{2}\over{\sigma^2}} \sum_i {{(x_i^\prime b-y_i^\prime+y_0^\prime)}\over{b^2+1}} = 0 \end{displaymath} (5.61)
 

or

\begin{displaymath}y_0^\prime - \overline{y^\prime} + b\overline{x^\prime} = 0 .\end{displaymath} (5.62)
 

However, by their definitions, $\overline{x^\prime}$ and $\overline{y^\prime}$ are zero, so the best-fit value is $y_0^\prime=0$. Then,
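In the original variables, $y_0^\prime=0$ means that for any fixed slope the best intercept is $y_0=\overline{y}-b\overline{x}$, i.e., the line passes through the centroid of the data. The sketch below checks this numerically. The data values, the trial slope, the choice $\sigma=1$, and the function name `chi2` are all assumptions made for illustration.

```python
def chi2(xs, ys, y0, b, sigma=1.0):
    """Chi-square of (5.59): squared perpendicular distances divided by sigma**2."""
    total = sum((x * b - y + y0) ** 2 for x, y in zip(xs, ys))
    return total / ((b * b + 1.0) * sigma ** 2)

# Made-up measurements from two hypothetical instruments.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.2, 2.8]

b_trial = 0.8                      # any fixed slope will do for this check
xbar = sum(xs) / len(xs)
ybar = sum(ys) / len(ys)
y0_best = ybar - b_trial * xbar    # equivalent to y0' = 0

# Intercepts slightly off the centroid value should give a larger chi-square.
chi2_best = chi2(xs, ys, y0_best, b_trial)
chi2_low = chi2(xs, ys, y0_best - 0.01, b_trial)
chi2_high = chi2(xs, ys, y0_best + 0.01, b_trial)
print(chi2_best, chi2_low, chi2_high)
```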

\begin{displaymath}{{\partial\chi^2}\over{\partial b}} = {{2}\over{\sigma^2}} \sum_i \Bigl({{(x_i^\prime b-y_i^\prime)x_i^\prime}\over{b^2+1}} -{{(x_i^\prime b-y_i^\prime)^2 b}\over{(b^2+1)^2}}\Bigr) = 0\end{displaymath} (5.63)
 

or

\begin{displaymath}(b^2+1)(\overline{{x^\prime}^2} b - \overline{x^\prime y^\prime}) - b(\overline{{x^\prime}^2} b^2 - 2\overline{x^\prime y^\prime} b+ \overline{{y^\prime}^2}) = 0 \end{displaymath} (5.64)
 
\begin{displaymath}b(\overline{{x^\prime}^2} - \overline{{y^\prime}^2}) + (b^2-1)\overline{x^\prime y^\prime} = 0 . \end{displaymath} (5.65)
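Equation (5.65) is a quadratic in b, so when $\overline{x^\prime y^\prime}\neq 0$ its two roots follow from the usual quadratic formula. The sketch below is one way to evaluate them. The helper name `slope_roots` and the second-moment values are illustrative assumptions, not part of the text.

```python
import math

def slope_roots(sxx, syy, sxy):
    """Both roots of sxy*b**2 + (sxx - syy)*b - sxy = 0, which is (5.65).

    sxx, syy, sxy stand for the means of x'^2, y'^2 and x'y'.
    Assumes sxy != 0 (otherwise the equation is not quadratic).
    """
    diff = sxx - syy
    root = math.sqrt(diff * diff + 4.0 * sxy * sxy)
    return (-diff + root) / (2.0 * sxy), (-diff - root) / (2.0 * sxy)

# Illustrative second moments (hypothetical values).
b1, b2 = slope_roots(1.25, 1.125, 1.175)
print(b1, b2, b1 * b2)
```

The product of the two roots is $-1$, so they describe two mutually perpendicular lines; only one of them minimizes $\chi^2$.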
 

In (5.65), if $\overline{x^\prime y^\prime}=0$, b = 0 (unless $\overline{{x^\prime}^2}=\overline{{y^\prime}^2}$, in which case all values of b provide equally good fits). This suggests defining new coordinates $x^{\prime\prime}$ and $y^{\prime\prime}$, rotated from $x^\prime$ and $y^\prime$ by an angle $\theta$ selected to give $\overline{x^{\prime\prime}y^{\prime\prime}}=0$ and hence $b^{\prime\prime}=0$. For a rotation by $\theta$,

\begin{displaymath}x^{\prime\prime} = x^\prime \cos\theta + y^\prime \sin\theta \end{displaymath} (5.66)
 
\begin{displaymath}y^{\prime\prime} = -x^\prime \sin\theta + y^\prime \cos\theta \end{displaymath} (5.67)
 

Then

\begin{displaymath}x^{\prime\prime}y^{\prime\prime} = -{x^\prime}^2\sin\theta\cos\theta + x^\prime y^\prime(\cos^2\theta-\sin^2\theta) + {y^\prime}^2\sin\theta\cos\theta \end{displaymath} (5.68)
 

or

\begin{displaymath}\overline{x^{\prime\prime}y^{\prime\prime}} ={{\overline{{y^\prime}^2}-\overline{{x^\prime}^2}}\over{2}}\sin(2\theta) + \overline{x^\prime y^\prime}\cos(2\theta)= 0 \end{displaymath} (5.69)
 

so that the required rotation angle $\theta$ is specified from

\begin{displaymath}\tan(2\theta) = {{2\overline{x^\prime y^\prime}}\over{\overline{{x^\prime}^2} - \overline{{y^\prime}^2}}}. \end{displaymath} (5.70)
 

Because the best-fit line has slope $b^{\prime\prime}=0$ in the rotated coordinates, its slope in the original coordinates is $b=\tan(\theta)$, so (5.70) determines the slope of the best-fit line.
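The rotation argument can be checked numerically: after rotating the centered data by the angle from (5.70), the mean product $\overline{x^{\prime\prime}y^{\prime\prime}}$ should vanish to rounding error. The sketch below uses made-up data, and using `math.atan2` to obtain $2\theta$ is a convenience assumed here (it avoids dividing by zero when the two variances are equal), not something prescribed by the text.

```python
import math

# Made-up measurements for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.2, 2.8]
n = len(xs)

# Centered coordinates x', y'.
xbar = sum(xs) / n
ybar = sum(ys) / n
xp = [x - xbar for x in xs]
yp = [y - ybar for y in ys]

# Second moments appearing in (5.70).
sxx = sum(x * x for x in xp) / n
syy = sum(y * y for y in yp) / n
sxy = sum(x * y for x, y in zip(xp, yp)) / n

# Rotation angle from (5.70), then the rotation (5.66)-(5.67).
theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
c, s = math.cos(theta), math.sin(theta)
xpp = [x * c + y * s for x, y in zip(xp, yp)]
ypp = [-x * s + y * c for x, y in zip(xp, yp)]

# Mean of x''y'' in the rotated frame; expected to be zero to rounding error.
cross = sum(a * b for a, b in zip(xpp, ypp)) / n
print(cross, math.tan(theta))
```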

In applications, an ambiguity arises because $\tan(\alpha)=\tan(\alpha+\pi)$, so the solutions of (5.70) for $\theta$ differ by multiples of $\pi/2$. As a result, there are two solutions for b, corresponding to $b=\tan(\theta)$ and $b=-1/\tan(\theta)$. These describe two perpendicular lines, one minimizing and the other maximizing the sum of squared distances. An easy way to resolve the ambiguity is to choose the value of b with the same sign as the correlation coefficient between x and y.
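The selection rule can be sketched as follows. The function name `choose_slope` and the numerical inputs are hypothetical; the sketch assumes a nonzero covariance and unequal variances so that every quantity is finite, and it relies on the fact that the correlation coefficient has the same sign as the covariance $\overline{x^\prime y^\prime}$.

```python
import math

def choose_slope(sxx, syy, sxy):
    """Pick between tan(theta) and -1/tan(theta) by the sign rule.

    sxx, syy, sxy are the means of x'^2, y'^2 and x'y'.
    Assumes sxy != 0 and sxx != syy.
    """
    # math.atan returns only the principal value, so theta may be off by pi/2.
    theta = 0.5 * math.atan(2.0 * sxy / (sxx - syy))
    candidates = (math.tan(theta), -1.0 / math.tan(theta))
    # The two candidates have opposite signs; keep the one that matches
    # the sign of the covariance (and hence of the correlation coefficient).
    return next(b for b in candidates if b * sxy > 0.0)

b_first = choose_slope(1.25, 1.125, 1.175)    # tan(theta) is kept
b_second = choose_slope(1.125, 1.25, 1.175)   # -1/tan(theta) is kept
print(b_first, b_second)
```

In the second call the principal-value angle gives a negative $\tan(\theta)$ even though the covariance is positive, so the rule selects $-1/\tan(\theta)$ instead.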

With respect to the original coordinates, the result is that the best-fit line is specified by

\begin{displaymath}b = \tan(\theta) \ \ {\rm or}\ \ b = - {{1}\over{\tan(\theta)}}\end{displaymath} (5.71)
 

where

\begin{displaymath}\tan(2\theta) = {{2(\overline{xy} -\overline{x}\thinspace\overline{y})}\over{(\overline{x^2}-\overline{x}^2)-(\overline{y^2}-\overline{y}^2)}}\end{displaymath} (5.72)
 

and

\begin{displaymath}y_0 = \overline{y} - b\overline{x} \ . \end{displaymath} (5.73)
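Putting (5.71)-(5.73) together, a minimal Python sketch of the whole procedure might look like the following. The function name `fit_line_perpendicular` and the data are assumptions for illustration. Using `math.atan2` to select the branch is also a choice made for this sketch rather than something the text prescribes: for nonzero covariance it returns a $\theta$ whose tangent has the same sign as the covariance, which agrees with the sign rule stated earlier. A vertical best-fit line cannot be represented by a finite slope and is not handled.

```python
import math

def fit_line_perpendicular(xs, ys):
    """Return (y0, b) for the line y = y0 + b*x minimizing perpendicular distances.

    Assumes equal uncertainties in x and y and a best-fit line that is not vertical.
    """
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Variances and covariance written as in (5.72).
    var_x = sum(x * x for x in xs) / n - xbar * xbar
    var_y = sum(y * y for y in ys) / n - ybar * ybar
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - xbar * ybar
    theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    b = math.tan(theta)            # slope, (5.71)
    y0 = ybar - b * xbar           # intercept, (5.73)
    return y0, b

# Made-up measurements from two hypothetical instruments.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.2, 2.8]
y0, b = fit_line_perpendicular(xs, ys)

# Interchanging the roles of x and y describes the same geometric line,
# so the slope of the swapped fit should be the reciprocal of b.
x0_swapped, b_swapped = fit_line_perpendicular(ys, xs)
print(y0, b, b * b_swapped)
```

Unlike the fit of Section 5.2, the result does not depend on which variable is called dependent: the swapped fit has slope 1/b.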
 
 


Exercise 5.2: Develop a method for applying this technique to the case where the measurement error in one variable, while constant, is different from that in the other variable. (Hint: There is a simple solution.)

SOURCES AND FURTHER READING
 

Bevington, P. R., 1969: Data Reduction and Error Analysis for the Physical Sciences. McGraw-Hill, New York, 336 pp.

Brownlee, K. A., 1965: Statistical Theory and Methodology in Science and Engineering. John Wiley and Sons, New York, 590 pp.

Press, W. H., B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, 1992: Numerical Recipes in C. Second Edition, Cambridge University Press, Cambridge, 735 pp.




 
NCAR Advanced Study Program
http://www.asp.ucar.edu