
7.3 The F-test

If $f(x)$ is used to approximate measurements $\{y_i\}$, and if the number of degrees of freedom is $\nu$, the sample estimate of the variance, $s^2$, is related to the chisquare by
\begin{displaymath}s^2 = {{1}\over{\nu}} \sum_i(y_i-f(x_i))^2 ={{\sigma^2}\over{\nu}}\chi^2 . \end{displaymath} (7.11)
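
As a minimal sketch of Eq. (7.11), the snippet below computes $s^2$ from a set of residuals. The data values, the straight-line model, and the rule $\nu = N - (\mbox{number of fitted parameters})$ are assumptions made for illustration; they are not taken from the text.

```python
def sample_variance(y, fitted, n_params):
    """Return (s^2, nu), where s^2 = sum of squared residuals / nu.

    Assumption: nu = len(y) - n_params (one degree of freedom lost per
    fitted parameter).
    """
    nu = len(y) - n_params
    if nu <= 0:
        raise ValueError("need more measurements than fitted parameters")
    sum_sq = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sum_sq / nu, nu


# Hypothetical measurements and a hypothetical two-parameter model f(x) = 2x + 1.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
fitted = [2.0 * xi + 1.0 for xi in x]

s_sq, nu = sample_variance(y, fitted, n_params=2)
print(f"nu = {nu}, s^2 = {s_sq:.4f}")
```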
 

Consider two samples taken from the same population, both characterized by the same standard deviation $\sigma$. Define

\begin{displaymath}F = {{s_1}^2\over{s_2}^2} = {{\chi^2_1/\nu_1}\over{\chi^2_2/\nu_2}}.\end{displaymath} (7.12)
 

The probability density function for F follows from the distributions of the two independent chisquare variables whose ratio defines it. It is:

\begin{displaymath}P(F,\nu_1,\nu_2) = {{\Gamma[(\nu_1+\nu_2)/2]}\over{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}} \left({{\nu_1}\over{\nu_2}}\right)^{\nu_1/2} {{F^{(\nu_1-2)/2}}\over{\left(1+F{{\nu_1}\over{\nu_2}}\right)^{(\nu_1+\nu_2)/2}}} \end{displaymath} (7.13)
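
The following sketch evaluates the F density in its standard textbook form, $\Gamma[(\nu_1+\nu_2)/2] / [\Gamma(\nu_1/2)\Gamma(\nu_2/2)] \, (\nu_1/\nu_2)^{\nu_1/2} F^{(\nu_1-2)/2} / (1+F\nu_1/\nu_2)^{(\nu_1+\nu_2)/2}$, and integrates it numerically to obtain cumulative probabilities. The function names and the choice of Simpson's rule are illustrative assumptions, and the code assumes $\nu_1 \ge 2$ so that the density is finite at F=0.

```python
import math


def f_pdf(F, nu1, nu2):
    """Density of the F distribution with (nu1, nu2) degrees of freedom.

    Assumes nu1 >= 2 so the density is finite at F = 0.
    """
    if F < 0.0:
        return 0.0
    norm = math.gamma((nu1 + nu2) / 2) / (math.gamma(nu1 / 2) * math.gamma(nu2 / 2))
    return (norm * (nu1 / nu2) ** (nu1 / 2) * F ** ((nu1 - 2) / 2)
            / (1 + F * nu1 / nu2) ** ((nu1 + nu2) / 2))


def f_cdf(F, nu1, nu2, steps=2000):
    """Cumulative probability on [0, F] by Simpson's rule (steps must be even)."""
    h = F / steps
    total = f_pdf(0.0, nu1, nu2) + f_pdf(F, nu1, nu2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f_pdf(k * h, nu1, nu2)
    return total * h / 3


# Cumulative probability at a tabulated critical value for nu1=6, nu2=10.
print(f"{f_cdf(3.22, 6, 10):.4f}")
```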
 

An approximation that is usually adequate is to use the following variable as a normal deviate:

\begin{displaymath}z = {{F^{{1}\over{3}} \left(1-{{2}\over{9\nu_2}}\right) - \left(1-{{2}\over{9\nu_1}}\right)}\over{\left({{2}\over{9\nu_1}}+F^{2/3}{{2}\over{9\nu_2}}\right)^{1/2}}} .\end{displaymath} (7.14)
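
A short sketch of this normal-deviate approximation, written out as $z = [F^{1/3}(1-2/(9\nu_2)) - (1-2/(9\nu_1))] / [2/(9\nu_1) + F^{2/3}\,2/(9\nu_2)]^{1/2}$, is given below. The function name is an illustrative assumption; the inputs are the variance ratio and the two degrees of freedom.

```python
import math


def f_to_normal_deviate(F, nu1, nu2):
    """Approximate standard normal deviate z for a variance ratio F."""
    numerator = (F ** (1 / 3) * (1 - 2 / (9 * nu2))
                 - (1 - 2 / (9 * nu1)))
    denominator = math.sqrt(2 / (9 * nu1) + F ** (2 / 3) * 2 / (9 * nu2))
    return numerator / denominator


# Ratio F = 3 with 6 and 10 degrees of freedom, and the inverse ratio
# with the degrees of freedom interchanged.
z = f_to_normal_deviate(3.0, 6, 10)
z_prime = f_to_normal_deviate(1.0 / 3.0, 10, 6)
print(f"z = {z:.3f}, z' = {z_prime:.3f}")
```

Interchanging the two samples (inverting F and swapping the degrees of freedom) simply changes the sign of z, which is why a single evaluation suffices for a two-sided test.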
  

The F-test can be used to determine if two samples are consistent with a common origin. It does so by comparing the sample variances. Consider an example where two sets of measurements are to be tested for consistency, one with 6 degrees of freedom and a sample estimate of variance of $s_1^2=75$ and a second with 10 degrees of freedom and a sample estimate of variance of $s_2^2=25$. To determine if the two samples are different at the 90% confidence level:

1. $F=s_1^2/s_2^2=3$ with $f_1=\nu_1=6$ and $f_2=\nu_2=10$ degrees of freedom.
2. For a 90% confidence test, use a 5% test for both the upper and lower tails of the distribution.
3. Reference tables (footnote 7.2) show 3.22 to be the upper 5% critical value of F for $f_1=6$ and $f_2=10$. F=3.00 is less than this critical value, so the ratio is not significantly large at the 5% level.
4. It is also necessary to test whether the ratio is too small. The lower 5% point for the same ratio $F=s_1^2/s_2^2$ can be found from the symmetry of the tables: the lower 5% point for $f_1=6$ and $f_2=10$ is the reciprocal of the upper 5% point for $f_1=10$ and $f_2=6$, so the lower limit is 1/4.06=0.246. The value 3.0 is well above this lower limit.
Thus the samples, while apparently quite different, do not fail a 90% confidence test of the hypothesis that they are the same. It would be a serious misinterpretation of this test to conclude from these results that they are the same; the correct conclusion is that the hypothesis that they are the same cannot be rejected with 90% confidence. Indeed, the two-sided test would fail at about the 88% level, and a one-sided test (applicable if the direction of the difference between the samples is prescribed in advance) would fail at about the 94% level, so there is a strong indication that the two samples are different even though the posed hypothesis cannot be rejected at the 90% confidence level.
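
The levels at which the test would fail can be checked without tables. The sketch below is one way to do so, under the assumption that both degrees of freedom are even: in that case the upper-tail probability of F reduces to a finite binomial sum (the regularized incomplete beta function with integer arguments). The function name is illustrative; the variances and degrees of freedom are those of the example above.

```python
from math import comb


def f_upper_tail(F, nu1, nu2):
    """P(ratio > F) for the F distribution, valid only for even nu1 and nu2.

    Uses I_x(a, b) = sum_{j=a}^{a+b-1} C(a+b-1, j) x^j (1-x)^(a+b-1-j)
    with a = nu2/2, b = nu1/2 and x = nu2 / (nu2 + nu1 * F).
    """
    if nu1 % 2 or nu2 % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    a, b = nu2 // 2, nu1 // 2
    n = a + b - 1
    x = nu2 / (nu2 + nu1 * F)
    return sum(comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(a, n + 1))


s1_sq, nu1 = 75.0, 6    # first sample: variance estimate, degrees of freedom
s2_sq, nu2 = 25.0, 10   # second sample
F = s1_sq / s2_sq

p_upper = f_upper_tail(F, nu1, nu2)   # probability of a ratio this large or larger
one_sided_level = 1 - p_upper         # confidence level at which a one-sided test fails
two_sided_level = 1 - 2 * p_upper     # same for the two-sided test

print(f"F = {F:.2f}, upper-tail probability = {p_upper:.4f}")
print(f"one-sided level = {one_sided_level:.3f}, two-sided level = {two_sided_level:.3f}")
```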

When the Gaussian approximation (7.14) is used, the two test values are z=1.550 for F=3 with $f_1=6$ and $f_2=10$, and $z^\prime$=-1.550 for the inverse ratio F=1/3 with $f_1=10$ and $f_2=6$. These values correspond to the 93.9% and 6.1% cumulative points in the Gaussian distribution, so the hypothesis would be rejected by a test at about the 88% confidence level although it is not rejected by the 90% test. The accuracy of this approximation is demonstrated by evaluating z for F=3.22, $f_1=6$ and $f_2=10$, which gives z=1.645, a value corresponding to the 0.9500 point in the cumulative Gaussian distribution function. The Gaussian approximation is thus very accurate in this case, and is almost always acceptable.
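
The conversion from normal deviates to cumulative points can be reproduced with the Python standard library, as in the sketch below. The z values are those quoted in the text; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

z_upper, z_lower = 1.550, -1.550      # test values quoted in the text
p_upper = std_normal.cdf(z_upper)     # cumulative point for z
p_lower = std_normal.cdf(z_lower)     # cumulative point for z'
two_sided_level = p_upper - p_lower   # confidence level at which the test fails

# Normal deviate at the 0.95 cumulative point, for comparison with the
# z obtained from the tabulated critical value of F.
z_critical = std_normal.inv_cdf(0.95)

print(f"{p_upper:.3f} {p_lower:.3f} {two_sided_level:.3f} {z_critical:.3f}")
```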




NCAR Advanced Study Program
http://www.asp.ucar.edu