| (4.1) |
The parameters $a_j$
influence the distribution function, but are generally unknown. The task
of estimation is to determine functions of the observations
to use as estimates of the parameters.
An estimator of $a_j$ can be any function of the observations
used to estimate the true value of the parameter $a_j$.
The sample mean and sample standard deviation are often used as estimators
of the true population mean and standard deviation, for example. Desirable
characteristics of estimators are:
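$$L(a_1, \dots, a_m) = \prod_{i=1}^{N} f(x_i;\, a_1, \dots, a_m) \tag{4.2}$$
This joint probability function, the product of the distribution function $f$
evaluated at each of the $N$ observations $x_i$, is called the likelihood and
depends on the parameters $a_1, \dots, a_m$.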
If the likelihood function is plotted as a function of $a$ for the
case with a single parameter, the resulting curve will have a shape
somewhat like Fig. 4.1. The value $a^*$, for which the likelihood
reaches its maximum value, is the maximum-likelihood estimate for the parameter
$a$.
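As an illustration of locating the maximum for a single parameter, the following minimal sketch scans the likelihood over a grid of trial values. The observations, the exponential distribution function, and the grid limits are all assumptions chosen for the example; they do not come from the text.

```python
import math

# Hypothetical observations, assumed to follow the exponential
# distribution f(x; a) = (1/a) exp(-x/a) with unknown mean a.
observations = [0.8, 2.1, 0.3, 1.7, 1.1]


def likelihood(a, xs):
    """Joint probability density of the observations for parameter a."""
    value = 1.0
    for x in xs:
        value *= math.exp(-x / a) / a
    return value


# Trial values of a from 0.20 to 5.00 in steps of 0.01 (assumed range).
grid = [0.01 * k for k in range(20, 501)]

# The grid point with the largest likelihood approximates a*.
a_star = max(grid, key=lambda a: likelihood(a, observations))

sample_mean = sum(observations) / len(observations)
print(f"grid estimate of a* = {a_star:.2f}")
print(f"sample mean         = {sample_mean:.2f}")
```

For this particular distribution the maximum-likelihood estimate of the mean is the sample mean, so the two printed values should agree to within the grid spacing.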

For numerical convenience, it is usually preferable to calculate instead the function $W$ defined as
$$W = \ln L \tag{4.3}$$
Because $W$ is a monotonic function of the likelihood $L$,
the maximum in $W$ will coincide with the maximum in $L$.
However, because the calculation of $W$ involves a summation rather
than a product, there are computational advantages to the use of $W$:
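$$W = \ln \prod_{i=1}^{N} f(x_i;\, a_1, \dots, a_m) \tag{4.4}$$
$$\phantom{W} = \sum_{i=1}^{N} \ln f(x_i;\, a_1, \dots, a_m) \tag{4.5}$$
where $f$ is the distribution function and $x_i$ are the $N$ observations.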
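One computational advantage of the sum can be seen in floating-point arithmetic. In this sketch the sample (2000 simulated unit-Gaussian values) and the helper name `log_gauss_pdf` are assumptions for illustration only. The direct product of densities underflows, while the sum of logarithms stays finite.

```python
import math
import random

random.seed(0)

# Hypothetical sample: 2000 values simulated from a unit Gaussian.
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]


def log_gauss_pdf(x, mu, sigma):
    """Natural log of the Gaussian probability density at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


# Likelihood as a direct product of densities: each factor is below 0.4,
# so the running product falls below the smallest representable double.
product = 1.0
for x in xs:
    product *= math.exp(log_gauss_pdf(x, 0.0, 1.0))

# The same information as a sum of log densities.
W = sum(log_gauss_pdf(x, 0.0, 1.0) for x in xs)

print("product of densities:", product)
print("sum of log densities:", W)
```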
The maximum-likelihood estimate of the parameters $a_j$
satisfies the simultaneous equations
$$\frac{\partial W}{\partial a_j} = 0, \qquad j = 1, \dots, m \tag{4.6}$$
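For a Gaussian distribution function with two parameters (mean and standard deviation), the simultaneous equations can be solved in closed form. The sketch below, which uses hypothetical data and a hypothetical helper `partial`, checks numerically that both partial derivatives of W vanish at that solution.

```python
import math

# Hypothetical observations, assumed Gaussian with unknown mean and sigma.
xs = [4.9, 5.3, 5.1, 4.7, 5.4, 5.0]
n = len(xs)


def W(mu, sigma):
    """Log-likelihood of the observations for a Gaussian model."""
    return sum(
        -0.5 * ((x - mu) / sigma) ** 2
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
        for x in xs
    )


# Closed-form solution of dW/dmu = 0 and dW/dsigma = 0.
# Note the divisor n (not n - 1) in the maximum-likelihood sigma.
mu_star = sum(xs) / n
sigma_star = math.sqrt(sum((x - mu_star) ** 2 for x in xs) / n)


def partial(func, point, j, h=1e-6):
    """Central-difference estimate of the j-th partial derivative."""
    up = list(point)
    down = list(point)
    up[j] += h
    down[j] -= h
    return (func(*up) - func(*down)) / (2.0 * h)


dW_dmu = partial(W, (mu_star, sigma_star), 0)
dW_dsigma = partial(W, (mu_star, sigma_star), 1)

print(f"mu* = {mu_star:.4f}, sigma* = {sigma_star:.4f}")
print(f"dW/dmu = {dW_dmu:.2e}, dW/dsigma = {dW_dsigma:.2e}")
```

When no closed-form solution exists, the same derivative check can serve as a stopping criterion for an iterative maximization.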
The maximum-likelihood estimator has several desirable properties:
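When the likelihood is approximately Gaussian in $a_j$ near its maximum at $a_j^*$, with standard deviation $\sigma_j$,
$$L = C_1 \exp\left[-\frac{(a_j - a_j^*)^2}{2\sigma_j^2}\right] \tag{4.7}$$
and
$$W = C_2 - \frac{(a_j - a_j^*)^2}{2\sigma_j^2} \tag{4.8}$$
where $C_1$ and $C_2$ are constants. Differentiating
$W$ twice isolates $\sigma_j$:
$$\frac{\partial^2 W}{\partial a_j^2} = -\frac{1}{\sigma_j^2} \tag{4.9}$$
$$\sigma_j^2 = -\left(\frac{\partial^2 W}{\partial a_j^2}\right)^{-1} \tag{4.10}$$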
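The standard deviation can be estimated from the curvature of W at its maximum using a finite-difference second derivative. This is a minimal sketch for a single parameter; the data, the exponential distribution function, and the step size `h` are assumptions for illustration.

```python
import math

# Hypothetical observations, assumed exponential with unknown mean a.
observations = [0.8, 2.1, 0.3, 1.7, 1.1]


def log_likelihood(a, xs):
    """W(a) = sum of ln f(x; a) with f(x; a) = (1/a) exp(-x/a)."""
    return sum(-math.log(a) - x / a for x in xs)


# For this model the maximum-likelihood estimate is the sample mean.
a_star = sum(observations) / len(observations)

# Central-difference estimate of the second derivative of W at a*.
h = 1e-3
second_derivative = (
    log_likelihood(a_star + h, observations)
    - 2.0 * log_likelihood(a_star, observations)
    + log_likelihood(a_star - h, observations)
) / h**2

# The curvature is negative at a maximum, so -1/W'' is positive.
sigma = math.sqrt(-1.0 / second_derivative)

print(f"a* = {a_star:.3f}, estimated sigma = {sigma:.3f}")
```

For this model the analytic result is a*/sqrt(N), which gives a convenient check on the numerical estimate.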
It is often simplest to use (4.8)
directly, rather than evaluate the second derivative in (4.9),
particularly when there is a single parameter to be determined. When $a_j$ differs
from $a_j^*$ by $\sigma_j$,
the term on the right side of (4.8)
decreases by 1/2 from its maximum value, so for uncorrelated errors an
estimate of the standard deviation in the result can be found by finding
the deviation in $a_j$ from $a_j^*$
that causes $W$ to decrease by 1/2.
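The search for the deviation that lowers W by 1/2 can be sketched with a simple bisection on each side of the maximum. The data, the exponential distribution function, the bracketing limits, and the helper `bisect` are assumptions made for this example.

```python
import math

# Hypothetical observations, assumed exponential with unknown mean a.
observations = [0.8, 2.1, 0.3, 1.7, 1.1]


def log_likelihood(a, xs):
    """W(a) = sum of ln f(x; a) with f(x; a) = (1/a) exp(-x/a)."""
    return sum(-math.log(a) - x / a for x in xs)


a_star = sum(observations) / len(observations)  # maximum of W for this model
w_max = log_likelihood(a_star, observations)


def drop(a):
    """Positive where W(a) is within 1/2 of its maximum, negative outside."""
    return log_likelihood(a, observations) - (w_max - 0.5)


def bisect(func, lo, hi, iterations=80):
    """Find a root of func between lo and hi, assuming one sign change."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed brackets: a factor of 10 below and above a*.
a_low = bisect(drop, 0.1 * a_star, a_star)
a_high = bisect(drop, a_star, 10.0 * a_star)

print(f"a* = {a_star:.3f}")
print(f"W falls by 1/2 at a = {a_low:.3f} and a = {a_high:.3f}")
```

With only a few observations the two deviations are generally unequal, because W is not exactly parabolic; the interval then gives asymmetric uncertainty limits.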
Maximizing $W$ is equivalent to minimizing the chi-square function, defined as
$$\chi^2 = -2W \tag{4.11}$$
Because $\chi^2$
increases by 1 when $W$ decreases by 1/2, the standard deviation in
the estimate $a_j^*$ can also be estimated
from the deviation that causes a unit increase in the chi-square.
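The unit-increase rule can be checked for the common case of Gaussian measurement errors, where the chi-square is a weighted sum of squared residuals. In this sketch the measurements, their standard deviations, and the search bracket are hypothetical.

```python
import math

# Hypothetical measurements of one quantity, with assumed known
# Gaussian standard deviations.
values = [10.2, 9.7, 10.5, 9.9]
sigmas = [0.3, 0.2, 0.4, 0.2]


def chisq(a):
    """Chi-square of the measurements for a trial value a."""
    return sum(((x - a) / s) ** 2 for x, s in zip(values, sigmas))


# The chi-square minimum for this model is the weighted mean.
weights = [1.0 / s**2 for s in sigmas]
a_star = sum(w * x for w, x in zip(weights, values)) / sum(weights)
chisq_min = chisq(a_star)

# Bisection for the deviation that raises chi-square by exactly 1.
lo, hi = 0.0, 5.0  # assumed bracket for the deviation
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if chisq(a_star + mid) - chisq_min < 1.0:
        lo = mid
    else:
        hi = mid
delta = 0.5 * (lo + hi)

# Standard result for a weighted mean, used here only as a check.
expected = 1.0 / math.sqrt(sum(weights))

print(f"a* = {a_star:.3f}")
print(f"deviation for unit increase = {delta:.4f}")
print(f"1/sqrt(sum of weights)      = {expected:.4f}")
```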
In a case where the fit to the measurements is poor, perhaps because an inappropriate distribution function was used, the likelihood will have a value much smaller than expected. In that case, the estimates of uncertainty obtained from (4.6-4.8) should not be used. Instead, the proper conclusion is that the model used is inappropriate because it does not provide an adequate fit to the observations. Erroneously small estimates of uncertainty limits sometimes arise from using (4.6-4.8) when the fit is poor.