
3.3 The binomial distribution

Suppose that in a given trial there is a probability p that a particular event will occur, and therefore a probability (1-p) that it will not. In a set of N trials, what is the probability that the event will occur exactly n times? For example, the trials might be coin tosses where the event in question is "heads", with probability p=1/2. A first guess might be $\phi(n) = p^n (1-p)^{N-n}$, and this would be correct if the order of events were specified. However, if the order is not specified, many different sequences lead to the same final number of events, and the probability must account for the number of ways in which that number can be reached. The correction factor, which counts these sequences, is called the binomial coefficient, and the resulting probability distribution is the binomial distribution function:
\begin{displaymath}\Phi_B(n;p,N) = \Bigl({{N}\atop{n}}\Bigr) p^n (1-p)^{N-n}\end{displaymath} (3.3)
 

where

\begin{displaymath}\Bigl({{N}\atop{n}}\Bigr) = {{N!}\over{n!(N-n)!}} . \end{displaymath} (3.4)
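As a minimal sketch of how (3.3) and (3.4) can be evaluated, the following Python fragment computes the distribution for the coin-toss example. The function name binomial_pmf and the choice N = 4 are illustrative assumptions, not part of the original notes; math.comb supplies the binomial coefficient (3.4).

```python
from math import comb

def binomial_pmf(n, p, N):
    """Probability of exactly n events in N independent trials, Eq. (3.3)."""
    # comb(N, n) is the binomial coefficient N! / (n! (N-n)!), Eq. (3.4)
    return comb(N, n) * p**n * (1 - p)**(N - n)

# Fair coin (p = 1/2), N = 4 tosses: probability of n = 0..4 heads
probs = [binomial_pmf(n, 0.5, 4) for n in range(5)]
print(probs)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(probs))  # 1.0, since the probabilities over all n sum to one
```

For comparison, the first guess $p^n (1-p)^{N-n}$ gives 0.0625 for every n in this example; the binomial coefficient supplies the factors 1, 4, 6, 4, 1 that count the distinct orderings.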
 

Figure 3.3 shows examples for N = 30 trials and probabilities p = 0.4 and p = 0.2.


 
Figure 3.3: The binomial distribution functions (3.3) [heavier lines] for p=0.4 and p=0.2, both with N=30. For comparison, the Gaussian distribution functions having the same mean and standard deviation are also plotted as the thinner smooth lines. Values for the binomial distribution function are shown only for integer numbers of events, with adjacent values connected by straight lines.

The mean of the distribution is given by $\mu=pN$, as can be demonstrated by summing $n\,\Phi_B(n;p,N)$ over all n from 0 to N (the discrete counterpart of integrating over a continuous probability distribution function). The variance is given by

\begin{displaymath}\sigma^2 = N p (1-p) . \end{displaymath} (3.5)
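The expressions for the mean and for the variance (3.5) can be checked numerically by summing over the discrete distribution. The sketch below does this for the N = 30, p = 0.4 case of Figure 3.3; the variable names are illustrative assumptions.

```python
from math import comb

N, p = 30, 0.4

# Binomial probabilities for n = 0..N, Eq. (3.3)
pmf = [comb(N, n) * p**n * (1 - p)**(N - n) for n in range(N + 1)]

# Moments obtained by direct summation over the discrete values of n
mean = sum(n * q for n, q in enumerate(pmf))
variance = sum((n - mean)**2 * q for n, q in enumerate(pmf))

print(mean, N * p)                # both close to 12
print(variance, N * p * (1 - p))  # both close to 7.2
```

The sums agree with pN and Np(1-p) to within floating-point rounding.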
 

Figure 3.3 also shows a comparison between Gaussian and binomial distribution functions having the same mean and standard deviation. For these conditions, the distribution functions are almost indistinguishable.
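One way to quantify this comparison is to evaluate both functions at the integer values of n and record the largest difference. The sketch below assumes the standard form of the Gaussian probability density with mean $\mu=pN$ and variance (3.5); the helper names gaussian_pdf and max_difference are hypothetical.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(n, p, N):
    """Binomial distribution function, Eq. (3.3)."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2 * pi))

def max_difference(p, N):
    """Largest absolute difference between the two functions at integer n."""
    mu = N * p
    sigma = sqrt(N * p * (1 - p))
    return max(abs(binomial_pmf(n, p, N) - gaussian_pdf(n, mu, sigma))
               for n in range(N + 1))

# The two cases plotted in Figure 3.3
for p in (0.4, 0.2):
    print(p, max_difference(p, 30))
```

The differences are small compared with the peak values of the distributions. The p = 0.2 case shows the larger discrepancy, as would be expected because that binomial distribution is more skewed.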

The binomial distribution characterizes the probability of discrete events, while the Gaussian distribution describes the probability of a continuously varying result. Both distributions describe events that are independent. Violation of this assumption is a common source of error. For example, if in 100 days rain is observed on 40 days, one might erroneously estimate that the standard deviation in the number of rain events is $\sqrt{(100)(0.4)(0.6)}\approx 5$. Because rain events in most locations are highly correlated from day to day, rain events cannot be treated as independent, and this use of the binomial distribution would usually underestimate the true variability.
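The effect of day-to-day correlation can be illustrated with a small simulation. The sketch below is not based on real data: it assumes a hypothetical two-state (wet/dry) Markov chain in which the probability of rain following a rainy day is 0.8, with the probability of rain following a dry day chosen so that the long-run rain frequency stays at 0.4. The function name and all parameter values are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import pstdev

def count_rain_days(rng, n_days, p_rain, persistence):
    """Count rainy days in one period from a two-state Markov chain.

    persistence is the assumed probability of rain given rain yesterday;
    setting persistence equal to p_rain reproduces independent days.
    """
    p_after_rain = persistence
    # Chosen so that the long-run fraction of rainy days equals p_rain
    p_after_dry = p_rain * (1 - persistence) / (1 - p_rain)
    wet = rng.random() < p_rain
    total = int(wet)
    for _ in range(n_days - 1):
        wet = rng.random() < (p_after_rain if wet else p_after_dry)
        total += wet
    return total

rng = random.Random(0)  # fixed seed so the run is repeatable
n_days, p_rain, n_periods = 100, 0.4, 2000

independent = [count_rain_days(rng, n_days, p_rain, p_rain)
               for _ in range(n_periods)]
correlated = [count_rain_days(rng, n_days, p_rain, 0.8)
              for _ in range(n_periods)]

binomial_sd = sqrt(n_days * p_rain * (1 - p_rain))
print("binomial estimate:", binomial_sd)
print("independent days: ", pstdev(independent))
print("correlated days:  ", pstdev(correlated))
```

Under these assumptions the simulated standard deviation for independent days should lie close to the binomial estimate, while the value for correlated days should be substantially larger, consistent with the underestimate described above.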




 
NCAR Advanced Study Program
http://www.asp.ucar.edu