I just told you that the method I described produces ``maximum-likelihood estimates'' for the allele frequencies, but I haven't told you what a maximum-likelihood estimate is. The good news is that you've been using maximum-likelihood estimates for as long as you've been estimating anything, without even knowing it. Although it will take me a while to explain it, the idea is actually pretty simple.
Suppose we had a sock drawer with two colors of socks, red and
green. And suppose we were interested in estimating the proportion of
red socks in the drawer. One way of approaching the problem would be
to mix the socks well, close our eyes, take one sock from the drawer,
record its color and replace it. Suppose we do this $N$ times. We know
that the number of red socks we'll get might be different the next
time, so the number of red socks we get is a random variable. Let's
call it $K$. Now suppose in our actual experiment we find $k$ red
socks, i.e., $K = k$. If we knew $p$, the proportion of red socks in the
drawer, we could calculate the probability of getting the data we
observed, namely

$$P(K = k \mid p) = \binom{N}{k} p^k (1-p)^{N-k} \qquad (4)$$

Of course we don't know $p$, so what good does
writing (4) do? Well, suppose we reverse the question
to which equation (4) is an answer and call the
expression in (4) the ``likelihood of the data.''
Suppose further that we find the value of $p$, call it $\hat{p}$,
that makes the likelihood bigger than it is for any other value
we could pick.$^{12}$ Then $\hat{p}$ is the maximum-likelihood
estimate of $p$.$^{13}$
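
To make the idea concrete, here is a minimal Python sketch of the sock experiment. It assumes the likelihood is the binomial probability of the observed number of red socks, and it simply tries many candidate proportions and keeps the one with the largest likelihood. The function names and the counts (7 red socks in 20 draws) are invented for illustration only.

```python
from math import comb


def binomial_likelihood(p, k, n):
    """Probability of k red socks in n draws with replacement,
    if the true proportion of red socks is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def grid_mle(k, n, steps=1000):
    """Try evenly spaced values of p in [0, 1] and return the one
    that gives the observed data the largest likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: binomial_likelihood(p, k, n))


# Hypothetical experiment: 7 red socks in 20 draws.
k_obs, n_draws = 7, 20
p_hat = grid_mle(k_obs, n_draws)
print(p_hat, k_obs / n_draws)  # the two values should agree
```

The grid search lands on the observed fraction of red socks, which is the estimate you would have used anyway. That is the sense in which you have been using maximum-likelihood estimates all along.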
In the case of the ABO blood group that we just talked about, the
likelihood is a bit more complicated
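
$$\binom{N}{N_A\; N_{AB}\; N_B\; N_O} \left(p_A^2 + 2 p_A p_O\right)^{N_A} \left(2 p_A p_B\right)^{N_{AB}} \left(p_B^2 + 2 p_B p_O\right)^{N_B} \left(p_O^2\right)^{N_O} \qquad (5)$$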
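
The same recipe can be sketched for the ABO case. The code below is only an illustration under stated assumptions: phenotype frequencies follow Hardy-Weinberg proportions, the likelihood is multinomial in the four phenotype counts, and a coarse grid search stands in for whatever estimation method is actually used. The function names and the phenotype counts are hypothetical, not data from the text.

```python
from math import lgamma, log


def abo_log_likelihood(p_a, p_b, counts):
    """Log-likelihood of ABO phenotype counts, assuming Hardy-Weinberg
    proportions. counts maps 'A', 'AB', 'B', 'O' to observed numbers."""
    p_o = 1.0 - p_a - p_b
    freqs = {
        "A": p_a**2 + 2 * p_a * p_o,
        "AB": 2 * p_a * p_b,
        "B": p_b**2 + 2 * p_b * p_o,
        "O": p_o**2,
    }
    n = sum(counts.values())
    # Log of the multinomial coefficient N! / (N_A! N_AB! N_B! N_O!).
    log_coef = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts.values())
    return log_coef + sum(counts[g] * log(freqs[g]) for g in counts)


def grid_search(counts, steps=100):
    """Return (log-likelihood, p_a, p_b) for the best point on a grid
    where all three allele frequencies are strictly positive."""
    best = None
    for i in range(1, steps):
        for j in range(1, steps - i):
            p_a, p_b = i / steps, j / steps
            ll = abo_log_likelihood(p_a, p_b, counts)
            if best is None or ll > best[0]:
                best = (ll, p_a, p_b)
    return best


# Hypothetical phenotype counts, made up for illustration.
counts = {"A": 40, "AB": 5, "B": 15, "O": 40}
best_ll, p_a_hat, p_b_hat = grid_search(counts)
print(p_a_hat, p_b_hat, 1.0 - p_a_hat - p_b_hat)
```

Working with the log of the likelihood does not change where the maximum is, and it avoids multiplying many very small numbers together.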