EM Algorithm for ABO frequencies






Explanation

The EM algorithm is a method of calculating maximum-likelihood estimates through a two-step iteration: Expectation and Maximization. In this application we start with a guess for the underlying allele frequencies (a, b, and o). If those were the correct frequencies, then, given the observed numbers of individuals with each blood type (N_A, N_B, N_AB, and N_O), the expected number of individuals with each genotype is:

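AA: (a^2 / (a^2 + 2ao)) N_A
AO: (2ao / (a^2 + 2ao)) N_A
BB: (b^2 / (b^2 + 2bo)) N_B
BO: (2bo / (b^2 + 2bo)) N_B
AB: N_AB (every type AB individual has genotype AB)
OO: N_O (every type O individual has genotype OO)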
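
The expected counts above can be sketched in a few lines of Python. This is only an illustration: the function name, the starting frequencies, and the phenotype counts are hypothetical, and the sketch assumes that the AB and OO genotype counts are simply the observed numbers of type AB and type O individuals, since each of those blood types corresponds to a single genotype.

```python
def expected_genotype_counts(a, b, o, n_a, n_b, n_ab, n_o):
    """E step: split observed blood-type counts into expected genotype counts."""
    # Type A individuals are AA or AO, in proportion a^2 : 2ao.
    denom_a = a * a + 2 * a * o
    # Type B individuals are BB or BO, in proportion b^2 : 2bo.
    denom_b = b * b + 2 * b * o
    return {
        "AA": n_a * a * a / denom_a,
        "AO": n_a * 2 * a * o / denom_a,
        "BB": n_b * b * b / denom_b,
        "BO": n_b * 2 * b * o / denom_b,
        # Assumption: types AB and O each have one possible genotype,
        # so their counts are used as observed.
        "AB": float(n_ab),
        "OO": float(n_o),
    }


# Hypothetical guess (a, b, o) and hypothetical blood-type counts.
counts = expected_genotype_counts(0.3, 0.1, 0.6, n_a=100, n_b=50, n_ab=20, n_o=130)
for genotype, expected in counts.items():
    print(f"{genotype}: {expected:.2f}")
```

With these made-up numbers, a^2 : 2ao is 0.09 : 0.36, so one fifth of the 100 type A individuals are expected to be AA and the rest AO.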

This is the Expectation stage. Given those expected numbers, new guesses for the allele frequencies can be calculated from the maximum-likelihood estimates associated with them, i.e.,

a = (2AA + AB + AO) / (2(AA + AO + BB + BO + AB + OO))
b = (2BB + AB + BO) / (2(AA + AO + BB + BO + AB + OO))
o = (2OO + AO + BO) / (2(AA + AO + BB + BO + AB + OO))
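
As a rough illustration, the three formulas above amount to counting alleles: each individual carries two, so the denominator is twice the number of individuals. The function name and the genotype counts below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def allele_frequencies(g):
    """M step: count alleles in a dict of (expected) genotype counts."""
    total_alleles = 2 * sum(g.values())  # two alleles per individual
    a = (2 * g["AA"] + g["AB"] + g["AO"]) / total_alleles
    b = (2 * g["BB"] + g["AB"] + g["BO"]) / total_alleles
    o = (2 * g["OO"] + g["AO"] + g["BO"]) / total_alleles
    return a, b, o


# Hypothetical genotype counts for 300 individuals.
genotypes = {"AA": 20, "AO": 80, "BB": 5, "BO": 45, "AB": 20, "OO": 130}
a, b, o = allele_frequencies(genotypes)
print(a, b, o)
```

Because every allele is counted exactly once across the three numerators, the new values of a, b, and o always sum to 1.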

This is the Maximization stage. The estimates of a, b, and o obtained from this stage are used for another round of Expectation and Maximization, and the process is repeated until the frequencies no longer change from one round to the next (in practice, until the change is smaller than some small tolerance).
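
The whole iteration can be sketched as a short loop. This is a minimal sketch, not the code behind this page: the function name, the equal starting frequencies, the tolerance, the iteration cap, and the blood-type counts are all assumptions made for illustration. It also takes the AB and OO genotype counts to be the observed numbers of type AB and type O individuals, and it assumes every frequency stays above zero so that no denominator vanishes.

```python
def em_abo(n_a, n_b, n_ab, n_o, start=(1 / 3, 1 / 3, 1 / 3),
           tol=1e-10, max_iter=1000):
    """Alternate Expectation and Maximization until a, b, o stop changing."""
    a, b, o = start
    n = n_a + n_b + n_ab + n_o  # total number of individuals
    for iteration in range(1, max_iter + 1):
        # Expectation: split type A and type B counts into genotypes.
        aa = n_a * a * a / (a * a + 2 * a * o)
        ao = n_a - aa
        bb = n_b * b * b / (b * b + 2 * b * o)
        bo = n_b - bb
        # Maximization: count alleles (AB = n_ab, OO = n_o by assumption).
        new = ((2 * aa + n_ab + ao) / (2 * n),
               (2 * bb + n_ab + bo) / (2 * n),
               (2 * n_o + ao + bo) / (2 * n))
        # Largest change in any frequency since the previous round.
        change = max(abs(x - y) for x, y in zip(new, (a, b, o)))
        a, b, o = new
        if change < tol:
            break
    return a, b, o, iteration


# Hypothetical blood-type counts: 100 A, 50 B, 20 AB, 130 O.
a, b, o, n_iter = em_abo(100, 50, 20, 130)
print(f"a={a:.4f} b={b:.4f} o={o:.4f} after {n_iter} rounds")
```

Stopping on the largest change in any single frequency is one simple choice; monitoring the change in log-likelihood would be another.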


webmaster@darwin.eeb.uconn.edu
Last modified: Mon Jan 18 17:53:07 EST 1999