EM Algorithm for ABO frequencies




Instructions


Explanation

The EM algorithm is a method of calculating maximum-likelihood estimates by iterating two steps: Expectation and Maximization. In this application we start with a guess for the underlying allele frequencies (a, b, and o). If those were the correct frequencies, then the expected number of individuals with each genotype, given the observed number with each blood type, would be:

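AA: (a^2/(a^2 + 2ao)) N_A
AO: (2ao/(a^2 + 2ao)) N_A
BB: (b^2/(b^2 + 2bo)) N_B
BO: (2bo/(b^2 + 2bo)) N_B
AB: N_AB
OO: N_O

Here N_A, N_B, N_AB, and N_O are the observed numbers of individuals with blood types A, B, AB, and O. Blood types AB and O each correspond to a single genotype, so their observed counts carry over unchanged.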
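The expectation formulas above can be sketched in Python as follows. This is a minimal illustration rather than the applet's own code: the function name `e_step`, its argument names, and the sample counts are all made up for the example. It assumes that the AB and O blood types each correspond to a single genotype, so their observed counts are passed through unchanged.

```python
def e_step(a, b, o, n_a, n_b, n_ab, n_o):
    """Expected genotype counts, given allele frequencies and phenotype counts."""
    # Probability that an individual with blood type A has genotype AA
    p_aa = a * a / (a * a + 2 * a * o)
    # Probability that an individual with blood type B has genotype BB
    p_bb = b * b / (b * b + 2 * b * o)
    return {
        "AA": p_aa * n_a,
        "AO": (1 - p_aa) * n_a,
        "BB": p_bb * n_b,
        "BO": (1 - p_bb) * n_b,
        "AB": float(n_ab),  # AB phenotype has only one possible genotype
        "OO": float(n_o),   # O phenotype has only one possible genotype
    }

# Hypothetical phenotype counts and a starting guess, for illustration only
counts = e_step(0.3, 0.1, 0.6, n_a=100, n_b=50, n_ab=20, n_o=130)
for genotype, expected in counts.items():
    print(genotype, round(expected, 3))
```

With this guess, a^2 = 0.09 and 2ao = 0.36, so about one fifth of the 100 type-A individuals (roughly 20) are expected to be AA and the remaining 80 or so AO.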

This is the Expectation stage. Given those expected genotype counts, new guesses for the allele frequencies can be calculated as their maximum-likelihood estimates, which amount to counting alleles, i.e.,

a = (2AA + AB + AO)/(2(AA + AO + BB + BO + AB + OO))
b = (2BB + AB + BO)/(2(AA + AO + BB + BO + AB + OO))
o = (2OO + AO + BO)/(2(AA + AO + BB + BO + AB + OO))
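As a rough sketch, the allele-counting formulas above translate directly into Python. The function name `m_step` and the dictionary of genotype counts are assumptions made for this example, and the counts themselves are invented.

```python
def m_step(g):
    """New allele frequencies from a dict of (expected) genotype counts."""
    total = sum(g.values())  # number of individuals; each carries two alleles
    a = (2 * g["AA"] + g["AB"] + g["AO"]) / (2 * total)
    b = (2 * g["BB"] + g["AB"] + g["BO"]) / (2 * total)
    o = (2 * g["OO"] + g["AO"] + g["BO"]) / (2 * total)
    return a, b, o

# Hypothetical genotype counts, for illustration only
example = {"AA": 20, "AO": 80, "BB": 4, "BO": 46, "AB": 20, "OO": 130}
a_new, b_new, o_new = m_step(example)
print(a_new, b_new, o_new)
```

Because every allele is counted exactly once across the three numerators, the three estimates always sum to 1.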

This is the Maximization stage. The estimates of a, b, and o obtained from this stage are used for another round of Expectation and Maximization, and the process is repeated until the frequencies stop changing from one round to the next (in practice, until the change is smaller than some small tolerance).
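Putting the two stages together, the whole iteration might look like the sketch below. It is one possible implementation under stated assumptions, not the applet's code: the name `em_abo`, the equal starting frequencies, the tolerance `tol`, and the iteration cap `max_iter` are choices made for the example. It treats the AB and O counts as fully observed genotypes and assumes the frequencies stay positive, so no division by zero occurs.

```python
def em_abo(n_a, n_b, n_ab, n_o, a=1/3, b=1/3, o=1/3, tol=1e-10, max_iter=1000):
    """Estimate ABO allele frequencies from phenotype counts by EM."""
    n = n_a + n_b + n_ab + n_o
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Expectation: split the A and B phenotype counts into genotypes
        aa = n_a * a * a / (a * a + 2 * a * o)
        ao = n_a - aa
        bb = n_b * b * b / (b * b + 2 * b * o)
        bo = n_b - bb
        # Maximization: count alleles among the expected genotypes
        a_new = (2 * aa + n_ab + ao) / (2 * n)
        b_new = (2 * bb + n_ab + bo) / (2 * n)
        o_new = (2 * n_o + ao + bo) / (2 * n)
        # Largest change in any frequency during this round
        change = max(abs(a_new - a), abs(b_new - b), abs(o_new - o))
        a, b, o = a_new, b_new, o_new
        if change < tol:
            break
    return a, b, o, iteration

# Phenotype counts constructed to match a = 0.3, b = 0.1, o = 0.6 exactly
# in a hypothetical sample of 10,000 individuals
a_hat, b_hat, o_hat, rounds = em_abo(n_a=4500, n_b=1300, n_ab=600, n_o=3600)
print(round(a_hat, 6), round(b_hat, 6), round(o_hat, 6), rounds)
```

The sample counts here are not real data. They were built from known frequencies so that the estimates can be checked against the values used to generate them.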




webmaster@darwin.eeb.uconn.edu