Developing genetic markers with next-generation sequencing

As you’ll recall, I gave a very crude overview of RAD-seq and GBS on Thursday. I promised that I’d post a link to a recent paper that provides a good review of these and other techniques. You’ll now find the link on the lecture detail page for last Thursday’s lecture and on the consolidated readings page. You’ll also discover that the paper isn’t so recent. It was published in 2011. One of the hazards of getting old is that 6-year-old papers now seem recent. Heck, papers that are 20 years old sometimes seem recent. In any case, it’s still a good overview, and if you use Scopus or Google Scholar to find out who cited it, you’ll find more recent reviews if you’re interested in what’s happening now.

Inbreeding coefficients and population genomics

We won’t discuss this very recent paper in class, but in case you think that I’m making things more complicated than they need to be, take a look at this abstract (and then read the whole paper, if you’re really interested):

Population structure can be described by genotypic correlation coefficients between groups of individuals, the most basic of which are the pair-wise relatedness coefficients between any two individuals. There are nine pair-wise relatedness coefficients in the most general model, and we show that these can be reduced to seven coefficients for biallelic loci. Although all nine coefficients can be estimated from pedigrees, six coefficients have been beyond empirical reach. We provide a numerical optimization procedure that estimates all seven reduced coefficients from population-genomic data. Simulations show that the procedure is nearly unbiased, even at 3x coverage, and errors in five of the seven coefficients are statistically uncorrelated. The remaining two coefficients have a negative correlation of errors, but their sum provides an unbiased assessment of the overall correlation of heterozygosity between two individuals. Application of these new methods to four populations of the freshwater crustacean Daphnia pulex reveal the occurrence of half-siblings in our samples, as well as a number of identical individuals that are likely obligately asexual clone mates. Statistically significant negative estimates of these pair-wise relatedness coefficients, including inbreeding coefficients that were typically negative, underscore the difficulties that arise when interpreting genotypic correlations as estimations of the probability that alleles are identical by descent.

Ackerman, M.S., P. Johri, K. Spitze, S. Xu, T.G. Doak, K. Young and M. Lynch.  2017.  Estimating Seven Coefficients of Pairwise Relatedness Using Population Genomic Data. Genetics (online early) doi:

Small corrections

This morning I was reviewing the notes I posted yesterday, and I found a couple of small typos. The version that’s posted now fixes the typos, so you have a good excuse for not having downloaded them yet.

New notes: Approximate Bayesian Computation and Population Genomics

I’ve posted the notes for Tuesday’s lecture on Approximate Bayesian Computation and for Thursday’s lecture on population genomics. We won’t have time to do more than scratch the surface of either topic, but we’ll dive into a few examples in enough detail that you should be able to understand papers or seminars where people present this kind of work, and you may be inspired to use some of the techniques in your own research. I’d particularly encourage you to think about ways in which you could use Approximate Bayesian Computation to gain insight into ecological or evolutionary processes that interest you. I’m no expert, but if there were enough interest in the topic, I’d be happy to organize a 1-credit seminar next year in which we dove in and learned more about it together.
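
If you'd like a feel for the basic idea before Tuesday, here is a minimal sketch of the simplest flavor of Approximate Bayesian Computation, rejection sampling. It is a toy example, not anything from the notes: the problem (estimating an allele frequency from a count of allele copies), the uniform prior, the tolerance, and every name in the code (`abc_rejection`, `observed`, `n_alleles`, and so on) are assumptions made up for illustration. The logic is to draw a parameter from the prior, simulate data with it, and keep the draw only if the simulated data land close enough to what was observed.

```python
import random


def abc_rejection(observed, n_alleles, n_draws=20000, tolerance=1, seed=1):
    """Toy ABC rejection sampler for an allele frequency p.

    observed  : number of copies of the focal allele in the sample (hypothetical data)
    n_alleles : total number of allele copies sampled
    tolerance : keep a draw if |simulated - observed| <= tolerance
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        # 1. Draw a candidate allele frequency from a uniform(0, 1) prior.
        p = rng.random()
        # 2. Simulate a sample: each allele copy is the focal allele with probability p.
        simulated = sum(rng.random() < p for _ in range(n_alleles))
        # 3. Keep the candidate only if the simulated count is close to the observed one.
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


# Hypothetical data: 15 copies of the focal allele among 50 sampled.
posterior = abc_rejection(observed=15, n_alleles=50)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean = {posterior_mean:.3f}")
```

The accepted draws approximate the posterior distribution of the allele frequency. In a problem this simple you could write the posterior down directly, so ABC buys you nothing here. It earns its keep when you can simulate data under a model but can't write down the likelihood.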

Notes on AMOVA, Migrate-N, and IMa posted. Project #5 posted too.

I returned from my trip to Dublin Friday evening. Over the past day and a half, I’ve revised notes for this week’s lectures and put together Project #5. We’ll start on Tuesday by answering any questions left over from Nora’s lectures last week on detecting selection on molecular sequences and on evolution in multigene families. You’ll want to be sure that you understand Tajima’s D, and it will be obvious why I say that when you see Project #5.

Once we’ve finished dealing with questions, we’ll move on to discussing how to use molecular data, and especially nucleotide sequence data, to provide insights into the history and demography of populations. The first approach we’ll examine extends Wright’s F-statistics to nucleotide sequence data. If we have time, I’ll introduce some very sophisticated approaches to phylogeography, and we’ll discuss them in detail on Thursday.
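
As a refresher on the Tajima's D mentioned above, here is a minimal sketch of the calculation using the standard formulas from Tajima (1989). It assumes aligned sequences of equal length with no gaps or missing data, and the function name `tajimas_d` and the tiny example alignments are made up for illustration. D compares two estimates of diversity, the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant, and divides their difference by its estimated standard deviation.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (no gaps assumed)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (the variance term is zero for n = 3)")
    length = len(seqs[0])

    # S: number of segregating (polymorphic) sites.
    S = sum(1 for j in range(length) if len({s[j] for s in seqs}) > 1)
    if S == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences between all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two diversity estimates, scaled by its standard deviation.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Every variable site is a singleton: an excess of rare variants, so D is negative.
singletons = ["ATTT", "TATT", "TTAT", "TTTA"]
# Every variable site splits the sample 2:2: an excess of intermediate-frequency
# variants, so D is positive.
balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]
print(tajimas_d(singletons), tajimas_d(balanced))
```

The two toy alignments have the same number of segregating sites, so the sign of D depends only on how the variants are distributed among the sequences.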

Detecting selection on nucleotide sequences, evolution in multigene families

I’ve posted notes for next week’s lectures about detecting selection on nucleotide sequences (which will introduce you to Tajima’s D) and on evolution in multigene families (which will introduce you to orthology, paralogy, and concerted evolution). Remember that Nora will be presenting these lectures, since I’ll be in Dublin at the annual meetings of the Deans and Directors of Graduate Studies (DDoGS) for universities in Universitas 21. Nora knows population genetics very well, and it’s likely she’ll be able to answer any question you have. If for some reason she can’t, keep track of the question. We’ll spend time dealing with any of those questions on the Tuesday when I return.

A history of molecular population genetics

In catching up on my reading today, I noticed that Sònia Casillas and Antonio Barbadilla have a review article in the March 2017 issue of Genetics on the history of molecular population genetics. I haven’t had a chance to read it carefully, but I did look it over quickly, and it appears to give a very nice overview of developments from the era of protein electrophoresis to the present day and population genomics. I’m pasting the abstract below, but I encourage you to follow the link and read the whole thing.

Casillas, S., and A. Barbadilla.  2017.  Molecular population genetics. Genetics 205:1003-1035.  doi:

Molecular population genetics aims to explain genetic variation and molecular evolution from population genetics principles. The field was born 50 years ago with the first measures of genetic variation in allozyme loci, continued with the nucleotide sequencing era, and is currently in the era of population genomics. During this period, molecular population genetics has been revolutionized by progress in data acquisition and theoretical developments. The conceptual elegance of the neutral theory of molecular evolution or the footprint carved by natural selection on the patterns of genetic variation are two examples of the vast number of inspiring findings of population genetics research. Since the inception of the field, Drosophila has been the prominent model species: molecular variation in populations was first described in Drosophila and most of the population genetics hypotheses were tested in Drosophila species. In this review, we describe the main concepts, methods, and landmarks of molecular population genetics, using the Drosophila model as a reference. We describe the different genetic data sets made available by advances in molecular technologies, and the theoretical developments fostered by these data. Finally, we review the results and new insights provided by the population genomics approach, and conclude by enumerating challenges and new lines of inquiry posed by increasingly large population scale sequence data.

Project #4 posted

Just in case you want to get an early start on Project #4, I’ve posted a link to it on the lecture detail page for tomorrow’s lecture.  It’s not due until the Tuesday after spring break, but Nora will be in Texas getting experiments for her postdoc started during spring break, and I’ll be (mostly) unavailable during spring break, so I encourage you to get started on it soon.

First set of molecular evolution notes is posted

We shift gears on Tuesday and start our survey of molecular evolution. We’ll start with a review of the kinds of molecular variation that population geneticists have studied over the last 50 years, then we’ll discuss the neutral theory and some of the ways it’s been modified to fit our increasingly sophisticated understanding of molecular variation. After spring break Nora will talk about some of the ways we can detect selection on nucleotide sequences and about the evolution of multigene families. When I get back we’ll talk about how to use data from molecular markers to make inferences about the recent evolutionary history of individuals and populations, and we’ll finish with a very brief discussion of how the explosion of data that is coming with high-throughput sequencing is changing our approach to understanding evolutionary processes in populations.

Updated notes on selection and drift

I made some last-minute changes to the notes for tomorrow’s lecture, adding a short section on “genetic draft.” You read that right. It’s not a typo. Genetic draft is a thing, and we’ll discuss it briefly tomorrow. There’s still a permission problem with the notes, so follow the link with the PDF label on the lecture detail page, or click here.