Class time and location
|EEB 5348 channel on UConn's Kaltura
Course instructor and TA
|2nd floor Whetten Graduate Center
|Nick Van Gilder
This course is an introduction to the field of population genetics, the branch of evolutionary biology concerned with the genetic structure of populations and how it changes through time. Some of us see population genetics as the core discipline in evolutionary biology since changes in the genetic composition of a population are the basis for all other evolutionary change within lineages.
When you finish this course you should:
- Know the 8 assumptions that underlie the Hardy-Weinberg principle and the evolutionary consequences of violating each of them.
- Be able to estimate and interpret Wright’s F-statistics.
- Be able to use individual assignment to study population structure.
- Be able to recognize the different patterns of natural selection and predict the consequences of each.
- Be able to explain the difference between census and effective population size and to describe how genetic drift interacts with mutation, migration, and natural selection.
- Understand how the coalescent process can be used to provide insight into processes that produced the patterns of genetic variation we see.
- Provide explanations for different rates of molecular evolution in different molecules and at different nucleotide positions.
- Know what statistical tests can be used to detect natural selection and population size changes from samples of nucleotide sequence variation.
- Understand several widely used approaches for understanding evolutionary relationships among populations.
- Know the difference between narrow sense and broad sense heritability.
- Know how to interpret the breeder’s equation.
- Understand the principles underlying genome-wide association studies, including genomic prediction.
There are two aspects of this course that sometimes cause students problems.
- Geneticists think differently from most other biologists (and most other human beings, for that matter). They love monohybrid and dihybrid crosses, linkage, penetrance, dominance, and the like. We population geneticists are even worse. To explain things that you can see (like phenotypic differences among individuals) we introduce abstract concepts (like additive genetic variance) that are pure statistical artifacts that no one can see. By the time you finish this course, you’ll not only have had a good review of basic Mendelian genetics (and even a little bit of molecular genetics), you’ll be familiar with a bunch of new and fairly abstract genetic concepts. Just what you were looking for, right?
- Population genetics involves a fair amount of mathematics, probability theory, and statistics. That’s because we deal with genetic variation in populations, which is measured in terms of gene and genotype frequencies. The phenomena of Mendelian genetics are themselves inherently statistical. So it shouldn’t be surprising that when we apply these principles to a whole population the problems become even more mathematically involved.
That’s the bad news. The good news is that the math we need is (mostly) quite simple: some basic algebra and probability theory. When we need things that are more advanced, I’ll explain them in class. The other good news is that I expect you to have lost any familiarity you once had with genetics, algebra, and probability, so we’ll be doing almost everything from scratch. The last bit of good news is that I’ll try to emphasize how to apply the basic principles of population genetics, not the math involved in deriving those principles.
I’ll be placing particular emphasis on using different computer packages for analysis and interpretation of data encountered in population genetics, and the problems and projects will involve using those packages. The lab exercises will evaluate your ability to use the principles and methods of population genetics, not your ability to derive them.
The course consists of two components:
- The lecture component.
- The lab component.
The two components of the course are tightly integrated. In fact, the grading will be based entirely on weekly lab exercises and on three longer projects that we will also work on in lab. The lecture will introduce the concepts and principles you need to understand and apply population genetics. The lab will provide the “hands-on” experience using real (if somewhat simplified) data sets.
All of the lectures use notes that are available on the Notes page and pages directly linked to individual class periods from the Lecture schedule page. Some of the individual lecture pages will include links to published papers related to the topic being discussed. The links to those papers are there to provide you with additional background material in case you want to delve more deeply into the topic than we have time for in class. In a few cases, I may ask you to read a particular paper ahead of time so that we can discuss it during lecture.
There is a laboratory session scheduled from 9:30am-11:30am every Tuesday in TLS 181. Nick will be leading the laboratory sessions. We aren’t requiring attendance, since we know that some of you are likely to have scheduling conflicts with other classes or teaching responsibilities. We do, however, encourage you to attend when you can. You’ll find it helpful to begin your work on projects together even if you do most of your work independently.
We’ll do as much of our work in the statistical package R as possible, but there will be a few cases where we’ll need to use another program. Since I’m sure that some of you will be using Windoze machines and others will be using Macs, I’ll make sure that any software we use is available for both platforms. The laboratory exercise each week will introduce a small example that illustrates a key principle from lecture. I’ll post the exercises by Monday morning each week, and they are due by 5:00pm Friday of the week in which they are assigned. Nick will do his best to get them back to you by the following Monday.
I emphasize the use of R because it is very portable and very powerful. The interface is quite similar on Mac, Windoze, and Linux, and packages are generally available on all of these platforms. It is also a very flexible general-purpose statistical package. You are likely to use it a lot, even if you never do any work in population genetics again.
|Number of assignments per component
|Points per assignment
Grading in the course is based on your performance on 3 projects and 10 lab exercises. In the first week, the lab will focus on making sure that you have R set up on your computer and that you know how to install packages. Each project may include a small amount of background reading for context, but the lab exercises will be self-contained. When the data sets are derived from published papers, I’ll include a reference and a link to the paper. Sometimes the data will be from simulations, in some cases simulations that I do ahead of time, in others simulations that you’ll do on your own. The assignment will identify a small number of questions, typically two or three, that can be addressed using the data. Your task will be to identify and perform the appropriate analyses and to interpret the results of those analyses in light of the questions posed in the assignment. I will clean and simplify the data before providing it to you so that you can focus on using the principles you’ve learned to answer the questions.
A note about ChatGPT
ChatGPT took the world by storm last November. If you’ve used it at all, you will realize that it is very powerful. You will also discover that if you ask it how to write code in R, it does a pretty good job. I know that I’ll show an example when we get to genetic drift, and I may show some other examples along the way. You are free to use ChatGPT to help you write any R code that you need for the laboratory exercises, but if you do, please take a very careful look at the code it produces. It does a very good job most of the time, but it sometimes fails badly, and we don’t want you to fail.
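To give a sense of the kind of short R script you should be able to read and check line by line, here is a minimal sketch of a Wright-Fisher simulation of genetic drift at one locus with two alleles. It is not the example that will be shown in class, and it was not produced by ChatGPT. The function name `simulate_drift` and the parameter values (`N = 50`, `p0 = 0.5`, 100 generations) are assumptions chosen only for illustration.

```r
# Illustrative sketch only: Wright-Fisher drift at one biallelic locus.
# simulate_drift and its arguments are hypothetical names, not course code.
simulate_drift <- function(N, p0, generations) {
  # p[t] holds the allele frequency in generation t - 1
  p <- numeric(generations + 1)
  p[1] <- p0
  for (t in seq_len(generations)) {
    # A diploid population of N individuals carries 2N gene copies;
    # the next generation is a binomial sample of those copies.
    p[t + 1] <- rbinom(1, size = 2 * N, prob = p[t]) / (2 * N)
  }
  p
}

set.seed(1)  # fixed seed so the run is repeatable
freqs <- simulate_drift(N = 50, p0 = 0.5, generations = 100)

# Plot the allele-frequency trajectory across generations
plot(0:100, freqs, type = "l", ylim = c(0, 1),
     xlab = "Generation", ylab = "Allele frequency")
```

When you check code like this, whatever its source, confirm the details that are easy to get wrong. Here, for example, the binomial sample size should be 2N for a diploid population rather than N, and the returned vector should include the starting frequency.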