Uncommon Ground

Academics, biodiversity, genetics, & evolution

Latest Posts

Sharing a new version of my genetic drift simulation

You may be aware that I wrote a series of applications in RShiny several years ago to illustrate some principles of population genetics. I just finished revising the genetic drift application. If you’ve used it in the past, you’ll know that it would get hung up when you tried to simulate a long time series or a lot of populations. After some digging around, I realized that the problem isn’t with running the simulation or with collecting the results. It’s with converting the results to a form that allows the simulation to unfold over time.

As a result, this version allows you to turn the animation off. Now you can run long time series with lots of populations (where “lots” equals “up to 10”). You won’t see the results played as a movie, but you’ll see them displayed very quickly. As you’ll see from the first link above, all of the source code is available on GitHub. If you find any of these applications useful, you’ll want to take a look at the Google Doc that Katie Lotterhos put together and announced on Twitter last January. It includes screenshots and links to applications written by CJ Battey, Graham Coop, and Chris Muir.
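
For readers who want to see the kind of calculation involved, here is a minimal Python sketch of genetic drift under the Wright-Fisher model with the animation left out. It is not the code from the RShiny application; the function name `simulate_drift` and its parameters are hypothetical, and it assumes a single locus with two alleles and no selection, mutation, or migration.

```python
import random


def simulate_drift(n_diploid, p0, generations, n_populations, seed=None):
    """Return one allele-frequency trajectory per population (Wright-Fisher drift)."""
    rng = random.Random(seed)
    n_alleles = 2 * n_diploid  # a diploid population of size N carries 2N gene copies
    trajectories = []
    for _pop in range(n_populations):
        p = p0
        freqs = [p]
        for _gen in range(generations):
            # Binomial sampling: each of the 2N copies in the next generation
            # is allele A with probability equal to the current frequency p.
            count = sum(1 for _copy in range(n_alleles) if rng.random() < p)
            p = count / n_alleles
            freqs.append(p)
        trajectories.append(freqs)
    return trajectories


if __name__ == "__main__":
    runs = simulate_drift(n_diploid=50, p0=0.5, generations=200,
                          n_populations=10, seed=1)
    for i, freqs in enumerate(runs, start=1):
        print(f"population {i}: final frequency {freqs[-1]:.2f}")
```

The whole set of trajectories is computed first and only summarized afterward, which mirrors the idea of separating the simulation from the (slower) step of preparing results for display.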

Genetic structure and clonal diversity in an important Chinese grass

Since you’re reading this blog, you must know that I don’t have a lot of time for research these days. My duties as Vice Provost for Graduate Education and Dean of The Graduate School at UConn take up most of my time. I do manage to contribute to some research, so long as other people do the real work and I contribute some ideas or some statistical analyses. Here’s another example of that.

Last fall I was asked about the old C++ program Hickory that I had written to facilitate analysis of Wright’s F-statistics with dominant markers. It was never terribly widely used, and it was difficult to maintain. I gave up about 10 years ago. In the meantime, I realized that there’s an easy way to rewrite Hickory using Stan. After being contacted, I finally bit the bullet and did the rewrite in a combination of Stan and R. I even mentioned the R/Stan implementation last September.

Yesterday, we posted a pre-print on bioRxiv that uses the new version of Hickory as one of a variety of analytical methods that provide some insight into the genetic structure of Leymus chinensis. Here’s the abstract and a link.

Genetic structure in patchy populations of a candidate foundation plant: a case study of Leymus chinensis (Poaceae) using genetic and clonal diversity

Jian Guo, Christina L. Richards, Kent E. Holsinger, Gordon A. Fox, Zhuo Zhang, Chan Zhou

doi: https://doi.org/10.1101/2021.06.12.448174

PREMISE The distribution of genetic diversity on the landscape has critical ecological and evolutionary implications. This may be especially the case on a local scale for foundation plant species since they create and define ecological communities, contributing disproportionately to ecosystem function.

METHODS We examined the distribution of genetic diversity and clones, which we defined first as unique multi-locus genotypes (MLG), and then by grouping similar MLGs into multi-locus lineages (MLL). We used 186 markers from inter-simple sequence repeats (ISSR) across 358 ramets from 13 patches of the foundation grass Leymus chinensis. We examined the relationship between genetic and clonal diversities, their variation with patch-size, and the effect of the number of markers used to evaluate genetic diversity and structure in this species.

RESULTS Every ramet had a unique MLG. Almost all patches consisted of individuals belonging to a single MLL. We confirmed this with a clustering algorithm to group related genotypes. The predominance of a single lineage within each patch could be the result of the accumulation of somatic mutations, limited dispersal, some sexual reproduction with partners mainly restricted to the same patch, or a combination of all three.

CONCLUSIONS We found strong genetic structure among patches of L. chinensis. Consistent with previous work on the species, the clustering of similar genotypes within patches suggests that clonal reproduction combined with somatic mutation, limited dispersal, and some degree of sexual reproduction among neighbors causes individuals within a patch to be more closely related than among patches.

The link between traits and performance in Protea

If you’re reading this, you probably know enough about me to know that my students and I have been working on Protea for the last 10-15 years. Today I am pleased to report that the most recent work, from Kristen Nolting’s PhD dissertation, has appeared in Annals of Botany. The advance publication version appeared nearly a year ago, but the paper is officially out in a special issue focusing on intraspecific trait variation in plants. Here’s the abstract and a link.

Intraspecific trait variation influences physiological performance and fitness in the South Africa shrub genus Protea (Proteaceae)

Kristen M Nolting, Rachel Prunier, Guy F Midgley, Kent E Holsinger

Background and Aims

Global plant trait datasets commonly identify trait relationships that are interpreted to reflect fundamental trade-offs associated with plant strategies, but often these trait relationships are not identified when evaluating them at smaller taxonomic and spatial scales. In this study we evaluate trait relationships measured on individual plants for five widespread Protea species in South Africa to determine whether broad-scale patterns of structural trait (e.g. leaf area) and physiological trait (e.g. photosynthetic rates) relationships can be detected within natural populations, and if these traits are themselves related to plant fitness.

Methods

We evaluated the variance structure (i.e. the proportional intraspecific trait variation relative to among-species variation) for nine structural traits and six physiological traits measured in wild populations. We used a multivariate path model to evaluate the relationships between structural traits and physiological traits, and the relationship between these traits and plant size and reproductive effort.

Key Results

While intraspecific trait variation is relatively low for structural traits, it accounts for between 50 and 100 % of the variation in physiological traits. Furthermore, we identified few trait associations between any one structural trait and physiological trait, but multivariate regressions revealed clear associations between combinations of structural traits and physiological performance (R² = 0.37–0.64), and almost all traits had detectable associations with plant fitness.

Conclusions

Intraspecific variation in structural traits leads to predictable differences in individual-level physiological performance in a multivariate framework, even though the relationship of any particular structural trait to physiological performance may be weak or undetectable. Furthermore, intraspecific variation in both structural and physiological traits leads to differences in plant size and fitness. These results demonstrate the importance of considering measurements of multivariate phenotypes on individual plants when evaluating trait relationships and how trait variation influences predictions of ecological and evolutionary outcomes.

Annals of Botany 127:519–531; 2021 https://doi.org/10.1093/aob/mcaa060

UConn Reads 2021 – Truth, Democracy, & Climate Change (the video)

Two weeks ago on the afternoon of March 25th, the UConn Humanities Institute hosted a panel discussion on Truth, Democracy, & Climate Change. Tom Bontly (Philosophy) moderated the panel, which included two distinguished philosophers, Elizabeth Anderson (University of Michigan) and Lee McIntyre (Boston University). For some reason, Tom also asked me to participate. After introductions and brief remarks, we had a lively discussion about why some people do not accept the evidence for human-caused climate change. Elizabeth mentioned, for example, Dan Kahan’s idea that there are “conflict entrepreneurs” who purposely promote disinformation, not because it necessarily promotes the cause they appear to be supporting, but because they profit from the conflict in other ways. If you couldn’t join us on the 25th and you’re interested in learning more, you’re in luck. Like nearly every other seminar, meeting, or panel in the last year, this one was held virtually. It was also recorded, and the edited video (with captions) is now available on YouTube.

A Call to Action: Marshaling Science for Society – AIBS President and Past-Presidents

Greg Anderson, Gene Likens, and I are Past Presidents of the American Institute of Biological Sciences, an organization representing nearly 120 scientific societies, museums, botanical gardens, and universities (including the Department of Ecology & Evolutionary Biology at UConn). We joined Charles Fenster, current President of AIBS, and 20 of our fellow Past Presidents in writing A call to action: marshaling science for society, which appeared in BioScience today. Here are the first and the concluding paragraphs of what we wrote:

As the current and past presidents of the American Institute of Biological Sciences (AIBS), we find the assault by politicians and special interest groups on the use of scientific knowledge to guide public policy decision-making alarming and dangerous. The marginalization of scientific information in decision-making has significant negative effects on our public health and safety, our environmental sustainability, and our general well-being. We need not look further than the disruption and deaths that have resulted in many countries, including the United States, from failing to use scientific evidence in making decisions on how to control the COVID-19 pandemic.

The progress of science over the centuries has led to our deep understanding of natural phenomena. We must find ways to benefit from that understanding as we move into the future. Let us join together to insist on acting logically and rationally in a world so plagued by self-centered short-term goals and the false information that they all too often generate.

If you’d prefer a podcast version, here’s the link: http://bioscience-talks.aibs.org/science-leaders-issue-clarion-call-for-evidence-based-policy. Twenty-four organizations have endorsed the statement. You can see them here: https://aibs.wufoo.com/reports/aibs-past-and-current-presidents-viewpoint/.

I hope you’ll take time to read the call or listen to the podcast. If you belong to a society that has not yet endorsed the viewpoint, I encourage you to ask leaders of your society to sign on.

Causal inference in ecology – An update

Causal inference in ecology – links to the series

A couple of years ago I wrote a series of posts on causal inference in ecology. In it I explored the Rubin causal model and concluded that

the Rubin causal model isn’t likely to help me make causal inferences with the kinds of observational data I collect.

I haven’t changed my mind about that, but I do have an update.

I’ve been reading Regression and other stories, by Andrew Gelman, Jennifer Hill, and Aki Vehtari, which I highly recommend reading if you use regression for any purpose in your research. I just finished Chapter 21, “Additional topics in causal inference”, and the last section, 21.5 “Causes of effects and effects of causes”, is particularly relevant to my earlier conclusion. Not surprisingly, Gelman, Hill, and Vehtari (GHV) have a better way of explaining the role that regression can play in generating hypotheses than I did. You’ll need to read the chapters on causal inference (or be familiar with the Rubin causal model) to fully appreciate their insight, but here it is in a nutshell.

We can make inferences about the effect of a cause when we (a) identify an intervention (a cause) that may have an effect and (b) randomize the intervention across experimental units (or do something that mimics random assignment by balancing on potential pre-observation confounders or by using an instrumental variable, a regression discontinuity, or difference-in-differences approach). Thought about in this way, the purpose of statistical analysis is to estimate the magnitude of an effect.
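
A small simulation may make the “effect of a cause” idea concrete. The sketch below is only an illustration with made-up numbers: the function name, the baseline distribution, and the constant additive effect are all assumptions, not anything taken from GHV. It assigns an intervention at random to half of the units and estimates the magnitude of the effect as the difference between group means.

```python
import random
from statistics import mean


def randomized_effect_estimate(n_units, true_effect, seed=None):
    """Estimate an effect as the treated-minus-control difference in means."""
    rng = random.Random(seed)
    # Each unit has a baseline outcome that varies for reasons we do not observe.
    baselines = [rng.gauss(10.0, 1.0) for _ in range(n_units)]
    # Random assignment: half the units receive the intervention.
    n_treated = n_units // 2
    treated = [True] * n_treated + [False] * (n_units - n_treated)
    rng.shuffle(treated)
    # Assumed for illustration: the intervention adds a constant amount.
    outcomes = [b + true_effect if t else b for b, t in zip(baselines, treated)]
    treated_mean = mean(y for y, t in zip(outcomes, treated) if t)
    control_mean = mean(y for y, t in zip(outcomes, treated) if not t)
    return treated_mean - control_mean


if __name__ == "__main__":
    estimate = randomized_effect_estimate(n_units=2000, true_effect=2.0, seed=1)
    print(f"estimated effect: {estimate:.2f} (true effect 2.0)")
```

Because assignment is random, the unobserved baseline differences average out between the two groups, so the difference in means recovers the effect up to sampling error.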

The regression analyses I typically do can be cast as an attempt to make inferences about the cause of an effect.1 Here’s where GHV have a better way of thinking about it than I did. Let’s suppose that I’m interested in environmental features that influence stomatal density, the example I discussed on 11 June 2018. I illustrate there that three principal components describing aspects of the environment show strong associations with stomatal density. GHV remind us that some other variable (or set of variables) could cause the observed differences in stomatal density and that once we’ve taken that variable into account, none of the PCs would show an association with stomatal density.2 More importantly, they point out that the association suggests causal hypotheses that could account for the association. To the extent that it’s important to us to dissect those causes, we can then do new experiments or make new observations (using Rubin’s causal model as a framework if we’re going to make causal inferences from an observational study) structured to estimate the effects those hypotheses suggest.

  1. I wrote “can be cast as an attempt”, because I do my damnedest to make it clear that I’m only asserting that certain variables have stronger associations with the outcome I’m studying than others, not that those variables cause the outcome.
  2. Fortunately (for me), that’s consistent with what I wrote two years ago.
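
The point about an unmeasured variable can be illustrated with a toy simulation. Everything in it is hypothetical: `z` stands in for some unmeasured variable, `x` for an environmental PC, and `y` for stomatal density, with effect sizes chosen only for illustration. Here `x` has no effect on `y` at all, yet the two are clearly associated until `z` is taken into account through a partial correlation.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """Correlation between x and y after accounting for z."""
    r_xy, r_xz, r_yz = correlation(x, y), correlation(x, z), correlation(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_confounded(n, seed=None):
    """Simulate x and y that share a cause z but do not affect each other."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [zi + rng.gauss(0.0, 1.0) for zi in z]  # z drives x
    y = [zi + rng.gauss(0.0, 1.0) for zi in z]  # z drives y; x plays no role
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate_confounded(5000, seed=1)
    print(f"correlation of x and y:       {correlation(x, y):.2f}")
    print(f"after taking z into account:  {partial_correlation(x, y, z):.2f}")
```

The raw association is real and worth explaining, which is the sense in which it suggests causal hypotheses; it just does not establish that `x` is the cause.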

An R-Stan implementation of Bayesian inference for Wright’s F-statistics

Some of you know that many years ago Paul Lewis, Dipak Dey, and I wrote a paper describing a Bayesian approach to inferring population structure from dominant markers.1 You may also know that Paul Lewis and I wrote a Windoze program in C++, Hickory, that implemented the approach. We later extended Hickory for analysis of co-dominant markers. Later still, Feng Guo, Dipak, and I wrote another paper describing a Bayesian approach to (a) estimating population- and locus-specific effects on FST and (b) identifying loci where the posterior distribution of FST is markedly different from the overall estimate.2 If you know that (and maybe even if you don’t know all of that), you also know that Paul and I stopped maintaining Hickory a number of years ago. I moved from Windoze to Mac, and the library we were using to support the graphical user interface became too complicated for me to keep up with.

I’ve had a few requests from people who were interested in using Hickory, but I just haven’t had the time to find a way to support them – until now.

Over the past several years, I’ve been using Stan for many different statistical analyses. When I received another request for Hickory a couple of weeks ago, I realized that I could pretty easily develop a new version of Hickory in R/Stan. This approach has several advantages over the standalone C++ code in the original Hickory.

  1. I don’t have to worry about writing the MCMC sampler myself. I use the very sophisticated Hamiltonian Monte Carlo in Stan. Not only do I avoid the bother of writing my own sampler, I have much greater confidence that the sampler is performing correctly. It’s written and maintained by experts, and the convergence diagnostics are far more sophisticated than for Metropolis-Hastings.
  2. It should be readily portable to any platform on which R is supported. The only requirement, for now, is that you have a C++ compiler installed. If you’re running a Mac, you may need to download Xcode. If you’re running Linux, you should be all set. If you’re running Windows, you can download Rtools from CRAN. I intend to submit the R package I’ve written to CRAN once I’ve tested it more thoroughly and provided some extensions to the crude functionality currently available. Once it’s on CRAN, you won’t even need a C++ compiler.
  3. I can develop an interface to adegenet and other R packages used for analysis and manipulation of genetic data so that Hickory can use data in many different formats supported by other packages.

A very early release of Hickory is available at GitHub. You should find all of the information you need to install and use it there. Let me know if you run into problems. I’ll do my best to walk you through them (and probably correct some errors I’ve made or at least improve the meager documentation in the process).
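
For readers unfamiliar with what such a model describes, here is a small Python sketch of two textbook relationships behind F-statistics with dominant markers. It is not taken from Hickory, and Hickory’s Stan model may well be parameterized differently. The sketch assumes that allele frequencies vary among populations according to a beta distribution with mean `pi` and that `theta` plays the role of FST, so that the variance among populations is `theta * pi * (1 - pi)`. It also assumes that the band-absent phenotype is the recessive homozygote, whose expected frequency depends on the within-population inbreeding coefficient `f`. All function names are hypothetical.

```python
import random
from statistics import mean, pvariance


def beta_parameters(pi, theta):
    """Beta shape parameters giving mean pi and variance theta * pi * (1 - pi)."""
    scale = (1.0 - theta) / theta
    return pi * scale, (1.0 - pi) * scale


def simulate_population_frequencies(pi, theta, n_populations, seed=None):
    """Draw one allele frequency per population from the assumed beta model."""
    rng = random.Random(seed)
    a, b = beta_parameters(pi, theta)
    return [rng.betavariate(a, b) for _ in range(n_populations)]


def fst_from_frequencies(freqs):
    """Variance in allele frequency scaled by its maximum, pbar * (1 - pbar)."""
    p_bar = mean(freqs)
    return pvariance(freqs) / (p_bar * (1.0 - p_bar))


def band_absent_frequency(p, f):
    """Expected frequency of the recessive (band-absent) phenotype.

    p is the frequency of the dominant (band-present) allele and f is the
    within-population inbreeding coefficient.
    """
    q = 1.0 - p
    return q * q * (1.0 - f) + f * q


if __name__ == "__main__":
    freqs = simulate_population_frequencies(pi=0.3, theta=0.1,
                                            n_populations=20000, seed=1)
    print(f"FST recovered from simulated frequencies: {fst_from_frequencies(freqs):.3f}")
    print(f"band-absent frequency at p = 0.5, f = 0.2: {band_absent_frequency(0.5, 0.2):.3f}")
```

With dominant markers only the band-present and band-absent phenotypes are observed, so `f` and the allele frequency are hard to separate from phenotype counts alone, which is one reason a full Bayesian treatment is attractive.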

  1. Holsinger, K. E., Lewis, P. O., and Dey, D. K. 2002. A Bayesian approach to inferring population structure from dominant markers. Molecular Ecology 11:1157–1164.
  2. Guo, F., Dey, D. K., and Holsinger, K. E. 2009. A Bayesian hierarchical model for analysis of SNP diversity in multilocus, multipopulation samples. Journal of the American Statistical Association 104:142–154. http://doi.org/10.1198/jasa.2009.0010

Making accessible HTML from LaTeX sources — an additional experiment

Last week I reported on my initial experiments using Pandoc and LaTeXML to convert LaTeX to HTML. Here are links to the PDF produced with pdfLaTeX and the HTML:

If you’re like me, you’ll prefer the LaTeXML version to the Pandoc version, but as I pointed out the LaTeXML version includes CSS to customize the styling and the Pandoc version doesn’t. I did a quick Google search, figured out how to add CSS (and a table of contents) to the HTML output from Pandoc, and found a very nice CSS style to use (from Pascal Hertleif on GitHub). It’s possible that I’ll fiddle with Pascal’s CSS a bit, but there’s a good chance I won’t change it at all. It makes the HTML look really, really nice:

What I haven’t tried yet is converting LaTeX source that includes PDF figures. Let’s try that now and see how it works.

It took a while to get ImageMagick installed and to write a short Perl script that changes all of the references to EPS files into references to PNG files and converts the EPSs to PNGs, but I really like the results. This gets two of my three “to-dos” out of the way.

 

  • Check CSS styling for Pandoc.
  • Show the results to an accessibility expert at UConn and get some feedback on the different approaches.
  • See what happens with figures when they’re included in a LaTeX document.

Now I just (just?) need to check with an accessibility expert to confirm that the HTML is accessible. If it is, I’m all set.

By the way, if you’re interested in seeing the Perl script, let me know. It will be posted in the GitHub archive where I post the LaTeX source for my notes later this fall, but I’d be happy to send you a copy now if you drop me a line.
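
In the meantime, the two steps described above can be sketched in Python. This is not the Perl script, only a rough illustration of the same idea. The function names are hypothetical, the regular expression assumes that figures are included with `\includegraphics` and an explicit `.eps` extension, and the conversion step assumes that ImageMagick is on the PATH as either `magick` or `convert`.

```python
import re
import shutil
import subprocess
from pathlib import Path

# Matches \includegraphics{...}, with or without [options], when the file ends in .eps
EPS_REFERENCE = re.compile(r"(\\includegraphics(?:\[[^\]]*\])?\{[^}]*?)\.eps\}")


def rewrite_eps_references(latex_source):
    """Return the LaTeX source with .eps figure references changed to .png."""
    return EPS_REFERENCE.sub(r"\1.png}", latex_source)


def convert_eps_to_png(directory, density=300):
    """Convert every .eps file in a directory to .png using ImageMagick."""
    tool = shutil.which("magick") or shutil.which("convert")
    if tool is None:
        raise RuntimeError("ImageMagick was not found on the PATH")
    for eps in sorted(Path(directory).glob("*.eps")):
        png = eps.with_suffix(".png")
        # -density sets the resolution at which the EPS is rasterized
        subprocess.run([tool, "-density", str(density), str(eps), str(png)],
                       check=True)


if __name__ == "__main__":
    sample = r"\includegraphics[width=3in]{figs/drift.eps}"
    print(rewrite_eps_references(sample))
```

References written without an extension, such as `\includegraphics{figs/drift}`, are left untouched by this pattern and would need separate handling.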

Making accessible HTML from LaTeX sources – some initial impressions

Some of you know that I’ve been making notes from my graduate course in Population Genetics available online for nearly 20 years (http://darwin.eeb.uconn.edu/uncommon-ground/eeb348/notes/). What a smaller number of you know is that I use LaTeX to write my notes and pdfLaTeX to produce PDFs from the LaTeX source. So far as I can tell (using ANDI), the PDFs produced in this way provide some elements that aid accessibility, but I am exploring options to produce HTML from the same source that might produce documents that are accessible to more readers. For my first experiment, I used the LaTeX file from 2019 that produced notes on resemblance among relatives. Here are links to three versions of the notes:

Both approaches to producing HTML are straightforward.

For Pandoc:

pandoc --standalone --mathjax -o quant-resemblance-pandoc.html quant-resemblance.tex

For LaTeXML:

latexml --includestyles --dest=quant-resemblance.xml quant-resemblance.tex
latexmlpost --dest=quant-resemblance-latexml.html quant-resemblance.xml

With the default options, I like the look of the LaTeXML version better, but it also includes CSS customizations and the Pandoc version doesn’t. It’s probably possible to include customized CSS with Pandoc, but I haven’t had a chance to investigate that yet. I also haven’t had a chance to consult anyone who knows how to judge accessibility of documents. When I’ve had a chance to do that, I’ll return with a report. (Don’t hold your breath. I am a dean, so I don’t have a lot of time on my hands.)

Here’s my to-do list, so that I don’t forget:

  • Check CSS styling for Pandoc.
  • Show the results to an accessibility expert at UConn and get some feedback on the different approaches.
  • See what happens with figures when they’re included in a LaTeX document.

If you have additional questions, let me know, and I’ll add them to the list.

Congratulations to recipients of graduate degrees from @UConn

On Saturday, the University of Connecticut celebrated a virtual Commencement. As part of the Commencement celebration, The Graduate School published a Commencement page containing video messages from me and from Stephany Santos, a PhD candidate in Biomedical Engineering. There’s also a complete list of master’s and doctoral degree recipients from August 2019, December 2019, and May 2020, including 9 PhD recipients from my home department, Ecology & Evolutionary Biology.

  • Annette Evans
  • Kaitlin Ann Gallagher
  • Christopher Nadeau
  • Kristen Nolting
  • Nasim Rahmatpour
  • Anna Rose Sjodin
  • Lauren Stanley
  • Katherine Taylor
  • Tanisha Marie Williams

I highlighted Kristen and Tanisha, because they happen to be my students. Tanisha is the Burpee Postdoctoral Fellow at Bucknell University (working with another UConn EEB alum, Chris Martine), and Kristen will begin as a postdoctoral research associate at the University of Georgia later this month (working with Lisa Donovan and John Burke).

On the off chance that you’re interested in seeing what I had to say and you don’t want to click through the Commencement page link, I’ve embedded my message below.

