Uncommon Ground

Academics, biodiversity, genetics, & evolution

Latest Posts

How to think about replication

Caroline Tucker reviewed a paper by Nathan Lemoine and colleagues in late September and reminded us that inferring anything from small, noisy samples is problematic.1 Ramin Skibba now describes a good example of the problems that can arise.

In 1988, Fritz Strack and his colleagues reported that people found cartoons funnier if they held a pen in their teeth than if they held it between their lips. Why? Because holding a pen between your teeth causes you to smile, while holding one between your lips causes you to pout. This report spawned a series of studies on the “facial feedback hypothesis”, the hypothesis that facial expressions influence our emotional states. It seems plausible enough, and I know that I’ve read advice along this line in various places even though I’d never heard of the “facial feedback hypothesis” until I read Skibba’s article.

Unfortunately, the hypothesis wasn’t supported in what sounds like a pretty definitive study: 17 experiments and 1900 experimental subjects. Sixteen of the studies had large enough samples to be confident that the failure to detect an effect wasn’t a result of small sample size. Strack disagrees. He argues that (a) using a video camera to record participants may have made them self-conscious and suppressed their responses and (b) the cartoons were too old or too unfamiliar to participants to evoke an appropriate response.

Let’s take Strack at his word. Let’s assume he’s right on both counts. How important can facial feedback to our emotions be if being recorded by a video camera or being shown the wrong cartoons causes the effect to disappear (or at least to become undetectable)? I don’t doubt that Strack detected the effect in his sample, but the attempt to replicate his results suggests that the effect is either very sensitive to context or very weak.

I haven’t gone back to Strack’s paper to check on the original sample size, but the problem here is precisely what you’d expect to encounter if the original conclusions were based on a study in which the sample size was small relative to the signal-to-noise ratio. To reliably detect a small effect that varies across contexts requires either (a) very large sample size (if you want your conclusions to apply to the entire population) or (b) very careful specification of the precise subpopulation to which your conclusions apply (and extreme caution in attempting to generalize beyond it).
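The back-of-the-envelope arithmetic behind that trade-off is easy to sketch. Assuming a simple two-group comparison and Cohen’s conventional small/medium/large effect-size benchmarks (none of these numbers come from Strack’s study or the replication), the usual normal approximation gives the sample size needed per group:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sample comparison,
    via the normal approximation n = 2 * ((z_a + z_b) / d)**2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_b = NormalDist().inv_cdf(power)          # desired power
    return ceil(2 * ((z_a + z_b) / effect_size) ** 2)

for d in (0.8, 0.5, 0.2):  # Cohen's large / medium / small benchmarks
    print(d, n_per_group(d))
```

A small effect (d = 0.2) needs roughly 16 times as many subjects per group as a large one (d = 0.8), which is why a modest original sample can easily produce a “significant” result that a well-powered replication fails to reproduce.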


Please vote

Today is election day in the United States. The polls opened in Connecticut two hours ago, and there was a long line at my polling place by the time I arrived at 6:15.

If you are a registered voter and you aren’t one of the 41 million who have already cast a ballot in early voting, please vote today. If you don’t know where to vote, the League of Women Voters Education Fund has compiled information on the location of polling places at Vote411.org. All you need to do is put in your home address, and you’ll get a street address, hours of operation, and a Google map showing where to go. In addition to the Presidential race, there are important state and local races around the country. Please take the time to make your voice heard. Representative government works only if we select representatives who reflect our values. Voting is how we express those values.

It appears that I dodged a bullet

A couple of years ago, I received an unsolicited invitation to participate in Benefunder, a sort of Kickstarter for scientists. I talked with the people running it a couple of times. They proposed a very intriguing idea: all I needed to do was come up with a snappy description of my research, some compelling images, and $500. They would promote my research as part of a portfolio that wealthy donors would contribute to, both because they were interested in the research and because the contributions were structured as a donor-advised fund, which provided substantial tax benefits. I was tempted, because it sounded like a very promising idea. In the end, though, I just couldn’t see investing $500 in the project. It seemed too unlikely that a donor would be interested in supporting the esoteric research that I do.

It appears that my skepticism was well founded.

[E]ven as Benefunder bulged with projects, donors remained scarce. “We were never able to get off the ground,” [Christian] Braemer [one of the Benefunder founders] says. Donor funds “were not willing to take the reputational risk [on] an unknown entity,” he says. And the firm received just a few “small transactions … a bit out of the blue.”

To stay afloat, Benefunder ramped up sales of the profiles and videos. In 2014 and 2015, it earned more than $660,000 this way but attracted just $62,000 in gifts, tax forms show. In late 2015, as the firm ran out of cash, it abruptly stopped recruiting researchers, left some videos unfinished, and laid off all but three of the 12 employees who worked for it and an allied firm. (Mark Harris, “Ambitious web fundraising startup fails to meet big goals,” Science 354:534; 2016. doi: 10.1126/science.354.6312.534)

Terry Tempest Williams – Excerpt from a letter to Major John Wesley Powell

John Wesley Powell and ten other men loaded food and provisions into four boats in Green River, Wyoming on 24 May 1869. They followed the Green River to the Colorado and traveled through the Grand Canyon. One man left after the first month. Three more left in the third month. Those who remained finished the expedition on 30 August. They were the first Europeans known to pass through the Grand Canyon. Powell was later appointed the second director of the US Geological Survey. His scientific study of the southwestern United States convinced him that agriculture and deserts should not mix. (Wikipedia)

Terry Tempest Williams writes him a letter in The Hour of Land: A Personal Topography of America’s National Parks. This is an excerpt.

I have learned from your history Major Powell, that it is only through the power of our own encounters and explorations of the wild that we can cultivate hope because we have experienced both the awe and humility in nature. We can passionately enter in to the politics of place, even the realm of public policy and change it, if we dare to speak from the authority of our own residencies.

3-minute thesis @UConn @U21News

In 2008, the University of Queensland started the 3-minute thesis competition, in which advanced doctoral students are challenged to summarize their dissertation research for a non-specialist audience in three minutes. As they put it on the 3-MT website,

An 80,000 word thesis would take 9 hours to present.

Their time limit… 3 minutes

UConn has sponsored a local 3-minute thesis competition since the fall of 2013. Each year we send a video recording of the winner of our local competition to a “virtual” competition sponsored by Universitas 21. Judges in the international competition award a first prize and a highly commended prize. In addition, visitors to the U21 website can vote for their favorite presentation, and the presentation receiving the highest number of votes is given the People’s Choice award. More than 3400 votes were cast in this year’s competition, and I’m delighted to report that Islam Mosa, a PhD student in Chemistry at UConn, is the 2016 People’s Choice award winner. Take three minutes of your time and watch his presentation below. You will be inspired.

Happy Halloween!

Duarte Family Roadtrip

Duarte Design Halloween competition

It’s Halloween, and that means it’s time for the annual Duarte Design Halloween design competition. (Click on the image above to see all of the designs and to vote for your favorite.) According to the website, the winner will be announced at 1:00pm. (Presumably that is 1:00pm Pacific Daylight Time. PDT is UTC-7.) I don’t know how long voting is open, but you’ll clearly need to vote before then for your vote to count.

I didn’t vote for the design at the left. I voted for one called “The Race.” I think it’s very creative, but it is slightly risqué, and since this site is on a university server, I didn’t think I should display that image here. You may not think it’s as clever as I do, but if you need a little fun this morning, head over to Duarte Design and check all of them out. If you’re there in time, vote for your favorite. If you’re not, just enjoy all of them (and see whether I picked a winner).

You get what you measure

Last December I saw a fascinating talk by Julie Posselt.1 She described work deriving from her PhD dissertation, in which she sat in on meetings of doctoral admissions committees in a variety of disciplines at several different (and anonymous) elite private and public research universities. She described how overreliance on “cut points” for GPA, GRE scores, or both led to admissions decisions that favored applicants from relatively privileged backgrounds. Even though the faculty making those decisions were almost uniformly committed to ensuring that they admitted doctoral students from a wide variety of backgrounds, the pool of admitted students was far less diverse than the pool of applicants. As she put it in a piece for Inside Higher Ed earlier this year: “Despite their good intentions to increase diversity, broadly defined, admissions work was laced with conventions — often rooted in inherited or outdated assumptions — that made it especially hard for students from underrepresented backgrounds to gain access.”

Why does this happen? Partly it’s because faculty aren’t aware of advice from the Educational Testing Service on how to use GRE scores properly.2 Partly, it’s because there are so many applicants to high-quality doctoral programs that admissions committees often use numerical screens to identify the small number of applicants worthy of close scrutiny.

Athene Donald points out another way in which relying on strict numerical criteria may be harmful to everyone, regardless of what their demographic, economic, social, or cultural background may be. She argues in the context of evaluating academics that in addition to the usual metrics of publication or creative activity and grant dollars (for those in fields where external funding is important), success as an academic should also include “building teams, seeing their students thrive and progress, working with people who sparked them off intellectually and seizing opportunities to try out new things and make new discoveries.”

The challenge, of course, is that you get what you measure. If we only measure publications and grants, that’s what we’ll get. If we want to encourage team building and student support, we have to measure those things and give them as much weight as the things we traditionally measure. If we can’t find numbers with which to measure them, we still need to find ways to assess them, because helping others gain the skills they need is what education is all about.


Don’t be that dude

Several years ago, Dr. Acclimatrix (@Acclimatrix) published a list of “Handy tips for the male academic.” I just happened to run across it again this morning, and I thought I should pass it along. The advice she offers is as timely now as it was then. As she says:

Gender equality has to be a collaborative venture. If men make up the majority of many departments, editorial boards, search committees, labs and conferences, then men have to be allies in the broader cause of equality, simply because they have more boots on the ground. And, as much as I wish it weren’t so, guys often tend to listen more readily to their fellow guys when it comes to issues like sexism. I’ve also found that there are a lot of guys out there that are supportive, but don’t realize that many of their everyday actions (big and small) perpetuate inequality. So, guys, this post is for you.

The list includes 20 distinct pieces of advice. I’ve tried to follow all of them, but these are the ones I’m working on hardest right now:

3. Don’t talk over your female colleagues.

5. Make sure your department seminars, conference symposia, search committees, and panel discussions have a good gender balance.

6. Pay attention to who organizes the celebrations, gift-giving, or holiday gatherings.

7. Volunteer when someone asks for a note-taker, coffee-run gopher, or lunch order taker at your next meeting.1

15. Don’t leave it to women to do the work of increasing diversity.

19. Know when to listen.

I am a snoot

I have little trust in people who don't use the Oxford comma.

From grammarly.com

Last Friday I confessed to my obsession with grammar and usage. In response, Alex Buerkle (@disequilibber) passed along a link to a wonderful article by David Foster Wallace describing the state of the “language wars” in the early 2000s. If you’ve never heard of the language wars or of the epic battle between prescriptivists and descriptivists, you may not find the article all that interesting, but it really struck a chord with me. I am a snoot.

A SNOOT can be defined as somebody who knows what dysphemism means and doesn’t mind letting you know it.

OK. Maybe I’m not really a snoot. I had to Google “dysphemism” – a derogatory or unpleasant term used instead of a pleasant or neutral one, such as “loony bin” for “mental hospital” – and I probably won’t brag about knowing the definition now (and I doubt that it will enter my regular vocabulary). So maybe it’s more accurate to say that I have a lot of sympathy with snoots. If you want to understand what that means, I’m afraid you’ll have to read Wallace’s article. Here’s the link: http://harpers.org/wp-content/uploads/HarpersMagazine-2001-04-0070913.pdf Bottom line: grammar and usage matter, because they convey a lot about us. The dialect we choose to use says a lot about who we are and about who we think our audience is.

Reproducibility is hard

Last year, the Open Science Collaboration published a very important article: Estimating the reproducibility of psychological science. Here’s a key part of the abstract:

We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. There is no single standard for evaluating replication success. Here, we evaluated reproducibility using significance and P values, effect sizes, subjective assessments of replication teams, and meta-analysis of effect sizes. The mean effect size (r) of the replication effects (Mr = 0.197, SD = 0.257) was half the magnitude of the mean effect size of the original effects (Mr = 0.403, SD = 0.188), representing a substantial decline. Ninety-seven percent of original studies had significant results (P < .05). Thirty-six percent of replications had significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects. Correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.
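One way the decline in effect sizes quoted above can arise is the significance filter: if only statistically significant estimates tend to be published, the published estimates overstate the true effect even when every individual study is honest. Here is a toy simulation; the true effect of 0.2 and per-study standard error of 0.15 are illustrative assumptions, not values from the paper:

```python
import random
import statistics

random.seed(42)
TRUE_EFFECT = 0.2
SE = 0.15            # assumed standard error of each study's estimate
CUTOFF = 1.96 * SE   # two-sided p < .05 threshold

# Simulate many honest studies, then keep only the "significant" ones.
estimates = [random.gauss(TRUE_EFFECT, SE) for _ in range(100_000)]
published = [e for e in estimates if abs(e) > CUTOFF]

print(statistics.mean(estimates))   # close to the true effect, 0.2
print(statistics.mean(published))   # substantially larger than 0.2
```

With these assumed numbers, the mean of the “published” estimates comes out close to twice the true effect, a pattern much like the halving of mean effect sizes reported above (though the simulation only illustrates the mechanism; it says nothing about the real cause).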

Since then, reproducibility has gained even more attention than it had before. My students and I have been taking baby steps toward good practice: using GitHub to share code and data (and versions), using scripts (mostly in R) to manipulate and transform data, and making the code and data freely available as early in the writing process as we can. But there are some important things we don’t do as well as we could. I’ve never tried using Docker to ensure that all versions of the software we use for analysis in a paper are preserved, and I’m as bad at writing documentation for what I’m doing as I ever was (but I try to write my code as clearly as possible, so it’s not too hard to figure out what I was doing).

I need to do better, but Lorena Barba (@LorenaABarba) had an article in the “Working Life” section of Science that made me feel a bit better about how far I have to go. Three years ago she posted a manifesto on reproducibility. In her Science piece, she describes how hard it’s been to live up to that pledge. But she concludes with some words to live by:

About 150 years ago, Louis Pasteur demonstrated how experiments can be conducted reproducibly—and the value of doing so. His research had many skeptics at first, but they were persuaded by his claims after they reproduced his results, using the methods he had recorded in keen detail. In computational science, we are still learning to be in his league. My students and I continuously discuss and perfect our standards, and we share our reproducibility practices with our community in the hopes that others will adopt similar ideals. Yes, conducting our research to these standards takes time and effort—and maybe our papers are slower to be published. But they’re less likely to be wrong.


Barba, L.A. 2016. The hard road to reproducibility. Science 354:142. doi: 10.1126/science.354.6308.142
Open Science Collaboration. 2015. Estimating the reproducibility of psychological science. Science 349:aac4716. doi: 10.1126/science.aac4716