Uncommon Ground

Academics, biodiversity, genetics, & evolution

Latest Posts

How to organize data in spreadsheets

I recently discovered an article by Karl Broman and Kara Woo in The American Statistician entitled “Data organization in spreadsheets” (https://doi.org/10.1080/00031305.2017.1375989). It is the first article in the April 2018 special issue on data science. Why, you might ask, would a journal published by the American Statistical Association devote the first paper in a special issue on data science to spreadsheets instead of something more statistical? Well, among other things, it turns out that the risks of using spreadsheets poorly are so great that there’s a European Spreadsheet Risks Interest Group that keeps track of “horror stories” (http://www.eusprig.org/horror-stories.htm). For example, Wisconsin initially estimated that the cost of a recount in the 2016 Presidential election would be $3.5M. After correcting a spreadsheet error, the cost climbed to $3.9M (https://www.wrn.com/2016/11/wisconsin-presidential-recount-will-cost-3-5-million/).

My favorite example, though, dates from 2013. Thomas Herndon, then a third-year doctoral student at UMass Amherst, showed that a spreadsheet error in a very influential paper published by two eminent economists, Carmen Reinhart and Kenneth Rogoff, magnified the apparent effect of debt on economic growth (https://www.chronicle.com/article/UMass-Graduate-Student-Talks/138763). That paper was widely cited by economists arguing against economic stimulus in response to the financial crisis of 2008-2009.

That being said, Broman and Woo correctly point out that

Amid this debate, spreadsheets have continued to play a significant role in researchers’ workflows, and it is clear that they are a valuable tool that researchers are unlikely to abandon completely.

So since you’re not going to stop using spreadsheets (and I won’t either), you should at least use them well. If you don’t have time to read the whole article, here are twelve points you should remember:

  1. Be consistent – “Whatever you do, do it consistently.”
  2. Choose good names for things – “It is important to pick good names for things. This can be hard, and so it is worth putting some time and thought into it.”
  3. Write dates as YYYY-MM-DD (https://imgs.xkcd.com/comics/iso_8601.png).
  4. No empty cells – Fill in all cells. Use some common code for missing data.1
  5. Put just one thing in a cell – “The cells in your spreadsheet should each contain one piece of data. Do not put more than one thing in a cell.”
  6. Make it a rectangle – “The best layout for your data within a spreadsheet is as a single big rectangle with rows corresponding to subjects and columns corresponding to variables.”2
  7. Create a data dictionary – “It is helpful to have a separate file that explains what all of the variables are.”
  8. No calculations in raw data files – “Your primary data file should contain just the data and nothing else: no calculations, no graphs.”
  9. Do not use font color or highlighting as data – “Analysis programs can much more readily handle data that are stored in a column than data encoded in cell highlighting, font, etc. (and in fact this markup will be lost completely in many programs).”
  10. Make backups – “Make regular backups of your data. In multiple locations. And consider using a formal version control system, like git, though it is not ideal for data files. If you want to get a bit fancy, maybe look at dat (https://datproject.org/).”
  11. Use data validation to avoid errors
  12. Save the data in plain text files
  1. R likes “NA”, but it’s easy to use “.” or something else. Just set “na.strings” when you use read.csv() or “na” when you use read_csv().
  2. If you’re a ggplot user you’ll recognize that this is wide format, while ggplot typically needs long format data. I suggest storing your data in wide format and using ddply() to reformat for plotting.
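
Here is a minimal sketch of what those two notes look like in practice. The data, the column names (plant_id, leaf_length_mm, leaf_width_mm), and the choice of “.” as the missing-data code are all made up for illustration. The reshaping step uses plain base R to stack the columns rather than ddply(), just to keep the example free of package dependencies, so treat it as one possible approach and not the recommended one.

```r
# Hypothetical spreadsheet export: one row per plant, one column per
# variable, dates as YYYY-MM-DD, and "." as the missing-data code.
csv_text <- "plant_id,date,leaf_length_mm,leaf_width_mm
P001,2019-05-01,42.1,10.3
P002,2019-05-01,.,9.8
P003,2019-05-02,39.5,."

# na.strings tells read.csv() which codes to treat as missing, so the
# measurement columns are read as numeric with NA where "." appeared.
dat <- read.csv(text = csv_text, na.strings = c("NA", "."),
                stringsAsFactors = FALSE)

# Wide to long with base R only: repeat the identifying columns once per
# measurement column and stack the measurements into a single column.
measure_cols <- c("leaf_length_mm", "leaf_width_mm")
long <- data.frame(
  plant_id = rep(dat$plant_id, times = length(measure_cols)),
  date     = rep(dat$date, times = length(measure_cols)),
  trait    = rep(measure_cols, each = nrow(dat)),
  value    = unlist(dat[measure_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

print(long)
```

The wide data frame (dat) is what you would store; the long one (long) has one row per plant and trait, which is the shape ggplot usually wants.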

New version of RStudio released

If you use R, there’s a good chance that you also use RStudio. I just noticed that the RStudio folks released v1.2 on April 30th. I haven’t had a chance to give it a spin yet, but here’s what they say on the blog:

Over a year in the making, this new release of RStudio includes dozens of new productivity enhancements and capabilities. You’ll now find RStudio a more comfortable workbench for working in SQL, Stan, Python, and D3. Testing your R code is easier, too, with integrations for shinytest and testthat. Create, and test, and publish APIs in R with Plumber. And get more done with background jobs, which let you run R scripts while you work.

Underpinning it all is a new rendering engine based on modern Web standards, so RStudio Desktop looks sharp on displays large and small, and performs better everywhere – especially if you’re using the latest Web technology in your visualizations, Shiny applications, and R Markdown documents. Don’t like how it looks now? No problem–just make your own theme.

You can read more about what’s new this release in the release notes, or our RStudio 1.2 blog series.

I look forward to exploring the new features, and I encourage you to do the same. Running jobs in the background will be especially useful.

Microscale trait-environment associations in Protea

If you follow me (or Nora Mitchell) on Twitter, you saw several weeks ago that a publish-before-print version of our most recent paper appeared in the American Journal of Botany. This morning I noticed that the full published version is available on the AJB website. Here’s the citation and abstract:

Mitchell, N., and K. E. Holsinger.  2019.  Microscale trait‐environment associations in two closely‐related South African shrubs. American Journal of Botany 106:211-222.  doi: 10.1002/ajb2.1234

Premise of the Study
Plant traits are often associated with the environments in which they occur, but these associations often differ across spatial and phylogenetic scales. Here we study the relationship between microenvironment, microgeographical location, and traits within populations using co‐occurring populations of two closely related evergreen shrubs in the genus Protea.
Methods
We measured a suite of functional traits on 147 plants along a single steep mountainside where both species occur, and we used data‐loggers and soil analyses to characterize the environment at 10 microsites spanning the elevational gradient. We used Bayesian path analyses to detect trait‐environment relationships in the field for each species. We used complementary data from greenhouse grown seedlings derived from wild collected seed to determine whether associations detected in the field are the result of genetic differentiation.
Key Results
Microenvironmental variables differed substantially across our study site. We found strong evidence for six trait‐environment associations, although these differed between species. We were unable to detect similar associations in greenhouse‐grown seedlings.
Conclusions
Several leaf traits were associated with temperature and soil variation in the field, but the inability to detect these in the greenhouse suggests that differences in the field are not the result of genetic differentiation.

Announcing a new platform for BioOne

Some of you know that I serve as Chair of the Board of Directors for BioOne, a non-profit publisher founded in 1999 with the goal of ensuring that non-profit publishers in the life sciences receive the revenue they need to support their journals while keeping the subscription cost to libraries affordable. We now publish more than 200 journals from 150 scientific societies and independent presses on BioOne Complete.

Earlier today we announced that BioOne Complete launched on a new website made possible through collaboration with SPIE, the international society for optics and photonics. Here’s a copy of the press release:

BioOne Complete launches on new platform powered by nonprofit partnership

Released: January 2, 2019

Washington, DC — BioOne (about.BioOne.org), the nonprofit publisher of more than 200 journals from 150 scientific societies and independent presses, has launched a new website for its content aggregation, BioOne Complete. Powered by a nonprofit collaboration with SPIE, the international society for optics and photonics, the new site leverages SPIE’s proprietary platform for the benefit of BioOne’s more than 4,000 accessing libraries and millions of researchers around the world.

The new site (remaining at bioone.org) was designed with the needs of today’s researchers in mind. The modern and intuitive interface allows for enhanced searching and browsing, and simplified off-campus access. My Library features allow researchers to easily organize and access relevant articles and alerts, drawing from BioOne Complete’s database of more than 1.5 million pages of critical content.

“We are delighted to launch the new BioOne Complete website and share the redesigned interface and expanded functionality with our community,” said Susan Skomal, BioOne President/CEO. “Our collaboration with SPIE has yielded not just a strong not-for-profit partnership, but a leading-edge website that helps better promote the important research of BioOne’s publishing participants.”

For more information about this transition and features available on the new website, please visit the BioOne Help Desk, Resources for Librarians and Administrators, or Resources for Publishers.

###

About BioOne

BioOne is a nonprofit publisher committed to making scientific research more accessible. We curate content and support discourse while exploring new models in scientific publishing. BioOne’s core product is BioOne Complete, an online aggregation of subscribed and open-access titles in the biological, ecological, and environmental sciences. BioOne Complete provides libraries with cost-effective access to high-quality research and independent society publishers with a dynamic, community-based platform and global distribution. about.bioone.org.

Celebrating 50 years of the H. Fred Simons African American Cultural Center @UConn #aacc50th

Cover of the program for the AACC 50th Anniversary Gala

I was privileged to attend the 50th anniversary celebration of the H. Fred Simons African American Cultural Center on Saturday night, and to sit next to Dr. James Lyons, Sr., a UConn alum and the first director of the Center. You can see a few photos that were posted during the event on Twitter. I was also asked to say a few words during the celebration. Here’s what I said:

Thank you Willena.

It is a pleasure and a privilege to greet you tonight, although it is a little odd to welcome you when you’re already eating dessert. It is also dangerous for anyone to give me a captive audience, so I also congratulate Willena on her courage in trusting me, and I promise that I will be brief. I know that the real program comes after me, and I also understand that there may be a party you want to get to.

We live in frightening times, but 1968 (when the African American Cultural Center was started) was also a frightening time. Our country was embroiled in the Vietnam War, student protests were exploding, and our cities were burning. There were riots at the Democratic National Convention, Bobby Kennedy was assassinated, and on April 4th the Reverend Dr. Martin Luther King, Jr. was gunned down in Memphis.

But 1968 was also a year of hope and promise: The Civil Rights Act was signed into law, the 3rd season of Star Trek featured the first interracial kiss on national TV, and perhaps most important of all, LL Cool J was born on January 14.

1968 was also the year when students, faculty, and staff at UConn came together to establish the African American Cultural Center.

For the last 50 years, the Center has been a vital part of campus life at UConn. Its dedication to cultural preservation, leadership, and academic excellence is a vital part of making UConn one of the nation’s leading public universities.

As a nation we were founded on the principle that all people are created equal and that we all have a right to life, liberty, and the pursuit of happiness. I don’t need to tell anyone here that we have often fallen short of this lofty principle. Indeed, I need only to mention the names of Michael Brown, Eric Garner, or Laquan McDonald to remind us how far we have to go.

But at a time when violent political rhetoric seeks to divide us, the work of the African American Cultural Center is more important than ever. It enriches us all by showcasing the culture, history, and traditions of people of African descent. It binds us together as people and inspires us to imagine a future in which everyone is valued for their unique contribution and in which the culture, history, and traditions of all people are treated with the respect they deserve.

I am honored to play a small part in celebrating the Center’s 50th anniversary this evening, and I am delighted to have the privilege of welcoming you to this celebration.

Thank you.

You SHOULD…Read: Orwell, Leopold, and Teale

The UConn Humanities Institute asked me to contribute to their “You Should…” series. Here’s a copy of my contribution.

You should…Read: Orwell, Leopold, and Teale

But not the Orwell you think. Read Politics and the English Language to be reminded that “Political language…is designed to make lies sound truthful and murder respectable, and to give an appearance of solidity to pure wind” and Shooting an Elephant for a concrete example of how “when the white man turns tyrant it is his own freedom that he destroys.”[1] Read Leopold’s A Sand County Almanac to learn that when Canada geese return north in the spring “the whole continent receives as net profit a wild poem dropped from the murky skies upon the muds of March” and the many things a poor farm can teach those willing to learn. Read Teale’s A Naturalist Buys an Old Farm to learn Leopold’s lessons in our own backyard on a farm in Hampton.

[1] And for the best first sentence in an essay: “In Moulmein, in Lower Burma, I was hated by a large number of people–the only time in my life that I have been important enough for this to happen to me.” WARNING: Descriptions in the essay would have offended many in 1936. More will find them offensive now.

The Mindset List for the Class of 2022

Twenty years ago Ron Nief, emeritus Director of Public Affairs at Beloit College, created the Mindset List. Every August since then the Beloit Mindset List has been a feature of higher education in the US. It’s been maligned (http://www.beloitmindlessness.com/2018/08/19/more-of-the-same/, http://www.beloitmindlessness.com/2018/08/21/here-we-go-again/) and it’s been parodied (https://www.theonion.com/a-look-at-the-class-of-2019-1819592320), but as I wrote a couple of years ago “I always get a kick out of looking it over. It reminds me of how old I am.”

I’m a couple of years older now than I was a couple of years ago, and I still get a kick out of looking the list over. Here are a few of the items that I found especially striking:1

  • Among the iconic figures never alive in their lifetime are Victor Borge, Charles Schulz, and the original Obi-Wan Kenobi Alec Guinness. That last one really hurts. I remember seeing the original in a movie theater.
  • They have grown up afraid that a shooting could happen at their school, too.
  • Presidential candidates winning the popular vote and then losing the election are not unusual.
  • There has never been an Enron.

If you want to read all 60, here’s the link: http://themindsetlist.com/2018/08/beloit-college-mindset-list-class-2022/.

  1. The comments in bold italic are my commentary.

A few thoughts on how to structure a scientific paper

I mentioned last week that I’m reading Williams & Bizup, Style: The Basics of Clarity and Grace. Yesterday I came across this very succinct advice for the early stages of writing a paper and thinking about how to structure it.

When you plan a paper, look for a question that is small enough to answer but is also connected to a question large enough for you and your readers to care about.

If you’re a scientist and writing a paper,1 you already have the data and most or all of the statistical analyses done. So the “look for a question” part has to happen twice in writing a scientific paper.2 You need to “look for a question that is small enough to answer but is also connected to a question large enough for you and your readers to care about” before you begin collecting data. Then you need to collect data that will answer that question.

Science being what it is,3 after you’ve collected the data you’ll find that there are data you couldn’t collect that you wanted to collect4 and there are data you collected that you didn’t anticipate collecting. In writing the paper you now have to look at the data you have in hand, identify a question that the data in hand can answer that is connected to a larger, interesting question, and (this is the hard part) write the paper using only the data that answer that larger, interesting question. If you’re like me, you5 will have collected other data that don’t fit in this paper. That doesn’t mean they’re useless, and it doesn’t mean you should discard them. It merely means that they’re not useful for this paper. With any luck you’ll find that they are useful for another paper that you’ll write in the future.

  1. Or at least if you’re a scientist like me and writing a paper.
  2. Or at least it has to happen twice if you’re me.
  3. Or at least science being what it is in the way that I do it.
  4. Especially if your research involves work in the field.
  5. In the interest of full disclosure, I have to point out that I almost never collect data myself. It’s my students and collaborators who collect the data. Even when I’m in the field, I mostly hold the field notebook and write down the measurements someone else is making. I rarely make the measurements myself. The closest I usually come to collecting data myself is collecting samples from which someone else derives data.

A few thoughts on writing (inspired by Williams & Bizup, Style)

I do not claim to write well, but I have been writing for nearly 40 years, and I’ve been helping students with writing for more than 30. Along the way I’ve figured out a few things that work for me, so I thought I’d pass a few of them along. Keep in mind that I have no training, and I have no credentials suggesting that anything I write is worth reading. If you find something useful here, use it. If you don’t, ignore it. Better yet, if you find something here you think is fundamentally misguided, leave a comment so that others won’t be misled.

Nearly 30 years ago I bought a copy of Williams, Style: Lessons in Clarity and Grace. I’ve referred to it frequently ever since. On Wednesday I bought the Kindle edition of Williams & Bizup, Style: The Basics of Clarity and Grace. I’m just getting started on it, but I can already recommend it. It’s a shorter version of Lessons, but even in the shorter version there’s a lot here that anyone who writes can use.

The first and most important lesson is not to worry about style or principles of style at all until you’ve written something down. The only bad first draft is the first draft you haven’t written. Before you worry about whether readers can understand you or whether what you’ve written will capture or hold their interest, write something down so that you can start revising it. This lesson took me a very long time to learn. When I was in graduate school I literally had to start writing with the first paragraph of the Introduction and then write every other paragraph in sequence until I was done. I also struggled to make every sentence and paragraph perfect as I was writing them, because that was how I imagined writers wrote. Even though I’d read and heard it before, it wasn’t until some time after I joined the faculty at UConn that I finally understood that 90 percent or more of writing is rewriting. As Williams and Bizup put it,

Most experienced writers get something down as fast as they can. Then as they revise that first draft into something clearer, they understand their ideas better. And when they understand their ideas better, they express them more clearly, and the more clearly they express them, the better they understand them—and so it goes, until they run out of energy, interest, or time.

They also point out that you can exercise your revising chops on other people’s writing. When you’re reading something that seems complicated and confusing, take a good, hard look at it and see if you can find a way to express the ideas more clearly. If you can, you’ll have the satisfaction not only of having worked out the meaning of that complicated thing, but also of knowing that you had the skill to make something understandable when the author couldn’t or wouldn’t do the same thing.

Just telling you to revise doesn’t help, of course. You have to know how to revise. The good news is that Williams and Bizup provide a set of principles that anyone can learn and apply. It’s not easy to apply them, and sometimes you won’t have the time to apply them, but keep in mind that

Everything that can be thought at all can be thought clearly. Everything that can be said can be said clearly. – Ludwig Wittgenstein

BioOne is collaborating with SPIE and moving to a new platform

A few of you know that one of the hats I wear is that of Chair for the Board of Directors of BioOne. BioOne is a non-profit organization that provides low-cost access to journals in organismal and environmental life sciences while providing the society and non-profit publishers of journals in the BioOne collection with substantial revenue. Leadership of BioOne includes representatives of both the scholarly publishers and academic libraries. I have found it very rewarding to be associated with such a productive collaboration. The focus on low-cost access increases the availability of the journals to students and scholars everywhere. The focus on providing income to publishers ensures that they can continue to publish the journals. Working together, we help to ensure that scholarly communication within the fields represented in BioOne is accessible and sustainable.

Today BioOne is announcing a new collaboration with SPIE, the international society for optics and photonics. What do optics and photonics have to do with life sciences, you ask? Well, like BioOne, SPIE is committed to providing electronic access to a wide audience, and BioOne’s journal collection will be hosted on a new, high-performance web site in collaboration with SPIE that launches on January 1, 2019. SPIE is already providing the technology behind the BioOne Career Center, and we look forward to working with them to provide even better access to journal resources than we do now and to develop new ways of serving the life science community that we haven’t even thought of yet.

Here’s the press release:

BioOne, the nonprofit publisher of more than 200 journals from 150 scientific societies and independent presses, has announced the forthcoming launch of a new website for its content aggregation, BioOne Complete. The new website, to launch on January 1, 2019, will be powered by a nonprofit collaboration with SPIE, the international society for optics and photonics.

This significant partnership leverages SPIE’s proprietary platform technology to meet the needs of BioOne’s community, including its more than 4,000 accessing libraries worldwide. The new BioOne platform (remaining at bioone.org) will give BioOne Complete a more modern and intuitive look and feel, while enhancing user functionality.

Lauren Kane, BioOne Chief Strategy and Operating Officer, notes, “This exciting partnership better positions BioOne for growth in the future, all while redirecting a major cost center to a fellow not-for-profit organization. SPIE has already proven to be a responsive and creative collaborator with an appreciation for BioOne’s mission and stakeholder needs. We are excited to share this news, and soon, our new site, with the community.”

Scott Ritchey, SPIE Chief Technology Officer, adds, “Our partnership with BioOne demonstrates the value that compatible, not-for-profit organizations can create when working together. The SPIE mission is better fulfilled with the shared insights and economies of scale created by our relationship with BioOne.”

BioOne’s goal is to ensure that this will be a seamless and transparent transition for all stakeholder groups. All aggregation content, subscriber licenses, and user profiles are being migrated to the new site. The BioOne team will be in touch throughout the fall with updates, required actions, and educational resources.