In 2001, Nicholas Gotelli and Robert Colwell published a review paper in Ecology Letters that flagged common mistakes that ecologists were making in measuring and comparing species richness, and suggested ways to avoid such mistakes. Even today, 15 years after its publication, Gotelli and Colwell 2001 remains a must-read for anyone interested in richness and diversity research. I interviewed Nick Gotelli about the making of this paper and the impact it has had on his research career and on the field of species diversity studies.
Date of interview: 16th June 2016 (on Skype)
Citation: Gotelli, N. J., & Colwell, R. K. (2001). Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecology Letters, 4(4), 379-391.
Hari Sridhar: I’d like to start by asking you what your motivation was to write this paper.
Nicholas Gotelli: There were actually two motivations. In a way, I had written an abbreviated version of this paper in a chapter on rarefaction in my 1996 book with Gary Graves, Null Models in Ecology. That chapter laid out some of the ideas from the literature that were already there. The second thing happened a little closer to the date of the paper. I was working with my graduate student Declan McCabe (now a Biology Professor at St. Michael’s College), analysing an experimental set of data on the effects of disturbance frequency, intensity, and area on the biodiversity of stream invertebrates. We were interested in aspects of the Intermediate Disturbance Hypothesis (IDH) – trying to get at the question of whether disturbance increases or decreases species richness. It turns out that the answer depends entirely on whether you use a rarefaction approach or not. Basically, you can get opposite answers depending on the exact way that you measure species richness. So, I think it was these two things that got me started on the review. Those analyses appear in an Oecologia paper in 2000.
I had also started collaborating and talking again to my old mentor Rob Colwell. This paper is the first one we wrote together, but Rob was my undergraduate ecology instructor at the University of California, Berkeley, in 1979, where he taught an upper division course in Community Ecology, which had a huge influence on my career path. So, a few years later, when I began working on A Primer of Ecology, I started corresponding with Rob again, because my lecture notes from his course (which I still have) were critical for the chapter on interspecific competition. That book was what brought Rob and me back into contact, and then I began working on the rarefaction problem for the null models book and again continued talking to Rob. Of course by this time, Rob had published key papers on asymptotic species estimators, which are also an important part of this story.
HS: There are two ways in which Ecology Letters gets submissions for its Review articles section: unsolicited and through invitations. Which was it in the case of this paper?
NG: This is a long time ago and I’m actually a little vague on the history here. Michael Hochberg was in charge of Ecology Letters at that time and, to be honest, I cannot remember whether he solicited us or we went to him with this idea. We may have gone to him, I am not sure. I think we may have set up the agreement to do it, but the pieces were already there. One of the interesting things about this paper is there is very little in it that was new, even at the time it was published. In that sense, I think it is kind of an odd paper, because it’s really talking about ideas from the literature that were 20 or 30 years old.
HS: Can you tell us a little about the writing process itself – did you get together with Rob to write this, or was it done mostly over email?
NG: No, as with almost all my papers with Rob, everything was done over email. And in the late 1990s, there was no “track changes” feature in MS Word, but there was text highlighting. We would each use two highlight colours – one highlight colour was for your new text and one highlight colour was for your comments on the existing text. When you received a copy from the other author you went through, and for the new text, if you liked what they wrote, you “decolorized” it, as we used to say. I remember manuscript drafts would get huge in size with the 4 rainbow colours of highlighting, but gradually it would shrink down as we finalized the wording.
Rob is a wonderful collaborator and correspondent, and I think our comments, in some cases, ended up being longer than the actual text. He really is a fabulous writer. He takes great care in the craft of writing, and a lot of the time was spent reworking particular sentences and organizing the different sections. That’s how it always is writing with Rob. In a way, it was not a hard paper to write at all. As I said, most of the things were already in the literature in one place or another, but we both felt that ecologists were not paying attention to this older literature. They were not taking into account some of these really basic issues on how to standardize and quantify species richness.
HS: Did this feeling come from seeing a lot of papers using richness measures inappropriately at that time?
NG: Yes, there were lots of papers that were, you know, calculating diversity indices, species per unit area etc. – these types of calculations. Some of this was in the conservation literature, where people were trying to extrapolate species-area curves to estimate extinction rates and things like that. But making these linear re-scalings always causes problems for species diversity measures.
HS: How long did it take – from when you had the idea to write this paper, to its publication?
NG: It wasn’t that long. I am going to put it at somewhere between six and nine months.
HS: Did it sail through review?
NG: Yes, I remember that the reviews were pretty positive. We had a few suggestions from reviewers for improvement, but there was no controversy surrounding the paper at all. The reviewers seemed to appreciate the thorough look at the history of this topic, as well as the fact that the paper had some fairly specific advice on how people should go about trying to analyse species richness and species diversity.
NG: Yes it was. Rob had been developing his EstimateS software, and I was in the early phases of developing the original EcoSim with Gary Entsminger. At that time, those programmes were fairly novel. There wasn’t as much software available then as there is today. This was well before the days of R and the open-source revolution, so there certainly weren’t that many pieces of free software available. We were excited about that aspect of it as well – to have these computing tools that we could introduce at the same time when we talked about the theory.
HS: Am I right in saying that this is one of the most-cited papers in ecology, at least in the recent past? Did you anticipate the paper to attract so much attention and so many citations?
NG: Yes, the paper has been very well-cited. But we did not anticipate it would become so popular because, as I said, from our perspective this was only a review article. What’s more interesting than the total citation count (2666 citations as of 4 July 2016) is the profile of citations through time. From 2001 through 2014, each year this paper received more citations than the previous year, and it is still getting 100 to 200 new citations every year. That’s a very long half-life. Most things in the literature now are so short-lived that you wonder what their impact is going to be, and if they are actually going to be cited in a few years. That this paper continues to be well-cited after 15 years is very satisfying for Rob and me.
HS: Do you have a sense of what your paper gets cited for, mostly?
NG: The Ecology Letters review seems to get cited in 3 kinds of papers. First, papers where people are developing new theory and models for estimating species richness. Second, it is cited in plenty of empirical papers that are specialised on sampling of particular taxa. And third, it’s cited in papers that are talking about the fact that diversity indices are so sensitive to the amount of sampling that takes place.
HS: In the 15 years since this paper was published, have you ever had the need to go back to this paper? Or is this the first time?
NG: I’ve occasionally gone back to the discussion of individual-based and sample-based rarefaction. The paper introduced a useful taxonomy and description of how to talk about these sampling designs. One of the novel things introduced in the paper is the idea of adjusting or shifting the sample-based rarefaction curves to put them back onto a scale of abundance. I think it’s very useful for teasing apart the effects of abundance per se versus the shape of the species-abundance distribution. And that’s a good way to work with sample-based data.
HS: When you read this paper now, what strikes you about it? Is the writing style different from the papers you write today?
NG: No, I don’t think my writing is very different from today. This is how I usually end up writing, especially when I’m working with Rob. He brings out great clarity in the writings of his co-authors. He never takes anything for granted and he doesn’t want anything to hide behind jargon. So when you are writing with him you have got to explain yourself clearly and thoroughly. He won’t let anything get by that’s not crystal clear in the writing. Those of us who collaborate with him are very lucky to have had that influence on our own writing.
HS: What kind of impact do you think your paper has had? Do you think there has been a reduction in inappropriate usage of richness measures after your paper was published?
NG: Yes, in terms of reducing mistakes, I think it has had a positive impact. Reviewers, today, routinely expect authors to standardize or rarefy their data when comparing biodiversity. So I think that message has gotten out. One of the great things about working with Rob on this paper was that it led to a long-term collaboration with Rob and Anne Chao. That resulted in some very important theoretical extensions and new forms of analysis based on the rarefaction idea. Since the 2000s, there has been a kind of a renaissance in the literature on how to estimate and measure species diversity, and there are still lots of new and interesting papers coming out on that topic. I guess I would say to people who are reading our Ecology Letters review for the first time – recognize and pay attention to this whole new literature that came after this paper!
HS: One of the things you emphasize in your paper, towards the end, is the need for more research on asymptotic estimators. Has that happened?
NG: Yes. The estimators that are still the most popular and widely used are the Chao estimators. They are popular for a couple of reasons. First, they are easy to calculate. Second, they have very good statistical grounding, interestingly based on the computer science theorems from Alan Turing’s work during World War 2. Turing cracked the Enigma – the German coding machine – by developing theorems and formulas to estimate the frequency of undetected categories. And that’s exactly the basis for Anne Chao’s estimators. These estimators have performed pretty well in comparison with alternatives such as jack-knife estimators. However, there is still room for improvement in rarefaction and asymptotic estimators.
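Editor's illustration (not part of the interview): to show how little calculation such an estimator needs, here is a minimal Python sketch of the classic Chao1 form, which adds to the observed richness a term built from the number of species seen exactly once (f1) and exactly twice (f2). The function name `chao1` and the example counts are made up for illustration; published software may use slightly different variants of the formula.

```python
def chao1(abundances):
    """Classic Chao1 lower-bound estimate of species richness.

    `abundances` holds the number of individuals recorded per species.
    """
    counts = [a for a in abundances if a > 0]
    s_obs = len(counts)                       # species actually observed
    f1 = sum(1 for a in counts if a == 1)     # singletons
    f2 = sum(1 for a in counts if a == 2)     # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used here when there are no doubletons
    return s_obs + f1 * (f1 - 1) / 2


# Hypothetical sample: 8 observed species, 4 singletons, 2 doubletons
print(chao1([10, 5, 1, 1, 1, 1, 2, 2]))
```

The more singletons there are relative to doubletons, the more undetected species the estimate adds to the observed count.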
One thing people don’t like about rarefaction is that, in a way, it forces you to throw away all your data that is above your worst sampling level, in order to make everything comparable. And no one wants to throw data away. So in 2012, in the Journal of Plant Ecology, Rob and Anne spearheaded a paper that conceptually and statistically unified rarefaction with the asymptotic estimator. In that paper, we showed that the rarefaction curve, which is the interpolated part, can be smoothly extended and linked up to the asymptotic estimator, which is the extrapolated part. So instead of having to worry about which particular part of the curve you are going to compare, you can actually visualise the entire curve and the uncertainty that’s associated with it. I thought that was pretty powerful.
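Editor's illustration (not part of the interview): the interpolated part of the curve can be sketched with the standard individual-based rarefaction formula, which gives the expected number of species in a random subsample of n individuals drawn without replacement. This is a minimal sketch of that formula only; the function name `rarefy` and the two assemblages are hypothetical, and it does not implement the extrapolation or the confidence intervals from the 2012 paper.

```python
from math import comb


def rarefy(abundances, n):
    """Expected species count in a random subsample of n individuals."""
    counts = [a for a in abundances if a > 0]
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total number of individuals")
    denom = comb(total, n)
    # For each species, 1 minus the probability that none of its
    # individuals lands in the subsample
    return sum(1 - comb(total - a, n) / denom for a in counts)


# Hypothetical assemblages with unequal sampling effort
large = [50, 30, 10, 5, 3, 1, 1]   # 100 individuals, 7 species
small = [6, 5, 5, 4]               # 20 individuals, 4 species

# Compare at the size of the smaller sample
print(rarefy(large, 20), rarefy(small, 20))
```

Rarefying the larger sample down to 20 individuals is exactly the step that discards information, which is the complaint described above.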
Anne Chao, in collaboration with Lou Jost, and also independent work by John Alroy, has gotten at the idea of coverage-based rarefaction and sampling: it’s not just the number of individuals that matters, but where you are on the rising part of the rarefaction curve. And your adjustments can be made that way.
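Editor's illustration (not part of the interview): sample coverage is the estimated share of individuals in the whole assemblage that belong to species already detected. The sketch below uses the simple Good-Turing style estimate, one minus the proportion of singletons, as an assumption for illustration; coverage-based methods in the literature use refined versions of this estimator. The function name `good_coverage` and the counts are hypothetical.

```python
def good_coverage(abundances):
    """Simple estimate of sample coverage: 1 - (singletons / individuals)."""
    counts = [a for a in abundances if a > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample contains no individuals")
    f1 = sum(1 for a in counts if a == 1)   # species seen exactly once
    return 1 - f1 / n


# Hypothetical sample: 23 individuals, 4 of them singletons
print(good_coverage([10, 5, 1, 1, 1, 1, 2, 2]))
```

Two samples with the same number of individuals can have very different coverage, which is why standardizing by coverage rather than by sample size places them at comparable points on their rarefaction curves.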
The asymptotic estimators – the Chao estimators – are great, but, as Anne Chao has repeatedly written, these are only minimum asymptotic estimators. Some people have been unsatisfied with them when applied to “big data”, such as genomic datasets or surveys of hyper-diverse microbiomes. There is research ongoing to develop better estimators to use with hyper-diverse faunas. These are just some of the extensions that have followed from the framework laid out in the 2001 Ecology Letters paper.
HS: Does the statistics of diversity and species richness continue to be one of your primary research interests?
NG: Yes. Thanks to this first paper with Rob, I have been collaborating with Anne Chao and other researchers on a number of projects related to biodiversity estimation. For example, in a 2015 Ecological Monographs paper, led by Luis Cayuela, we dissect the effects on rarefaction curves of changes in species richness vs. changes in species composition. I am also in a working group led by Jon Chase at iDiv, where we are developing a set of steps for comparing diversity curves in replicated experiments or diversity surveys. So biodiversity statistics is still a very active research area for me.
HS: Was this the first review paper you wrote?
NG: It may have been the first review paper I wrote, but it came just after the null models book, which ended up being a huge review on all sorts of different topics in community ecology. The literature at that time was not as vast as it is now, so by the time Gary Graves and I finished the null models book, I felt I had a handle on the community ecology literature as a whole. Unfortunately, I don’t feel that way anymore.
HS: Do you enjoy writing review papers?
NG: Yes. They are a chance to pull things together and organise them and make a case for how things ought to be done. I think the goal of a review paper is not simply to exhaustively step through everything that’s in the literature, but to organise it, simplify it, and pull it together. Basically, to provide a framework for it. It’s not just this collection of papers that are cited one after another in a linear sequence. There is a certain development to the ideas that are contained in them, and I enjoy organizing that material into an explicit framework.
HS: Was the material presented in this paper also published in other forms later on – e.g. as a book chapter?
NG: Not that I am aware of. But there’s so much secondary publication that goes on these days, it could be that it has happened and I don’t know! In 2014, Rob and I contributed a chapter to the excellent book Biological Diversity edited by Anne Magurran and Brian McGill, in which we provide an overview covering some of the same topics on the estimation of species richness. That’s a little bit more up-to-date, as it includes recent improvements of the estimators and the theory that Anne Chao has developed since the Ecology Letters review.
HS: Is this among your favourite papers?
NG: Yes, but I feel a little twinge of guilt about its popularity because we were mostly reviewing ideas that were 20 or 30 years old. I’m grateful that it has been cited well, and I’m pleased because I think it has had a positive effect on how people analyse biodiversity data. And, of course, all my papers with Rob are near the top of my list of favourites. But this one seems odd to me because of its relatively small content of truly new information.
HS: Thanks, Nick. That covers all that I wanted to ask you. Before we end, I have to tell you that all three textbooks you have written – A Primer of Ecology, A Primer of Ecological Statistics and Null Models in Ecology – have had a big influence on my research. I keep going back to them even now. Thanks for that too.
NG: That’s very flattering, thank you. You know, it’s funny. I was working on those books as an untenured assistant professor, and I had some senior faculty members ask me whether I knew what I was doing – was it a smart thing to be writing books at this stage in my career? In retrospect, it was a very smart thing to do. Unfortunately, I’m probably better known for those books than for my research papers!
The statistics book grew out of a graduate course I taught without any computers. It was all done on the chalkboard. Many of the ideas for that book actually came from my notes for that course. For that book, I joined forces with another wonderful collaborator, Aaron Ellison, whom I have worked with for the past 20 years on the ecology of carnivorous plants and temperate ants. The stats primer was a lot of fun to write with Aaron, and we completed the first drafts very quickly.
In 2012, Aaron and I published a second edition of the book in which we added two chapters, one on mark-recapture data and one on the estimation of biodiversity. And in 2008, I published a 4th edition of the Primer of Ecology that added a new chapter on biodiversity estimation. So, I am happy that all 3 of those books that you have mentioned have key chapters that are built on the framework developed in the 2001 Ecology Letters paper.