Sorry folks, but the plural of anecdote is data

One of the reasons I started writing this blog was for science outreach. I like to discuss how we do science. Tonight I am going to go back to my roots and discuss an oft-misunderstood concept in science: the role of the anecdote and, through it, the inherent value of simple individual observations in the scientific method. The background for this blog post is a discussion I had over the weekend about which of two mutually exclusive expressions is correct:

The plural of “Anecdote” is “Data”


The plural of “Anecdote” is not “Data”

I regularly hear people mistakenly claiming the latter is true, but the truth of the matter is that the former is true. In this blog post I want to explain why I believe anecdotes have a legitimate place informing our initial hypothesis development and how numerous, mutually supporting individual observations represent a valuable information resource, or, put in plain English, that the plural of anecdote IS data.

To begin, a bit of background. The research project that led to my original PhD thesis research was done when computer databases were in their infancy. It was the end of the Green Plan in Canada and the Canadian government had spent hundreds of millions of dollars collecting baseline data on all sorts of phenomena, but the money wasn’t there to effectively archive the data. I was a recently graduated analytical chemist hired by a research group at the University of Victoria that was looking at thousands of analytical results collected throughout the Strait of Georgia and stored in filing cabinets in numerous federal research facilities. The question our research group was asked to answer: How do we avoid simply wasting all this data when the funding for the Green Plan ran out?

Our decision was to develop a geospatial database which assembled all the information in a location, and format, useful for future researchers. But we were left with a question: how do we help future users understand the strengths and limitations of the data in the database? We had to ask ourselves what differentiates mere observations from useful data and how do we document that difference for future users?

Surprisingly, we discovered that this topic had not been well studied in the academic press. After months of scouring the academic literature, the best answer we could identify was a concept called “process knowledge”. To explain the concept of process knowledge as it was understood in that era, I’m going to steal a paragraph from my earlier work:

From a policy perspective, scientific information consists of two distinct sub-components: scientific data or measurements, and process knowledge. The former are obvious – the levels of a particular contaminant in a fish liver, the nature of the lesions provoked on the fish liver, and the mortality effects of a particular contaminant on fish in bioassays are all examples of scientific data or measurements. Process knowledge describes “the dynamics and interrelationships within natural biophysical and social systems” (Cornford and Blanton, 1993). Process knowledge may be theories, hypotheses, mathematical models, or even suspected correlations culled from a body of observations. Both data and process knowledge are required as inputs to decisions, but in different proportions depending on the questions being addressed.

So now we can see the difference between scientific data and simple observations: it is the existence of an underlying understanding of the process knowledge that links a set of observations to the physical world. That being said, a well-documented observation, when put into the correct context, can graduate from a simple anecdote to a useful data point.

So what is an anecdote? Anecdotes are individual accounts or stories. They are in effect observations unconnected to any specific process knowledge. As such they represent an information resource that can be used, with appropriate scrutiny, to help us understand the world around us.

Going back to the history of science, we all remember the stories of scientists making general observations that informed their hypothesis development. The classic example is Newton’s apple. Whether the story of Newton observing an apple fall from a tree, and using it to help him understand gravity, is apocryphal or literal is not the point; either way, it demonstrates a point. In the development of new theories and hypotheses, the first observations are almost always made absent detailed process knowledge.

Observations, often collected for some completely different purpose, form the first step in understanding our shared environment. Reports of these observations (anecdotes) form the original data points used for virtually every investigation. Sure, not all anecdotes are of equal value; for every story of a useful anecdote, I can provide a story of an anecdote that was misconstrued. I have an entire blog post (Risk Assessment Epilogue: Have a bad case of Anecdotes? Better call an Epidemiologist) dedicated to situations where anecdotal information led to unsupported hypotheses. But that is the point: a single anecdote is not typically terribly useful, but numerous similar anecdotes can inform the development of a hypothesis.

For centuries indigenous peoples collected their observations (usually in oral form). These oral stories represent “anecdotes” by the standard definition of the term and were historically dismissed as not particularly valuable because they were collected without any systematic evaluation of a specific phenomenon or any intent to address specific hypotheses. Yet now virtually the entire scientific community agrees that traditional ecological knowledge (TEK) provides invaluable insights into our shared ecological heritage. The hundreds of years of historical narratives and observations (anecdotal observations) are indeed data by any fair view of the concept.

Let’s take a specific example. The anecdotal reports from generations of fishermen and coast watchers allow us to generate an effective estimate of the extent of the historical habitat for the Southern Resident Killer Whales (SRKWs). Those compiled anecdotes were not systematically collected for the purposes of establishing the extent of the SRKWs’ habitat, but those anecdotes do just that. When numerous independent observers all document the presence of SRKWs off Haida Gwaii, it confirms that this location is part of their historical range. The assembled anecdotes, collected for totally different reasons, are data in the context of establishing these historical extents.

So are all anecdotes useful? Absolutely not. But take a lot of observations that are similar in nature (or mutually supporting) and suddenly you have the baseline data necessary to start developing a hypothesis. Anecdotes can, and do, provide a valuable information source at the initiation of a scientific investigation of a phenomenon, or put another way: the plural of anecdote is indeed data.



8 Responses to Sorry folks, but the plural of anecdote is data

  1. Chester Draws says:

    Yet now virtually the entire scientific community agrees that traditional ecological knowledge (TEK) provides invaluable insights into our shared ecological heritage.

    Insight is not data either.

    We can extract useful information from TEK, but any scientist that puts it on the same level as Western science is being politically rather than scientifically correct. (Of course you’re not allowed to say that in most modern academic institutions, but we all know it’s true.)

    Even in your anecdote about the whales you say “allow us to generate an effective estimate”. But we don’t know that is true. We may think it is an effective estimate, but we simply do not know, because we’re working with inherently unreliable material. Anecdotal evidence may be our best guess, but it isn’t nearly as good as hard data. It may, in fact, be utterly wrong.

    TEK isn’t even necessarily true, which rather spoils its usefulness even as anecdote. Traditional Western ecological knowledge included all sorts of rot (Geese being born from barnacles, matter being made of the earth/fire/air/water etc). If the Pacific Indians know so much, are we to acknowledge TEK and go hunting for a Cadborosaurus now?


  2. Rob Yearling says:

    I think you’ve mixed up your former and your latter.


  3. Ruud Hommel says:

    Wrote the text in MSWord and copy paste did not get the desired result.
    This time without formatting, let’s hope it eventually does not show up three times.

    The initiation for this response lies with Artemisia.

    Not being a scientist, I’m at a disadvantage here.

    Of course, everybody intuitively “knows” what science is and what the word means.
    So I checked a few Wikipedia pages and dictionaries, just to be sure, and came to the conclusion that science is a western invention (sic) and basically is the activity of an organized collection of data from which hypotheses may be drawn, which then have to be proven true by planned, recorded, objective experiments, executed by (under supervision of-) a professional scientist, prior to publication.
    If not so published, it’s not science.

    I had to go through the above as Blair’s post, for me at least, uses a lot of words to well explain the glaringly obvious.

    Although I am a great believer in science, one may well have reservations with regard to the scientific world in general and its financial manipulation specifically.
    These reservations start with the admission that Blair’s PhD originated from a lack of funding for some science problem.

    (1) Scientific misconduct incidents, mainly caused by requirement for funding.
    Maybe not so much in the field of “social sciences”, where political motivations
    may dominate (always happy to get this one in  ).
    (2) Corporations and governments provide funding for science. I have no fear that the
    selection process is aimed for maximum return, financially, politically or otherwise.

    Note: Chester’s comment concerning the Cadborosaurus obviously falls in the minimum return category, but I can happily report that my two children were delivered by a member of the Ciconiidae family.

    What I consider worse is that Artemisia anecdotes (plural=data?) also fall in the same category.
    Even though there has been done some (western type) scientific work on the Artemisia-malaria relation and good empirical research results have been obtained, main stream science does not get funding on the subject.

    No doubt that there are numerous other Artemisia-like cases in very many fields of science which remain “anecdotal”, simply because the very rich western world has too few possibilities to fund the required science. Really?

    So, I would conclude that on moral grounds and beyond the miscreants of (1), there are few hero scientists in the community who have the guts to stand up and demand this very necessary funding.
    Could Malthus play a role here?

    Have fun.

    Sub. (1)
    Sub. (2)


  4. James Joyce says:

    an·ec·dote /ˈanəkˌdōt/ noun: a short amusing or interesting story about a real incident or person.
    Many short amusing stories are not data – sorry – no one thinks that except maybe a sociologist and if you are one of those – then apologies in advance.


    • Blair says:

      Read the article, then try to comment intelligently.


      • Chris R says:

        Of course anecdotal information can become data, but, like your example of the historical range of SRKWs, it needs to be normalized and placed within a larger framework in order to be useful. The phrase “The plural of anecdote is not data” is far more about people assuming that a pile of stories somehow becomes useful information without any other action. A case in point would be the spread of misinformation regarding vaccination via social media. “I hear a lot about this so it must be true.”

        So both phrases can absolutely be true but both come with caveats.


  5. William Lewis says:

    Thank you so much for this. The knee-jerk rejection of anecdotes as “not data” has always had me facepalming. I think this has something to do with data itself being placed on a pedestal, forgetting that all data comes with intrinsic reservations as to its quality, and it is this quality that must be examined in context and in light of other data in order to arrive at a conclusion about whatever it is you are trying to determine.

    For example, an oft-ridiculed argument by flat-earthers is that the world looks flat to a person looking out at the horizon. It is easy to forget that this observation is actually data; but we have a wealth of other data, and models that account for this observation, that together convince us that the world is, in fact, round.

    It may not be often that anecdotal data can be overwhelming enough to cast doubt on a pronouncement by the scientific community, but it is not outside the realms of possibility. One could imagine scientists putting out a conclusion, for whatever reasons, that gravity exerts force in an upward direction. I would hope that anyone listening to such a pronouncement would reject it out of hand, on the basis of an overwhelming tide of anecdotal data that we are not, in fact, flying off of the (flat) planet. But this gets way too much into the arena of politics for my taste, so I will just conclude this comment by pointing out that “There are five lights!”

    Sorry, four. My bad.

