In 2017, I got an email from an anthropologist commenting on a new report in the Proceedings of the Royal Society. The topic of that report was Bigfoot—or rather, a genetic analysis of hairs that people over the years have claimed belong to a giant, hairy, unidentified primate.
The international collaboration of scientists, led by University of Oxford geneticist Bryan Sykes, found no evidence that the DNA from the hairs belonged to a mysterious primate. Instead, for the most part, it belonged to decidedly unmysterious mammals such as porcupines, raccoons, and cows.
My correspondent summed up his opinion succinctly: “Well, duh.”
This new paper will not go down in history as one of the great scientific studies of all time. It doesn’t change the way we think about the natural world, or about ourselves. But it does illustrate the counterintuitive way that modern science works.
People often think that the job of scientists is to prove a hypothesis is true—the existence of electrons, for example, or the ability of a drug to cure cancer. But very often, scientists do the reverse: They set out to disprove a hypothesis.
It took many decades for scientists to develop this method, but one afternoon in the early 1920s looms large in its history. At an agricultural research station in England, three scientists took a break for tea. A statistician named Ronald Fisher poured a cup and offered it to his colleague, Muriel Bristol.
Bristol declined it. She much preferred the taste of a cup into which the milk had been poured first.
“Nonsense,” Fisher reportedly said. “Surely it makes no difference.”
But Bristol was adamant. She maintained that she could tell the difference.
The third scientist in the conversation, William Roach, suggested that they run an experiment. (This may have actually been a moment of scientific flirtation: Roach and Bristol married in 1923.) But how to test Bristol’s claim? The simplest thing that Fisher and Roach could have done was pour a cup of tea out of her sight, hand it to her to sip, and then let her guess how it was prepared.
If Bristol got the answer right, however, that would not necessarily be proof that she had an eerie perception of tea. With a 50 percent chance of being right, she might easily answer correctly by chance alone.
Several years later, in his 1935 book The Design of Experiments, Fisher described how to test such a claim. Instead of trying to prove that Bristol could tell the difference between the cups of tea, he would try to reject the hypothesis that her choices were random. “We may speak of this hypothesis as the ‘null hypothesis,’ ” Fisher wrote. “The null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation. Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis.”
Fisher sketched out a way to reject the null hypothesis—that Bristol’s choices were random. He would prepare eight cups, putting milk first into four of them, and milk second into the other four. He would scramble the cups into a random order and offer them to Bristol to sip, one at a time. She would then divide them into two groups—the cups that she believed had received milk first would go in one group, milk second in the other.
Bristol reportedly passed the test with flying colors, correctly identifying all eight cups. Thanks to the design of Fisher’s experiment, the odds that she would divide eight cups into two groups correctly by chance were small. There were 70 different possible ways to divide eight cups into two groups of four, which meant that Bristol could identify the cups correctly by chance only once out of every 70 trials.
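The arithmetic behind that figure is easy to check. The sketch below is only an illustration; the function name `chance_of_perfect_split` is made up for this example and does not come from Fisher. It counts the ways to choose which four of eight cups go in the milk-first group, and assumes a pure guesser is equally likely to pick any one of them.

```python
from math import comb

def chance_of_perfect_split(total_cups: int, milk_first_cups: int) -> float:
    """Probability that a pure guesser picks exactly the right milk-first cups.

    Assumes every way of choosing the milk-first group is equally likely.
    """
    ways = comb(total_cups, milk_first_cups)  # number of possible groupings
    return 1 / ways

ways = comb(8, 4)
p = chance_of_perfect_split(8, 4)
print(ways)        # 70
print(f"{p:.1%}")  # 1.4%
```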
Fisher’s test couldn’t completely eliminate the possibility that Bristol was guessing. It just meant that the chance she was guessing was low. He could have reduced the odds further by having Bristol drink more tea, but he could never reduce the chances she was guessing to zero.
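To see how more tea shrinks the odds without ever erasing them, here is a rough sketch. It assumes the same design scaled up, with half the cups milk-first and half milk-second. The cup counts beyond eight are hypothetical and are not taken from Fisher's account.

```python
from math import comb

def guessing_probability(total_cups: int) -> float:
    """Chance of sorting every cup correctly by luck, with half the cups milk-first."""
    half = total_cups // 2
    return 1 / comb(total_cups, half)

# Hypothetical larger experiments: the chance keeps falling but stays above zero.
for cups in (8, 12, 16, 20):
    print(cups, guessing_probability(cups))
```

With 20 cups there are 184,756 possible groupings, so a lucky guess becomes extremely unlikely, yet it remains possible.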
Since absolute proof was impossible, Fisher preferred to be practical when he ran experiments. At the lab where he and Bristol worked, Fisher was charged with analyzing decades of collected data to determine whether that information could reveal practical details, such as the best recipe for crop fertilizer. Scientists could use that data to design ever larger experiments with increasingly accurate results. Fisher thought it would be pointless to design an experiment that needed centuries to yield results. At some point, Fisher believed, scientists had to just call it a day.
He believed that a sensible threshold was 5 percent. If we assumed that the null hypothesis was true and found that the odds of observing the data were less than 5 percent, then we could safely reject it. In Bristol’s case, the odds were comfortably below Fisher’s threshold, at just 1.4 percent.
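As a sketch of how the threshold works in practice, the code below compares the chance of a result against 5 percent. The names `ALPHA` and `p_value` are invented for this illustration, and the "three out of four" case is a hypothetical scenario added for contrast, not something reported about Bristol. It assumes the eight-cup design, where naming at least a given number of milk-first cups correctly by luck follows from simple counting.

```python
from math import comb

ALPHA = 0.05  # Fisher's 5 percent threshold

def p_value(correct_milk_first: int, cups_per_group: int = 4) -> float:
    """Chance, under random guessing, of getting at least this many milk-first cups right."""
    n = cups_per_group
    total = comb(2 * n, n)  # all ways to pick the milk-first group
    # Ways to pick k true milk-first cups and n - k milk-second cups, summed over k.
    favorable = sum(
        comb(n, k) * comb(n, n - k) for k in range(correct_milk_first, n + 1)
    )
    return favorable / total

for k in (4, 3):
    p = p_value(k)
    verdict = "reject the null" if p < ALPHA else "cannot reject the null"
    print(k, round(p, 3), verdict)
```

Under these assumptions a perfect score gives 1 in 70, below the threshold, while a single swapped pair of cups gives 17 in 70, or about 24 percent, which is far too likely under guessing to reject the null hypothesis.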
Thanks in large part to Fisher, the null hypothesis has become an important tool for scientific discovery. You can find tests of null hypotheses in every branch of science, from psychology to virology to cosmology. And scientists have followed Fisher in using a 5-percent threshold.
Which brings us back to Bigfoot.
People have been claiming they’ve seen hairy humanoids for decades. They’ve offered up grainy photos, ambiguous casts of footprints, and enigmatic clumps of hair. In recent years, they’ve even tried to extract DNA from the hair, but scientists have dismissed these genetic studies because they didn’t involve standard safeguards routinely used in such research.
Bigfoot advocates have repeatedly claimed that professional scientists are willfully ignoring compelling evidence. The problem, in fact, is that the advocates haven’t been approaching the question of Bigfoot in a scientific fashion. So two years ago Sykes and his colleagues decided to run a scientific study of those hairs from an “anomalous primate.” And that involved creating a null hypothesis to try to reject.
The null hypothesis they developed was this: The hairs purported to come from Bigfoot (or the Abominable Snowman or other regional varieties of the creature) belonged not to a previously unknown primate, but to known mammals. They extracted DNA fragments from 30 different hair samples and were able to isolate the same short stretch of DNA from each. They then compared that stretch to the corresponding stretch of DNA sequenced from many living mammals.
The results were clear: The scientists found precise matches for all 30 samples in previously known mammals.
Does this mean Sykes and his colleagues have proved that Bigfoot does not exist? No. It simply means that Sykes, unlike Fisher with his tea test, could not reject the null hypothesis. The question remains open, and—if Bigfoot doesn’t exist—always will.
That’s not to say Sykes’ study didn’t offer its own surprises. Two hair samples from the Himalayas matched a DNA sequence that was extracted from a 40,000-year-old fossil of a polar bear. Stranger still, their DNA was not a match to living polar bears.
In their report, Sykes and his colleagues offer a scenario for how such a result could have come about. It’s possible that ancient polar bears and brown bears interbred, and some living bears in the Himalayas still carry a bit of that ancient polar bear DNA.
Some skeptics have offered up an alternative explanation for Sykes’ finding. It’s possible that the polar bear-like DNA actually comes from a living mammal—perhaps a brown bear—that happened to pick up a few mutations that created a false resemblance to that ancient polar bear DNA.
What these skeptics have done, in effect, is create a null hypothesis. And there’s a straightforward way to set about disproving it. Scientists would need to find more DNA from these mysterious bears. If other regions of the DNA also matched ancient polar bears, then scientists could reject the null hypothesis.
And so science carries on, from one null hypothesis to another.
Carl Zimmer is a columnist for The New York Times and the author of 12 books, including A Planet of Viruses.