Prologue: I have been working on this blog post about failure for months and kept cutting bits out of it because it was so long and boring. This is actually the second-to-last version, because I cut out a bit too much in the end, and that’s how I knew it was done. (I wish they could do that at the hairdresser: “No, this is too much, can you just undo that last bit?”)
The post originated from a session I held at BioBarCamp in August of this year. There was another part to the story – that of the value of alternative careers – but I left that out of this. I’ll probably get back to it at some point. Because I took so long writing it, I also added things that I did not mention in Palo Alto, such as the Douglas Prasher story.
During the writing process I needed to purge some other thoughts, which led to this earlier blog post in October, Intro to Failure.
I’m completely bored with this post now, having looked at it on and off for the past 5 months, but here it is:
Science is an extremely competitive line of work, and the unit of success is the publication record. You need many publications in good journals to get good jobs and to get funded. These good journals publish interesting work. To be interesting, you also need to be original. Original experiments are those that have not been done before, which means you don’t know, going in, whether they will even work.

Whether a new type of experiment works on the first try has nothing to do with how smart the scientist is or how hard they work. It’s luck. But the person lucky enough to get everything to work right away will be the first to publish.

Meanwhile, someone who wasn’t as lucky on the first attempt will have switched methods, talked to a lot of people to figure out how to get the experiment to work, read some more papers, tried other experiments, and still has not even started to test their initial hypothesis, let alone think about publishing. At some point they might give up and study something else, at which point they have to start at square one.
Are they any less smart than the person who already published their paper? No. Have they worked any less hard? No. They might even have worked harder. Do they have any less experience? Again, no: they may have looked into more techniques than the person who got everything to work right away, and they certainly have a lot more experience in troubleshooting problems! But that doesn’t count. No publication means failure in this line of work.
It is this publish-or-perish pressure that drives some people to extreme measures in an attempt to keep their jobs. My favourite example is William Summerlin, who, in 1974, painted mice with Sharpies and claimed that the darkened patches were successful skin transplants. There have been several other reported cases of fraud, and those are just the ones that were caught. Many others commit scientific fraud and get away with it. Worse, many scientists commit fraud without even realizing they are doing so.
The pressure is not just to publish, but to publish positive data. But scientific data, especially in biology, are usually not simply positive or negative. There is a spectrum of possibilities, and while some results are very obviously negative or positive, most are somewhere in the middle. Where do we draw the line? By convention we use statistics.
Results from biological experiments are considered significant (positive results) when the P-value is smaller than 0.05, and not significant (negative results) when the P-value is greater than 0.05. A P-value of 0.05 means that, if there were no real effect, the probability of getting a result at least as extreme as yours purely by chance is 5%. If the P-value is 0.049, that chance is 4.9%, and if P equals 0.051 the chance is 5.1%. Of these examples, only P=0.049 would be considered statistically significant and publishable.
Do you think it really matters, biologically speaking, if the P value that comes out of your calculations is 0.049 or 0.051? Nah.
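To see just how thin that line is, here is a quick sketch in Python. The one-tailed z-test is my own illustrative assumption (it is not something from this post); the code recovers the z statistic behind a given one-tailed P-value by bisection and shows that P=0.049 and P=0.051 correspond to nearly identical test statistics:

```python
import math

def z_from_p(p):
    """Find the z statistic whose one-tailed tail probability equals p,
    by bisection on the standard normal survival function."""
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        tail = math.erfc(mid / math.sqrt(2)) / 2  # P(Z >= mid)
        if tail > p:
            lo = mid  # tail still too big: z must be larger
        else:
            hi = mid
    return (lo + hi) / 2

print(round(z_from_p(0.050), 3))  # conventional cutoff, z ≈ 1.645
print(round(z_from_p(0.049), 3))  # "significant", z ≈ 1.655
print(round(z_from_p(0.051), 3))  # "not significant", z ≈ 1.635
```

The two “different” outcomes sit about one percent apart on the underlying scale; the cutoff is a convention, not a biological boundary.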
A few years ago, a study looked at the P-values reported in political science papers. The social sciences use the same rule: “positive data” are those with a P-value just under a certain cutoff point. If you plot how far each reported P-value falls below or above the cutoff, you would expect a roughly bell-shaped distribution: very few values far from the cutoff, and most of them close to it. Because of a bias towards positive data, you might even expect more values under the cutoff than over it. But what was found was not only a very high peak just under the cutoff value, but also a dip just above it:
(It shows distance from critical value (c.v.), which is 1.64 for one-tailed tests, and the dotted line is the critical z statistic for p=0.05. This explains it better.)
There were fewer P-values just a little bit too high than there were P-values much too high. The blog post I linked to suggests that this happened because when people found a P-value of 0.051, they simply repeated the study until they got 0.049, a value that is publishable. It also suggests that this only happened near the cutoff point.
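That “repeat until P is small enough” practice is easy to simulate. Here is a minimal sketch in Python; the fair-coin experiment, the normal approximation, and the best-of-three repetition are all my own illustrative assumptions, not details from the study. Even with no real effect anywhere, keeping only the best P-value out of three runs roughly triples the false-positive rate:

```python
import math
import random

random.seed(42)

def p_value(heads, n):
    """Two-tailed P-value for observing `heads` in `n` fair-coin tosses,
    using the normal approximation to the binomial."""
    z = (heads - n / 2) / math.sqrt(n / 4)
    return math.erfc(abs(z) / math.sqrt(2))  # P(|Z| >= |z|)

def run_once(n=100):
    """One null experiment: a fair coin, so any 'effect' is pure chance."""
    heads = sum(random.random() < 0.5 for _ in range(n))
    return p_value(heads, n)

def best_of(tries, n=100):
    """Repeat the experiment and keep only the smallest (best) P-value."""
    return min(run_once(n) for _ in range(tries))

trials = 2000
honest = sum(run_once() < 0.05 for _ in range(trials)) / trials
fudged = sum(best_of(3) < 0.05 for _ in range(trials)) / trials
print(f"false-positive rate, single run: {honest:.3f}")  # close to the nominal 0.05
print(f"false-positive rate, best of 3: {fudged:.3f}")   # roughly three times higher
```

Nothing in the simulated data ever changes; only the stopping rule does, and that alone is enough to manufacture “positive” results.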
Such practices undoubtedly occur in other fields as well. Is this bad? Not for the authors: they got their paper out. Not for research in general: results with P=0.051 are no less relevant than those with a P-value of 0.049. These are no Sharpie-coloured mice.
But the fact that people feel the need to fudge their data (because that is what not reporting the P=0.051 values amounts to) in order to publish and secure their jobs, that is bad. It means there is something wrong with the system by which success in science is determined.
There are other clues that something may be wrong. If the system worked, people with the potential to be the best and most necessary in their field would not lose their funding. You might say we will never know: once they are gone, how do you determine whether they were the best in their field?
Well, you can’t, really. Unless that person has left an obvious legacy: Two of the winners of this year’s Nobel Prize in Chemistry could not have done their work if they had not received the DNA for Green Fluorescent Protein (GFP) from Douglas Prasher, the guy who initially isolated the gene. Sure, if he hadn’t done it, someone else would have, but Prasher did it first, and that’s what counts, right?
Prasher no longer works in science. His grant money ran out, and he is now driving a shuttle bus for a car dealership in Huntsville, Alabama. If someone working on groundbreaking, Nobel-worthy research cannot keep his lab running, how is anyone else supposed to be able to do so?
If only positive, exciting data are a measure of success – as expressed in the unit of The Published Paper In A Good Journal – then a lot of things get left out. Incremental projects, such as cloning a gene (but not doing many groundbreaking experiments with it), are still relevant. And if the difference between positive and negative data is for a large part based on luck, isn’t science more a gamble than a career?
(This was based in part on a session I ran at BioBarCamp, on August 7, 2008, in Palo Alto. Thanks to Michael Nielsen for telling me the statistics story (twice), and thanks to the many people who have discussed scientific publishing with me, both online and offline.)