
People don't investigate things like that, because, frankly, no one cares. Nor should they. If your computer is acting funny, and you find out a spider has made a nest inside it, and it works fine once you clear out the spider nest, would you then decide to determine exactly why that spider nest, in that place, causes the exact problems you observed? No, of course not. Because that's not an interesting question. Computers are complicated machines, and they can break in lots of different ways, most of which are not very interesting in their details.

Similarly, suppose you study cultured cells (which are notoriously finicky), and you want to compare what the cells do in the presence of drug X vs control. But at first, you find that all of your control cells die. And eventually you find out that if you use a certain brand of bottled water instead of tap water, the control cells thrive. Are you seriously proposing that you should then drop all work on drug X, and get to work determining exactly what is up with the tap water in your town? I mean, maybe that would be a fruitful research avenue, if you're worried that the tap water isn't safe for human consumption, or you think that there's something interesting about exactly how the tap water is killing your control cells. But most of the time, investigating the tap water would be an expensive distraction from the question you actually want to answer. And most scientists, I think, would (reasonably) decide to get on with investigating the effects of drug X on their cells, and not worry too much about precisely why the tap water killed the control cells. And I don't think there's anything wrong with that. Life is short, and you have to choose your questions carefully.



So you just accept for no reason that tap water is bad somehow, and discard the result you've just gotten?

I do understand that you have a limited amount of time, and can't just go after everything, but when something happens in science, it needs to be documented. Yeah, maybe someone else should investigate, but someone should. Maybe that particular phenomenon that led to the water influencing your result will give you knowledge about cell metabolism. Who knows? If it has that much of an effect on cell growth that you need to deal with it, it's already more active than a lot of compounds we try out, anyway...

To go back to the computer analogy, it feels like my program is buggy, and to debug it I'm changing variable names (which, as far as I know, shouldn't matter), and then the code magically works again. Sure, some days I'll go "Ok, compiler magic, got it", but most days I'd be pretty intrigued, and I'd look into it, because yeah, I might just have found a GCC bug.

I agree, no one cares, but I did. I don't know what I don't know yet, and I don't want to presume anything. The tap water thing might actually lead us to solid models which would explain why tap water breaks the experiment. That's why I really think we should start a movement of publishing everything, and trying to deal with simpler models/systems we do understand before going up to models with so many unknowns that the results are basically a dice roll.


This makes me think of Feynman's comments on Cargo Cult Science:

"In 1937 a man named Young did a very interesting [experiment]. He had a long corridor with doors all along one side where the rats came in, and doors along the other side where the food was. He wanted to see if he could train the rats to go in at the third door down from wherever he started them off. No. The rats went immediately to the door where the food had been the time before.

The question was, how did the rats know, because the corridor was so beautifully built and so uniform, that this was the same door as before? Obviously there was something about the door that was different from the other doors. So he painted the doors very carefully, arranging the textures on the faces of the doors exactly the same. Still the rats could tell. Then he thought maybe the rats were smelling the food, so he used chemicals to change the smell after each run. Still the rats could tell. Then he realized the rats might be able to tell by seeing the lights and the arrangement in the laboratory like any commonsense person. So he covered the corridor, and still the rats could tell.

He finally found that they could tell by the way the floor sounded when they ran over it. And he could only fix that by putting his corridor in sand. So he covered one after another of all possible clues and finally was able to fool the rats so that they had to learn to go in the third door. If he relaxed any of his conditions, the rats could tell.

Now, from a scientific standpoint, that is an A-number-one experiment. That is the experiment that makes rat-running experiments sensible, because it uncovers the clues that the rat is really using -- not what you think it's using. And that is the experiment that tells exactly what conditions you have to use in order to be careful and control everything in an experiment with rat-running.

I looked up the subsequent history of this research. The next experiment, and the one after that, never referred to Mr. Young. They never used any of his criteria of putting the corridor on sand, or being very careful. They just went right on running the rats in the same old way, and paid no attention to the great discoveries of Mr. Young, and his papers are not referred to, because he didn't discover anything about the rats. In fact, he discovered all the things you have to do to discover something about rats. But not paying attention to experiments like that is a characteristic example of cargo cult science."

http://neurotheory.columbia.edu/~ken/cargo_cult.html


"So you just accept for no reason that tap water is bad somehow, and discard the result you've just gotten?"

What is the "result" that you're referring to here? That the tap water in town X kills cultured cells of type Y? Yeah, I guess you could try writing that up and publishing it, but that's a good way to waste a lot of time publishing results that are interesting only to a very small audience. Honestly, if it were me, I'd send an e-mail to people in the same town who might be working with cells of the same or similar type, and then move on.

"but when something happens in science, it needs to be documented"

No, it really doesn't. Stuff doesn't document itself. That takes time, which sometimes is better spent doing other things. Like answering more interesting questions.

"Maybe that particular phenomena that lead to the water influencing your result will give you knowledge about cell metabolism. Who knows?"

That's true, but my point is that it doesn't make you a bad scientist if you shrug your shoulders about why the tap water kills your cells, and get on with your original experiment. For every experiment you do, there are a million others you're not doing, and so it makes sense to focus on the one experiment you're most interested in, not chase after a bunch of side-projects that will (probably) not lead to any kind of an interesting result.

And all of this is very different from a case where you compare control to treatment, find no difference, and therefore just start fiddling with other experimental parameters until you do get a difference between control and treatment. That is, I think you'll agree, a bad way to do science.
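To see why it's bad, here's a minimal simulation sketch (toy numbers of my own, in Python with numpy/scipy; nothing here is from the thread): control and treatment are drawn from the same distribution, so there is no real effect, but the experimenter keeps "fiddling" and re-running until some run comes out significant.

    # True null: control and treatment come from the same distribution,
    # but we re-run with "new parameters" until p < 0.05.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)

    def fiddle_until_significant(tries=20, n=30, alpha=0.05):
        for _ in range(tries):
            control = rng.normal(0, 1, n)
            treatment = rng.normal(0, 1, n)  # same distribution: no effect
            if ttest_ind(control, treatment).pvalue < alpha:
                return True  # "found" a difference
        return False

    hits = sum(fiddle_until_significant() for _ in range(1000))
    print(f"Spurious 'effect' found in {hits / 10:.1f}% of null experiments")
    # Roughly 1 - 0.95**20, i.e. about 64% of null experiments 'succeed'.

Each individual run honors the nominal 5% false-positive rate; it's the stopping rule -- keep fiddling until it works -- that breaks the guarantee.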

"To go back to the computer analogy, it feels like my program is bugged, and to debug it I'm changing variables names (which as far as I know shouldn't matter), and then the code magically works again. Sure, some days, I'll go "Ok, compiler magic, got it", but most days I'd be pretty intrigued, and I'd look into it, because yeah, I might just have found a GCC bug."

I think a better analogy would be if compiler X acts in ways you don't understand, so you switch to GCC, which (most of the time) works as you expect. Are you really required to figure out exactly why compiler X acts as it does? Or would you just get on with using a compiler that works the way you think it should?

"I agree, no one cares, but I did. I don't know what I don't know yet, and I don't want to presume anything. The tap water thing might actually lead us to solid models which would explain why tap water breaks the experiment."

They might, but there are a lot of experiments that have a very, very small chance of an interesting outcome, and a near-one chance of a pedestrian outcome. You can do those experiments, and you might get lucky and get the interesting result, but probably you will just get the pedestrian result. And there's nothing wrong (and a lot right) with instead focusing on experiments where, for instance, either outcome would be interesting to people in the field.

"That's why I really think we should start a movement of publishing everything, and trying to deal with simpler models/systems we do understand before going up to models with so many unknowns that the results are basically a dice roll."

I think you're conflating a number of things here. I agree that a reductionist approach to science has borne a lot of fruit, historically. I agree that studying systems with a lot of unknowns has risks. And it may be that "publish everything" would work better than what we have now. But even if scientists all decide to publish everything they do, they'll still have to make strategic choices about what experiment to do on a given day, and in many cases that will mean not doing a deep dive into questions like "why does the tap water kill my cells, even in the controls?"


I think we'll have to agree to disagree on most of those points then.

I do not think there are trivial/uninteresting questions. You have to prioritise, but you can't just sweep stuff under the rug and call it a day. I'm not even using the "it might be huge!" argument, just that science is about curiosity. Most math won't end up being as useful as the math behind cryptography turned out to be, but that doesn't matter.

I do think that it is part of your job, as a scientist, to document what you do, and what you observe. If a software engineer on my team didn't document his code/methodology correctly, he'd be reprimanded, for good reason. Yeah, it takes time, but it's part of the job. This way, we avoid having 4 people independently rediscovering how to set up the build tools.


Do you not keep notes as you do experiments?

    * tap water---control failed
    * bottled water (generic store brand)---control failed
    * distilled water---control succeeded
and then when writing up the experiment, mention the use of distilled water? You might not be interested in why only distilled water worked, but someone else reading the paper might think to investigate.


This! All that stuff should be in "Materials and Methods". Also sources for all reagents, with lot numbers. Etc, etc.


The problem with everything you've said is that statistical significance tests are almost always statistical power tests -- do you have enough statistical power given the magnitude of the effect you've seen? The underlying assumption of something like a p-value test is that you are applying it to all the data you sampled from the unknown distribution.

If it is standard laboratory procedure to discard results that are aberrant and to repeat tests, and then to apply the p-value test ONLY to the results that conform to some prior expectation, then the assumptions underlying the p-value test are not being followed -- you're not giving it ALL the data that you collected, only the data that fits with your expectations. Even if this is benign the vast majority of the time -- if 99.9% of the times you get an aberrant result are the result of not performing the experiment correctly -- using the p-value test in a way that does not conform to its assumptions increases the likelihood of invalid studies being published.
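To make that concrete, here's a rough simulation sketch (toy numbers of my own, assuming normal data and a two-sample t-test; the comment above doesn't specify any of this): both groups are drawn from the same distribution, but "aberrant" low readings are discarded from the treatment group only, the way a lab might quietly drop wells that "must have gone wrong".

    # True null, but low treatment readings are discarded as "aberrant"
    # before the t-test, while every control reading is kept.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(1)

    def false_positive(n=30, alpha=0.05):
        control = rng.normal(0, 1, n)
        treatment = rng.normal(0, 1, n)
        # Selective exclusion: drop treatment values more than one
        # standard deviation below the group mean.
        treatment = treatment[treatment > treatment.mean() - treatment.std()]
        return ttest_ind(control, treatment).pvalue < alpha

    trials = 2000
    rate = sum(false_positive() for _ in range(trials)) / trials
    print(f"False-positive rate with selective exclusion: {rate:.1%}")
    # Delete the exclusion line and this sits near the nominal 5%;
    # with it, the rate comes out several times higher.

The t-test itself is fine; what invalidates it is that the data handed to it was filtered on the very quantity being tested.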


"That's why I really think we should start a movement of publishing everything, and trying to deal with simpler models/systems we do understand before going up to models with so many unknowns that the results are basically a dice roll."

I would love to see this implemented, and encouraged in every lab around the world! It's not like we don't have computer programs that could collate all the data.

I don't think I will ever see this happen, because the truth is not what companies want. They want a drug/theory they can sell. It's a money game to these pharmaceutical companies in the end. Companies, and I believe many researchers, want positive results, and will hide or cherry-pick the experiments/studies that prove their hypotheses. I know there must be honest, agenda-free researchers out there, but I have a feeling they are not working for organizations with the money to fund serious, large scale projects.

Take, for instance, Eli Lilly, which has a history of keeping tight control over its researchers. The history of Prozac is a good example of just how money produces positive results:

"Eli Lilly, the company behind Prozac, originally saw an entirely different future for its new drug. It was first tested as a treatment for high blood pressure, which worked in some animals but not in humans. Plan B was as an anti-obesity agent, but this didn't hold up either. When tested on psychotic patients and those hospitalised with depression, LY110141 - by now named Fluoxetine - had no obvious benefit, with a number of patients getting worse. Finally, Eli Lilly tested it on mild depressives. Five recruits tried it; all five cheered up. By 1999, it was providing Eli Lilly with more than 25 per cent of its $10bn revenue."

(1) I love how Prozac was tested on mild depressives. Don't current prescribing guidelines only recommend the administration of Prozac for the most seriously ill -- the clinically depressed? Actually, no, it's still recommended for a myriad of disorders. Wasn't Prozac proved to be only slightly better than placebo? If you dig deeper, there are some sources that don't see any benefit over placebo.

(2) Wouldn't patients/society benefit from seeing all of the studies Eli Lilly presented to the FDA, not just the favorable ones? How many lives would have been saved if this drug had been given an honest evaluation -- if every study had been published, and put through statistical calculations in the 90's? Think about the millions of patients who suffered terrible side effects of this once terribly expensive drug. Think about the birth defects that could have been prevented.

So yes, I would love to see everything published, but I don't think the business cycle/politics will ever allow it. They want results! They want money! They want endowments! It's a sick game, until you are the one needing help -- help that only honest, good science can produce.

http://www.theguardian.com/society/2007/may/13/socialcare.me...


Making sure that everyone working in the field knows not to use tap water seems worth doing, though, even if the reason why isn't understood yet. It sounds like replication is a problem because this sort of knowledge isn't shared widely enough?




