Usenet discussions sometimes contain demands that claims be documented by peer-reviewed papers. It is natural to ask how much credence we can place in such sources. Are there, perhaps, biases that are not acknowledged?
Evaluating the reliability of peer review is tricky because it really does depend on the area of research. Basically, a research paper should tell the reader what was done, how it was done, what the conclusions were, and how they were arrived at. It should be complete enough that another person could reproduce and check everything, from the measurements onward.
Let’s suppose I’m going to referee a paper (I’ve done this and I’ve written papers). What do I look for? First of all, is the bibliography complete and correct? Papers aren’t texts – they aren’t self-contained. Everything in the paper should have a source: either it should be data that is described in the paper, or it should come from another paper or book. Part of my duties as a referee is to check that everything in the paper does have a source. More than that, it should be a reliable source, at least for the purpose for which it is used in the paper. The idea is that you should be able to check everything back to its original source. Are the statements drawn from the references correct?
Next, I need to check whether the methodology is sound. The authors gathered some data or did an experiment. Is the methodology sufficiently well described that the experiment is repeatable? Does the methodology have flaws?
Thirdly, I have to follow the lines of argument that lead up to the conclusions. Are they sound? Are they logically valid? If there are equations or derivations, are they done correctly? Does the argument rely on anything that is not supplied in the paper?
All of this is what I would call ordinary scholarship. It’s basically a matter of making sure that all of the bits and pieces are there, that the mechanics of writing a solid paper have been observed.
The final question is whether the paper is significant and relevant, i.e., whether it has anything of importance to say and whether it is appropriate for the journal in which it is to be printed.
The referee’s task would usually be hopelessly formidable if the referee were not already familiar with the field. This is the idea behind peer review – the referees should be people who are already familiar with the literature in the field and with the experimental techniques used.
Peer review is ordinarily done blind – this means that the author is not told who the reviewers are. Some reviews are done double blind – in double-blind reviewing the reviewers don’t know who the author is either. Double blind is better because it removes an element of bias. Note, however, that in many fields everybody is familiar with everybody else’s work, so people have little trouble figuring out who the reviewers and the authors are.
The process of getting a paper published cycles through several stages of review. First the reviewers read the paper and make their comments. These are fed back to the author, who revises the paper to make suggested corrections and improvements and to answer criticisms that he or she feels are incorrect. When this has all been shaken out, the paper is either rejected (a common fate) or published.
All of this should sound very good – all very painstaking with all of the possible sources of error made as small as possible. And in fact the process is pretty good. How could it go wrong? Easily. Here are some ways:
One final problem is that a good paper has two kinds of results – those that are definitely established by the data in the paper, e.g., the T-rex tooth was measured to be 14.3 cm long, and tentative conclusions, e.g., “we argue that T-rex was a carrion scavenger rather than an active carnivore”. Tentative conclusions are supported by evidence and a line of argument that is not conclusive. Most of the arguments in science are about the tentative conclusions. (Surprise!) However, they are very important because, when enough evidence is pieced together, they make the transition from hypothesis to well-established fact.
Now all of this works pretty well in the “hard” sciences. Things are not so rosy in the “soft” sciences – psychology, sociology, et cetera. In the hard sciences they have the trick of focusing down to a narrow area; it is legitimate to do this because they can eliminate irrelevant factors. This trick doesn’t work in the soft sciences – people and their institutions are very complicated. What is more, these complications cannot be reliably broken up into separate, independent pieces.
What this means is that our problems (2) and (3) are par for the course. As a direct consequence, the “tentative conclusions” tend to be wilder, less solidly reasoned, and less solidly supported. There are two common responses to this situation:
There is a further problem in these areas. They often become politicized. What usually happens then is that the data, the methodology, the references and the arguments are all manipulated to produce the desired result. Peer review breaks down because all of the peers share the same political agenda. A lot of the work on gender differences and gender roles is politicized.