
“Best practices” has become one of those phrases that no one really argues with anymore. It shows up in grant calls, reviewer guidelines, PhD training, journal checklists. It sounds reasonable. It sounds responsible. It sounds like something that should not need defending. That alone should raise some suspicion.
In practice, a lot of research that follows best practices very closely is not especially good. It is tidy, careful, and ultimately uninformative. At the same time, a fair amount of work that makes people uncomfortable on methodological grounds turns out to be the work that actually moves understanding forward. This is not an argument against rigor. It is an argument against pretending that rigor has a single, stable form.
Most best practices were invented to fix something specific that had clearly gone wrong.
Preregistration became popular because people were running many analyses and reporting only the ones that worked. Power calculations became standard because small samples were producing wildly unstable effects. Reporting guidelines emerged because papers routinely left out decisions that mattered.
None of this was controversial at the time. These were practical responses to visible problems.
What changed is that the practices outlived the context they were designed for. The problems evolved, but the fixes stayed frozen. Over time, the reasons were forgotten and the procedures remained.
At that point, following the practice stopped being about solving a problem and started being about signaling that the paper belonged.
Many method sections now read like documents written under mild duress. Everything that is supposed to be there is there. The language is familiar. The order is predictable.
What is often missing is any sense that the authors thought hard about whether these choices actually helped answer the question.
This is not because researchers are lazy or confused. It is because deviation carries risk. A conventional decision that is weak rarely draws attention. An unconventional decision, even a well-justified one, often does.
Over time, this shapes behavior. Researchers learn what is safe. They learn what reviewers expect to see. They learn how to preempt criticism rather than how to clarify uncertainty.
The method section becomes less about explaining how the study works and more about demonstrating that it passes inspection.
Most methodological disagreements are framed as arguments about standards. In reality, they are arguments about fit.
Exploratory work is a good example. In many fields, especially those dealing with complex systems, researchers do not know what the right hypotheses are at the outset. They learn them by interacting with the data. Trying to force this process into a confirmatory mold often produces preregistrations that are technically correct and conceptually meaningless.
Research on rare conditions presents a different kind of mismatch. Sample sizes are small because populations are small. A power calculation can tell you that the study is underpowered. It cannot tell you what to do about that. Treating this as a methodological failure misses the point. The real question is how cautiously the results are interpreted, not whether they meet an idealized threshold.
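To make that concrete, here is a minimal sketch of what a power calculation actually returns for a small study. The scenario is hypothetical: a two-group comparison with 15 participants per group, a conventional alpha of 0.05, and an illustrative effect size of d = 0.5.

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical rare-condition study: 15 participants per group is all
# the population allows. Effect size and alpha are illustrative.
analysis = TTestIndPower()
power = analysis.power(effect_size=0.5, nobs1=15, alpha=0.05)

print(f"power with n=15 per group: {power:.2f}")  # roughly 0.25
# The calculation confirms the study is underpowered. It says nothing
# about what to do when the population cannot be made larger.
```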
Long-term ecological and climate research violates basic statistical assumptions all the time. Independence, control, stationarity. These are luxuries the system does not offer. Applying methods developed for laboratory experiments can create a false sense of precision while ignoring the uncertainties that actually matter.
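A small simulation makes the false-precision point visible. The data here are assumed AR(1) with illustrative parameters, not any particular dataset; the naive standard error treats each observation as independent, while the adjusted one uses the standard effective-sample-size correction for AR(1) dependence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series: strong year-to-year dependence, as in many
# long-term records (phi and n are illustrative).
phi, n = 0.8, 200
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

# Naive standard error of the mean assumes independent observations.
naive_se = x.std(ddof=1) / np.sqrt(n)

# Crude adjustment: effective sample size for AR(1) data,
# n_eff = n * (1 - r1) / (1 + r1), where r1 is the lag-1 autocorrelation.
r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
n_eff = n * (1 - r1) / (1 + r1)
adjusted_se = x.std(ddof=1) / np.sqrt(n_eff)

print(f"naive SE: {naive_se:.3f}, adjusted SE: {adjusted_se:.3f}")
# The naive SE is roughly three times too small here: precision the
# data never actually offered.
```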
These are not exceptions. They describe entire fields.
Anyone who reviews regularly has seen this pattern.
The preregistration exists, but the paper’s conclusions rely mostly on analyses that sit awkwardly beside it. The power analysis justifies the sample size, but only by assuming effects that are implausibly large. The robustness checks explore small variations while leaving the core assumptions untouched.
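It is easy to check what such a power analysis is quietly assuming. A sketch with hypothetical numbers (n = 20 per group and the conventional 80 percent power target are assumptions for illustration):

```python
from statsmodels.stats.power import TTestIndPower

# What effect size would n=20 per group need in order to reach the
# conventional 80% power? (All numbers are illustrative.)
d = TTestIndPower().solve_power(nobs1=20, alpha=0.05, power=0.8)

print(f"smallest detectable effect: d = {d:.2f}")  # around 0.9
# A "justified" sample of 20 per group quietly assumes an effect much
# larger than the small-to-moderate effects typical of many literatures.
```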
Everything looks right. Very little feels convincing.
This is not misconduct. It is something closer to box-ticking. The forms are filled out. The deeper issues remain.
These papers often make it through review because there is nothing obvious to object to. They also tend not to age well.
Good scientific judgment is not something that can be taught quickly. It comes from seeing how methods fail, how results change under pressure, how confident conclusions quietly fall apart.
Institutions do not like this kind of knowledge. It is uneven. It is difficult to certify. It cannot be enforced with a checklist.
Best practices are attractive because they are visible. They can be required. They can be audited. They create the impression that rigor is being managed.
The cost is that procedural correctness slowly crowds out careful reasoning. Researchers learn how to look rigorous long before they learn how to think rigorously.
This is not a moral failing. It is an incentive problem.
Good research does not look the same across questions, fields, or stages of inquiry. Expecting it to do so is a mistake that is easy to make and hard to undo.
Best practices are useful when they are treated as tools that may or may not apply. They become harmful when they are treated as rules that must always be followed.
Science moves forward when methods are chosen because they make sense for the problem, not because they satisfy a template.
That distinction is easy to lose sight of. It matters more than most methodological debates acknowledge.