Research Inconsistency and Coding Errors

Categories: Publishing

SUMMARY: I4R has found research discrepancies – cases where the description of the analysis in the manuscript does not match the logic of the programming scripts evaluated on the provided data. This is a special case of a research inconsistency – a mistake in, deviation from, or misrepresentation of, the process taken when producing research as understood by its intended audience. The severity of a research inconsistency ought to be evaluated according to its potential to affect the conclusions of the research.

Research inconsistencies are mistakes, deviations from, or misrepresentations of, the process taken when producing research as understood by its intended audience.

I4R has primarily focused on reproduction and on assessing how a manuscript’s data analysis corresponds to the manuscript itself. We better understand this correspondence by encouraging careful investigation of replication folders and additional materials (appendices, pre-analysis plans, etc.). We are often exposed to “coding errors,” which this post discusses as research inconsistencies.

Inconsistencies are not limited to empirical work but arise in all fields of research. They can occur in logic-based work – like mathematics – where an inconsistency would change the deductions (or premises) required to make the same conclusions. Essays, too, may be built on misrepresentations of others’ ideas.

Research inconsistencies must change the interpretation of the research relative to how it is presented. It is challenging to speak about an inconsistency without reference to what would have been “correct” (or “absent of error”). An inconsistency may leave one not knowing what the research-consistent output would have been had the process been followed as described or understood. Still, it may be possible to better understand the intended research and alter how one views what was research inconsistent.

Research inconsistencies have the potential to change the conclusions of the research. Deviations from the intended research process, if corrected, can range from innocuous to transformative. However, inconsistencies are not defined relative to the magnitude of their impact on the research. We do not speak of a research inconsistency only when it changes the conclusions of the research; nor do we call research consistent only when nothing would change its conclusions. Research inconsistencies are simply a characteristic of the research materials.

Researcher decisions that are omitted, unobserved, or unverifiable need not be research inconsistent simply because there is no benchmark. An example would be when the producers of research omit information about their research process. The inclusion (or exclusion) of this material may change an audience’s view of the research, but it need not provide guidance on what ought to have been done.

Research inconsistencies need not be fraudulent. They can be honest mistakes or arise at random.

DEFINING THE SCOPE OF “RESEARCH” WHICH OUGHT TO BE “CONSISTENT”

Research is seldom defined solely by a published manuscript. Research often includes accompanying appendices, folders of materials, pre-analysis plans, and reports from referees and journals. Perhaps more appropriately, I should speak of research projects. One could also seemingly include earlier drafts, conference presentations, data collection and additional sourcing materials, and even work that follows publication. So, which parts of research projects ought to be consistent?

A practical minimum for the research project may include: (1) the published manuscript, (2) whatever materials the published manuscript is built upon, and (3) whatever materials the manuscript points towards. The published manuscript is a given. The materials the manuscript is built upon include data and code. These materials should be self-evident or understood by the intended audience. Materials the manuscript points towards include appendices, pre-analysis plans, and references. This is not to say everything else regarding the research project is irrelevant. Rather, we are proposing a starting point for evaluating research consistency. We suspect other materials may be subject to debate and ought to be taken case by case.

Case-by-case materials could include (but are certainly not limited to): conference presentations, old drafts, data collection processes, and additional information. A special case would be “follow-up” papers, where two pieces of research overlap. It seems pragmatic to limit oneself to one piece of research, though one could assess overlapping projects jointly.

ALLOWANCE FOR MULTIPLE INCONSISTENCIES

One reason to limit the scope of research is to reduce the concern about too many inconsistencies (or multiple comparisons). That is, without referring absolutely to one part of the research project (like the published manuscript), there could be many comparisons between the different parts of the research project. For example, there could be inconsistencies across three materials: cleaning code/scripts, analysis code/scripts, and the published manuscript. Each inconsistency may require “fixing” the two other pieces of the research.

A corollary of multiple inconsistencies is that research projects need not be uniquely consistent. Again, this is due to the interplay between multiple parts of the research, which (1) may be mutually exclusive but (2) affect the same materials. Making one piece of the research project consistent (in one frame of reference) may preclude consistency in a different frame of reference.

This is a problem with having relative frames of reference: research consistent relative to what?

USEFUL FRAMES OF REFERENCE: WHAT CAN WE CHANGE?   

Research is constantly evolving with new evidence, methods and lenses. In theory, all components of a research project could be adjusted.

I think of this relative to written work. We often evaluate work relative to a pre-analysis plan or the published manuscript. Often this is taken as given: what is written is beyond our control. Even if the authors meant something else (i.e., they inaccurately described their coding decisions in their manuscript), we cannot change their manuscript; we can never rewrite the publication. By contrast, we might be able to make other parts of their research project consistent, such as rewriting their scripts or their code.

EXAMPLES OF RESEARCH INCONSISTENCIES: SYNTAX ERRORS, LOGIC/SEMANTIC ERRORS, AND HOW THEY DIFFER FROM DISCREPANCIES

  • Syntax Errors
    • These are the quintessential errors which occur because the syntax (structure, grammar, etc.) of the software is incorrect.
    • Usually, the programs cannot run because the software does not understand the script.
    • These are seldom the case for research accepted by journals that make reproducibility of statistical output a prerequisite for publication.
  • Logic/Semantic Errors
    • Scripts do not fulfill their intended purpose though there may be no syntax errors.
    • These can only be understood with additional information, such as what the program is seeking to accomplish.
  • Research discrepancy
    • We say there is a research discrepancy when the description of the analysis in the manuscript does not match the logic of the programming scripts evaluated on the provided data.
    • The point being that the logic of the manuscript is not executed in the analysis.
    • This could be classified as a subset of logic/semantic errors, where the logic is set out by the manuscript (or the script).
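To make these distinctions concrete, here is a minimal, hypothetical Python sketch (the variable names and the 100,000 cutoff are invented for illustration). A syntax error stops the script from running at all; a logic/semantic error runs cleanly but does not do what the manuscript describes, which is the essence of a research discrepancy.

```python
# Hypothetical illustration, not drawn from any actual replication folder.

# Syntax error: the interpreter rejects the script outright, so nothing runs.
# mean_income = sum(incomes) / len(incomes   # SyntaxError: missing ")"

# Logic/semantic error: the script runs without complaint, but its logic
# does not match its intended purpose.
incomes = [30_000, 45_000, 60_000, 250_000]

# Suppose the manuscript says the sample keeps "incomes below 100,000",
# but the script uses the wrong comparison operator:
trimmed = [x for x in incomes if x > 100_000]   # keeps ONLY the top incomes

# The intended logic, matching the manuscript's description:
intended = [x for x in incomes if x < 100_000]

print(trimmed)   # [250000]
print(intended)  # [30000, 45000, 60000]
```

Here the mismatch between the manuscript's stated rule and the executed comparison is exactly what we call a research discrepancy: the logic of the manuscript is not the logic evaluated on the data.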

NON-EXAMPLES OF RESEARCH INCONSISTENCIES

  • Run-Time Error Exception: Broken File Paths
    • Instances where file paths lead to incorrect (or non-existent) directories need not be a coding error. While containers like Docker are ideal, many researchers in economics, political science and psychology use localized directories. Having to set a directory pathway to the local machine is not viewed as a deficiency of the replication folder but rather necessary to evaluate the scripts.
  • Run-Time Error Exception: Versioning Difficulties
    • Python and R are open-source programming languages that rely on user-maintained packages. Accessing the appropriate software versions, packages, and their likely interdependencies is necessary for any researcher using this software. While replication folders should have a README telling prospective reviewers the software and package versions, this can be hard to track and is not yet the norm. Good practice would be to save the necessary versions and packages locally (like a container) and run a script which sets up the environment to reproduce the results. Commonly used packages may be omitted from the README and must be inferred by those wishing to reproduce the results found in a paper.
  • Transcription errors, like incorrect rounding or typos.
    • A replication folder for a published manuscript ideally produces all its exhibits as they appear in the journal. In reality, it is costly for researchers to produce exhibits in this detail. We accept that transcription errors may occur: certain values may not exactly match the output produced by the programs run. More formally, if a team of researchers cannot identify a pattern (i.e., systematically understand the differences between values presented in a paper and those produced by the program), then it is challenging to classify why there are differences. In general, we try to give researchers the benefit of the doubt and understand that fat-finger errors occur. Such differences may also be outside the authors’ control, or introduced during the editorial process.
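The two run-time non-examples above can be mitigated with a small amount of boilerplate at the top of a script. As a minimal sketch in Python (the folder layout and names are hypothetical), a single user-set root directory keeps a broken path a one-line fix for the reviewer, and printing the interpreter version documents at least part of the environment:

```python
# Hypothetical replication-folder layout; only ROOT should need editing
# on a reviewer's machine.
from pathlib import Path
import sys

ROOT = Path(".")        # <- reviewer points this at the local copy
DATA = ROOT / "data"    # every other path derives from ROOT,
OUTPUT = ROOT / "output"  # so no other line contains a machine-specific path

# Record the environment so versioning difficulties are at least visible
# to anyone rerunning the scripts.
print("Python", sys.version.split()[0])
```

Deriving every path from one root is what makes a localized directory a setup step rather than a coding error: the reviewer edits one line instead of hunting for hard-coded paths throughout the scripts.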

CONTRASTING RESEARCH DISCREPANCY WITH RESEARCH INCONSISTENCY

A research discrepancy (the description of the analysis in the manuscript does not match the logic of the programming scripts evaluated on the provided data) is a subset of research inconsistencies (mistakes, deviations from, or misrepresentations of, the process taken when producing research as understood by its intended audience).

Again, a research discrepancy is a special case of a research inconsistency.

There are many ways a research project may be inconsistent; there is only one way for a research project to have a research discrepancy.