Features

Statistical Checks

ReviewerZero's statistical analysis module helps identify inconsistencies and errors in reported statistics. Based on our internal analysis of millions of articles, approximately 1 in 3 papers with statistical claims contains at least one detectable statistical inconsistency.

Why Statistical Checks Matter

Picture this: A groundbreaking medical study claims a new drug reduces blood pressure by 30%. The result hinges on a single sentence: "statistically significant, t(38) = 2.21, p = 0.0332". Journals accept it. Headlines announce it. High-risk patients get their hopes up.

But what if the statistics are wrong?

Studies have found that roughly 1 in 5 psychologists admit to the questionable research practice of "rounding down" a p-value to make it appear smaller. Our automated system catches these issues before they become problems.

Contextual Analysis

Review statistical analyses directly within the manuscript context. Detailed explanations help reviewers understand and evaluate potential issues with confidence.

[Screenshot: Contextual Statistical Analysis]

In-Context Features

  • Click on any statistical claim to see analysis
  • View explanations alongside original text
  • Navigate between flagged issues easily
  • Access recalculated values directly

False Positive Prevention

Our system intelligently identifies and filters out false positives caused by rounding differences.

[Screenshot: False Positive Prevention]

How We Reduce False Positives

  • Rounding Pattern Detection - Recognize common rounding conventions
  • Tolerance Thresholds - Allow for minor numerical variations
  • Context Awareness - Consider how values are typically reported
  • Multiple Verification - Cross-check against different calculation methods

Rounding Considerations

| Reported  | Computed    | Status                   |
|-----------|-------------|--------------------------|
| p = 0.02  | p = 0.021   | Acceptable rounding      |
| p < 0.001 | p = 0.00089 | Acceptable inequality    |
| p = 0.05  | p = 0.06    | Potential decision error |
| p < 0.05  | p = 0.06    | Significant discrepancy  |
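
Below is a minimal sketch of how such a tolerance rule might look, assuming the same conventions as the table above: a reported p-value counts as acceptable rounding if the recomputed value rounds to it at the reported precision, and as a decision error if the two values fall on opposite sides of α = 0.05. The function and thresholds are illustrative, not ReviewerZero's production logic, and inequalities such as "p < 0.001" would need separate handling.

```python
def classify(reported: float, computed: float, alpha: float = 0.05) -> str:
    """Classify a reported vs. recomputed p-value (illustrative sketch only)."""
    # Precision the author used, e.g. "0.02" -> 2 decimal places.
    decimals = len(str(reported).split(".")[1]) if "." in str(reported) else 0
    rounds_to_reported = round(computed, decimals) == reported
    crosses_alpha = (reported <= alpha) != (computed <= alpha)

    if crosses_alpha:
        return "potential decision error"
    if rounds_to_reported:
        return "acceptable rounding"
    return "discrepancy - review"

print(classify(0.02, 0.021))   # acceptable rounding
print(classify(0.05, 0.06))    # potential decision error
```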

Statistical Overview

Quick overview of statistical findings with clear explanations and actionable insights.

[Screenshot: Statistical Overview]

Dashboard Features

  • Summary Counts - How many tests found vs. flagged
  • Severity Distribution - Critical, warning, info levels
  • Test Types - Breakdown by statistical test
  • Quick Navigation - Jump to specific issues

Understanding P-Values

The p-value is the final step in a logical statistical process. To understand its power and fragility, you need to see how it's made.

The Null Hypothesis (H₀)

Science begins with a default, skeptical stance. For example: "The drug has no effect. Any difference we see is just random chance." The goal is to gather enough evidence to reject this skeptical view.

The Test Statistic

Researchers quantify their evidence with a test statistic—a single number summarizing how far their data deviates from the null hypothesis. Common examples:

| Test       | Statistic | Use Case                      |
|------------|-----------|-------------------------------|
| t-test     | t         | Comparing two means           |
| F-test     | F         | ANOVA, comparing variances    |
| Chi-square | χ²        | Categorical data              |
| z-test     | z         | Large samples, known variance |

Degrees of Freedom

The test statistic's significance depends on degrees of freedom, which relate to sample size. A larger sample leads to a narrower, more "confident" distribution.
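
As a quick illustration of that point (our own sketch using scipy, not part of the product), the two-tailed critical t value at α = 0.05 shrinks toward the normal value of roughly 1.96 as the degrees of freedom grow:

```python
from scipy import stats

# Two-tailed critical t value at alpha = 0.05 for increasing degrees of freedom.
for df in (5, 30, 1000):
    print(df, round(stats.t.ppf(0.975, df), 3))
# 5 -> 2.571, 30 -> 2.042, 1000 -> 1.962
```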

The P-Value Calculation

The p-value is the probability of observing a test statistic as extreme as (or more extreme than) the one found, assuming the null hypothesis is true.

If p = 0.0332, it means a result this strong would only happen 3.32% of the time if the null hypothesis were true.
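
That calculation can be reproduced with a standard statistics library. Here is a minimal sketch in Python with scipy (our illustration, not ReviewerZero's internal code), using the t(38) = 2.21 example from the introduction:

```python
from scipy import stats

# Two-tailed p-value: the probability of a t statistic at least this extreme
# in either direction, assuming the null hypothesis is true.
t_value, df = 2.21, 38
p_value = 2 * stats.t.sf(abs(t_value), df)   # sf is the survival function, 1 - cdf
print(f"t({df}) = {t_value}, p = {p_value:.4f}")   # ~0.0332
```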

Types of Checks

P-Value Verification

We verify that reported p-values are mathematically consistent with:

  • Test statistics (t, F, chi-square, etc.)
  • Degrees of freedom
  • Sample sizes
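
For illustration, here is a simplified sketch of that recomputation for a few common test families, using scipy; the function name and the supported set of tests are our own choices, not the product's API:

```python
from scipy import stats

def recompute_p(test: str, statistic: float, df):
    """Recompute a p-value from a test statistic and degrees of freedom (sketch)."""
    if test == "t":                       # t(df), two-tailed
        return 2 * stats.t.sf(abs(statistic), df)
    if test == "F":                       # F(df1, df2), right-tailed
        df1, df2 = df
        return stats.f.sf(statistic, df1, df2)
    if test == "chi2":                    # chi-square(df), right-tailed
        return stats.chi2.sf(statistic, df)
    raise ValueError(f"unsupported test: {test}")

print(recompute_p("t", 2.45, 28))         # ~0.021
print(recompute_p("chi2", 3.84, 1))       # ~0.050
```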

Effect Size Validation

Checking that effect sizes align with:

  • Reported means and standard deviations
  • Sample sizes
  • Statistical test results
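
As one concrete example, Cohen's d for two independent groups can be recomputed from the reported means, standard deviations, and sample sizes and compared with the reported effect size. The sketch below uses the standard pooled-SD formula with made-up input values and an assumed tolerance; it is not the product's exact check:

```python
from math import sqrt

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two independent groups using a pooled SD (sketch)."""
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

reported_d = 0.80
computed_d = cohens_d(10.5, 2.0, 20, 9.0, 2.1, 20)   # illustrative numbers
flagged = abs(computed_d - reported_d) > 0.05        # assumed tolerance
print(round(computed_d, 2), flagged)                 # 0.73 True
```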

Decimal Consistency

Detection of:

  • Inconsistent decimal places
  • Rounding errors
  • Mathematically impossible values
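
A few impossibility checks follow directly from the mathematical ranges of the statistics themselves; the sketch below, with a hypothetical helper name, shows the idea:

```python
def impossible_value_issues(p=None, r=None, sd=None, df=None):
    """List reasons a reported value cannot be mathematically valid (sketch)."""
    issues = []
    if p is not None and not 0.0 <= p <= 1.0:
        issues.append(f"p-value {p} is outside [0, 1]")
    if r is not None and not -1.0 <= r <= 1.0:
        issues.append(f"correlation {r} is outside [-1, 1]")
    if sd is not None and sd < 0:
        issues.append(f"standard deviation {sd} is negative")
    if df is not None and df <= 0:
        issues.append(f"degrees of freedom {df} must be positive")
    return issues

print(impossible_value_issues(p=-0.05, r=1.5))   # both values flagged
```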

Common Statistical Inconsistencies

The Vanishing Details Problem

The most common issue isn't calculation errors—it's missing information:

  • P-values without test statistics
  • Test statistics without degrees of freedom
  • Conclusions without sample sizes
  • Results buried in figures without extractable numbers

P-Hacking

Selective reporting or manipulation to achieve p < 0.05:

  • Trying different statistical tests until one "works"
  • Selectively reporting only significant results
  • Adjusting analysis until threshold is crossed

Calculation Cascade

Sometimes the numbers are all there, but they don't add up:

  • P-values that don't match reported test statistics
  • Degrees of freedom that don't align with sample sizes
  • Test statistics that are mathematically impossible

Multiple Testing Mirage

Running many tests without correction:

  • With α = 0.05, 1 in 20 tests is "significant" by chance
  • Reporting only significant findings creates false positives
  • Missing Bonferroni or other corrections
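
The arithmetic behind that "1 in 20" point, and the Bonferroni adjustment, in a short sketch:

```python
# Family-wise error rate: the chance of at least one false positive across
# m independent tests, each run at alpha = 0.05.
alpha, m = 0.05, 20
fwer = 1 - (1 - alpha) ** m
print(round(fwer, 2))    # ~0.64 with 20 tests

# Bonferroni correction: evaluate each test at alpha / m instead.
print(alpha / m)         # 0.0025 per-test threshold
```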

How It Works

Step 1: Text Extraction

Statistical values are extracted from the manuscript using AI:

  • Parses text, tables, and figure captions
  • Recognizes APA-style and other formats
  • Handles non-standard reporting styles

Step 2: Pattern Recognition

Statistics are identified using ML-based pattern matching:

  • Detects test statistics (t, F, χ², r, etc.)
  • Identifies p-values in various formats
  • Extracts degrees of freedom and sample sizes
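
The production system uses ML-based matching as described above; purely to illustrate the kinds of patterns involved, here is a simplified regex sketch for APA-style reports (the patterns are our own and far less robust than what the text describes):

```python
import re

# Simplified, illustrative patterns for APA-style statistical reports.
T_TEST = re.compile(r"t\((\d+(?:\.\d+)?)\)\s*=\s*(-?\d+\.?\d*),\s*p\s*([<=>])\s*(\.?\d+\.?\d*)")
F_TEST = re.compile(r"F\((\d+),\s*(\d+)\)\s*=\s*(\d+\.?\d*),\s*p\s*([<=>])\s*(\.?\d+\.?\d*)")

text = ("The groups differed, t(24) = 2.45, p = 0.02, and the ANOVA was "
        "significant, F(2, 45) = 3.21, p < 0.001.")
print(T_TEST.findall(text))   # [('24', '2.45', '=', '0.02')]
print(F_TEST.findall(text))   # [('2', '45', '3.21', '<', '0.001')]
```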

Step 3: Validation

Each statistic is checked for internal consistency:

  • Recalculate p-values from test statistics
  • Verify mathematical relationships
  • Check for impossible values

Step 4: Cross-Referencing

Related statistics are verified against each other:

  • Means vs. standard deviations
  • Sample sizes vs. degrees of freedom
  • Effect sizes vs. test statistics
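
As one concrete cross-check (standard formulas; the function itself is hypothetical), the degrees of freedom implied by the reported sample sizes can be compared against the degrees of freedom that actually appear in the manuscript:

```python
def expected_df(test: str, **kwargs):
    """Degrees of freedom implied by the study design (standard formulas, sketch)."""
    if test == "independent_t":      # two independent groups
        return kwargs["n1"] + kwargs["n2"] - 2
    if test == "paired_t":           # paired or one-sample t-test
        return kwargs["n"] - 1
    if test == "one_way_anova":      # k groups, N observations in total
        return kwargs["k"] - 1, kwargs["N"] - kwargs["k"]
    raise ValueError(f"unsupported test: {test}")

# t(28) reported with two groups of 15 is consistent (15 + 15 - 2 = 28);
# the same statistic with groups of 20 and 20 would be flagged.
print(expected_df("independent_t", n1=15, n2=15))   # 28
print(expected_df("one_way_anova", k=3, N=48))      # (2, 45)
```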

Understanding Reports

Issue Categories

| Category     | Description                        | Example                        |
|--------------|------------------------------------|--------------------------------|
| Inconsistent | Values don't match mathematically  | t(24) = 2.1 yielding p = 0.001 |
| Impossible   | Values outside valid ranges        | p = -0.05 or r = 1.5           |
| Suspicious   | Unusual patterns                   | All p-values exactly 0.05      |
| Incomplete   | Missing required values            | F statistic without df         |

Confidence Levels

Each flagged issue includes a confidence indicator:

| Level  | Meaning                                     | Recommendation      |
|--------|---------------------------------------------|---------------------|
| High   | Clear mathematical inconsistency            | Requires correction |
| Medium | Likely error, needs verification            | Investigate further |
| Low    | Possible issue, may be a reporting artifact | Review context      |

Where We Find Statistics

Our system extracts and analyzes statistics from multiple locations throughout your manuscript:

Main Text

Statistics embedded in paragraphs and sentences are automatically detected:

  • Inline statistical reports (e.g., "t(24) = 2.45, p = 0.02")
  • Results sections with multiple test outcomes
  • Discussion sections referencing statistical findings

Tables

Statistical values in tables are extracted and validated:

  • Results tables with test statistics and p-values
  • Summary statistics (means, standard deviations, sample sizes)
  • Multi-column layouts with related values

Figures

Statistics in figure captions and annotations are also analyzed:

  • Caption text containing statistical claims
  • Annotations within figures
  • Legend text with statistical information

Broader Format Support

Our system handles a wide variety of statistical reporting formats beyond standard APA style:

  • Non-standard formats - Recognizes statistics reported in various styles
  • Multiple notations - Handles different ways of expressing the same test
  • Incomplete reporting - Detects when required statistical information is missing

Example Output

Minor Rounding Difference

Reported: t(28) = 2.45, p = 0.02
Computed: t(28) = 2.45, p = 0.021

Status: Minor rounding difference - acceptable
Confidence: Low

Significant Discrepancy

Reported: F(2, 45) = 3.21, p < 0.001
Computed: F(2, 45) = 3.21, p = 0.049

Status: Significant discrepancy - requires review
Confidence: High
Decision Error: Yes (would change significance conclusion)

Best Practices

Before Submission

  1. Run statistical checks on your final draft
  2. Address all high-confidence flags
  3. Verify degrees of freedom match sample sizes
  4. Report exact p-values when possible

For Reviewers

  1. Focus on decision-altering errors first
  2. Consider rounding conventions in the field
  3. Check for missing statistical information
  4. Request clarification for incomplete reports

Reporting Standards

To ensure statistical accuracy:

  1. Use statistical software to generate values
  2. Double-check all manually transcribed statistics
  3. Report exact p-values rather than inequalities
  4. Include degrees of freedom for all tests
  5. Provide sample sizes for each analysis