Chapter 17: Transparent Reporting of Methods and Limitations

Learning Objectives

By the end of this chapter, you will be able to:

  • explain why transparent reporting is central to small-sample credibility
  • document analytic choices in reproducible scripts
  • distinguish planned from exploratory analyses
  • report samples, exclusions and missing data clearly
  • evaluate whether studies disclose enough information to support their claims
  • use reporting guidelines such as CONSORT, STROBE and PRISMA as practical checklists rather than as afterthoughts

The Importance of Transparency

Transparent reporting allows readers to evaluate the quality of evidence, assess the risk of bias, and replicate or build upon findings. With small samples, transparency is particularly important because results are more sensitive to analytic choices, outliers, and missing data. Readers need full information to judge whether conclusions are warranted.

Transparent reporting includes a clear description of sampling and recruitment, a summary of participant characteristics, complete reporting of variables and measures, a record of data cleaning and exclusions, and a justified statement of the statistical methods used. It also requires reporting all planned analyses and relevant sensitivity checks rather than only statistically significant findings. Limitations and plausible alternative explanations should be stated directly so readers can judge how far the evidence supports the conclusion.

Putting the Transparency Pieces Together

Transparency is a workflow, not a paragraph added at the end of a report. The same decisions should appear in four places: the preregistration or planning document, the analysis script, the results section and the limitations section. If those four records disagree, the report should explain why.

Stage What the reader should be able to verify
Planning What was primary, what was exploratory, and what decision rules were set before analysis
Data preparation How exclusions, missing data, recoding and outliers were handled
Analysis script Which tests or models were run, with seeds, packages and sensitivity analyses visible
Results report Estimates, intervals, p-values where relevant, adjusted p-values where needed and clear effect-size language
Limitations How sample size, precision, design, assumptions and generalisability constrain the conclusion

This structure is especially important when the analysis changes after inspection. A change can be defensible, but it must be visible. A small-sample report should never leave readers guessing whether a method was planned, chosen because assumptions failed, or selected because it produced the strongest result.

Documenting Analytic Choices

Modern quantitative research involves many decisions: how to handle outliers, which variables to include, whether to transform variables, which test to use, how to handle missing data. These decisions, if made after seeing the data, can inflate Type I error and bias estimates (researcher degrees of freedom, p-hacking).

When possible, preregister hypotheses, methods and decision rules before data collection or before the dataset is inspected. All analysis decisions should then be documented in a reproducible script, with exploratory analyses and sensitivity checks clearly labelled. Exploratory work is entirely legitimate, but it should be labelled as such and kept separate from confirmatory analyses rather than presented as if planned from the outset.

Example: Documenting Analysis Decisions in Code Comments

A well-documented analysis script includes comments explaining each decision.

       id             campaign   satisfaction      age_group    prior_purchase
 Min.   : 1.00   Length   :30   Min.   :1.00   Length   :30   Length   :30    
 1st Qu.: 8.25   N.unique : 2   1st Qu.:3.00   N.unique : 3   N.unique : 2    
 Median :15.50   N.blank  : 0   Median :4.00   N.blank  : 0   N.blank  : 0    
 Mean   :15.50   Min.nchar: 5   Mean   :3.57   Min.nchar: 3   Min.nchar: 2    
 3rd Qu.:22.75   Max.nchar: 6   3rd Qu.:4.00   Max.nchar: 5   Max.nchar: 3    
 Max.   :30.00                  Max.   :5.00                                  

    Wilcoxon rank sum test with continuity correction

data:  satisfaction by campaign
W = 156, p-value = 0.06
alternative hypothesis: true location shift is not equal to 0

    Two Sample t-test

data:  satisfaction by campaign
t = 1.8, df = 28, p-value = 0.08
alternative hypothesis: true difference in means between group Email and group Social is not equal to 0
95 percent confidence interval:
 -0.07298  1.27298
sample estimates:
 mean in group Email mean in group Social 
               3.867                3.267 

Interpretation: The script documents that satisfaction is treated as ordinal and that a nonparametric test is chosen accordingly. A sensitivity analysis using a t-test (assuming equal intervals) is also reported to show robustness. This transparency helps readers understand and trust the analysis.
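The chapter's output above comes from R (wilcox.test and t.test). As an illustration of the decision-commenting pattern, the sketch below reruns the same logic in Python on hypothetical ratings (not the chapter's mini_marketing data), computing a Mann–Whitney U by hand so the example stays dependency-free.

```python
import statistics

# Hypothetical 1-5 satisfaction ratings; NOT the chapter's dataset.
email  = [5, 4, 4, 3, 5, 4, 3, 4, 4, 3]
social = [3, 3, 4, 2, 3, 4, 3, 2, 4, 3]

# DECISION 1: satisfaction is an ordinal 1-5 rating, so the primary
# comparison is rank-based (Mann-Whitney U) rather than a t-test.
pooled = sorted(email + social)
ranks = {}
for value in set(pooled):
    positions = [i + 1 for i, v in enumerate(pooled) if v == value]
    ranks[value] = sum(positions) / len(positions)  # mid-ranks for ties
rank_sum_email = sum(ranks[v] for v in email)
U = rank_sum_email - len(email) * (len(email) + 1) / 2

# DECISION 2 (sensitivity): also report the mean difference a t-test
# would compare, so readers can see that the ordinal-vs-interval
# choice, not the data, is what separates the two analyses.
mean_diff = statistics.mean(email) - statistics.mean(social)

print(f"U = {U}, mean difference = {mean_diff:.2f}")
# -> U = 76.0, mean difference = 0.80
```

The point is not the arithmetic but the audit trail: each DECISION comment records why a method was chosen, so a reader can distinguish planned choices from data-driven ones.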

Describing the Sample

The sample description should state the target population, the accessible population, the sampling method, inclusion and exclusion criteria, recruitment procedures, response rate, final sample size after exclusions, and relevant participant characteristics such as demographics or baseline measures.

Use a table to summarise sample characteristics. For RCTs, report characteristics separately by group to verify balance.

Example: Sample Characteristics Table

We create a descriptive table for the mini_marketing dataset. The table reports group size, satisfaction scores and prior purchase rates so that readers can assess baseline comparability.

Table 17.1

Sample characteristics by campaign type

Campaign n Satisfaction M Satisfaction SD Prior purchase (%)
Email 15 3.87 0.99 53.3
Social 15 3.27 0.80 53.3

Note. The table summarises the mini marketing study by campaign group. Prior purchase is reported as the percentage of participants with a previous purchase.

Interpretation: The table shows sample size, satisfaction scores, and prior purchase rates for each campaign group. Readers can assess whether groups are comparable at baseline. If the study were an RCT, imbalances might suggest randomisation problems or chance variation. In observational studies, imbalances indicate potential confounding.
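A table in this style can be generated directly from raw records so that the reported numbers and the analysis share one source. The sketch below uses hypothetical records standing in for the mini_marketing dataset (the names campaign, satisfaction and prior_purchase mirror the variables shown earlier).

```python
import statistics
from collections import defaultdict

# Hypothetical records standing in for the mini_marketing dataset.
records = [
    {"campaign": "Email",  "satisfaction": 4, "prior_purchase": True},
    {"campaign": "Email",  "satisfaction": 3, "prior_purchase": False},
    {"campaign": "Email",  "satisfaction": 5, "prior_purchase": True},
    {"campaign": "Social", "satisfaction": 3, "prior_purchase": False},
    {"campaign": "Social", "satisfaction": 2, "prior_purchase": True},
    {"campaign": "Social", "satisfaction": 4, "prior_purchase": False},
]

groups = defaultdict(list)
for r in records:
    groups[r["campaign"]].append(r)

# One row per group: n, mean, SD and prior-purchase percentage.
summary = {}
for name, rows in sorted(groups.items()):
    scores = [r["satisfaction"] for r in rows]
    summary[name] = {
        "n": len(rows),
        "mean": round(statistics.mean(scores), 2),
        "sd": round(statistics.stdev(scores), 2),
        "prior_pct": round(100 * sum(r["prior_purchase"] for r in rows) / len(rows), 1),
    }
    print(name, summary[name])
```

Building the table in code rather than by hand keeps it consistent with later analyses and makes it trivial to regenerate after exclusions change.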

Reporting Missing Data

Missing-data reporting should state the number of complete observations, the number and proportion missing for each variable, visible patterns of missingness and the method used to handle missing values. If missingness clusters in certain subgroups, that pattern should be described because it may affect interpretation.

If multiple imputation was used, state the number of imputations and the imputation method.
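A missingness tally of this kind takes only a few lines and can be produced before any modelling. The sketch below uses hypothetical records, with None marking a missing value.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"satisfaction": 4,    "age_group": "18-25"},
    {"satisfaction": None, "age_group": "26-40"},
    {"satisfaction": 3,    "age_group": None},
    {"satisfaction": 5,    "age_group": "18-25"},
]

# Per-variable missing counts and the number of complete cases.
missing = {var: sum(r[var] is None for r in rows)
           for var in ("satisfaction", "age_group")}
complete = sum(all(v is not None for v in r.values()) for r in rows)

for var, n_miss in missing.items():
    print(f"{var}: {n_miss}/{len(rows)} missing ({100 * n_miss / len(rows):.0f}%)")
print(f"complete cases: {complete}/{len(rows)}")
```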

Reporting Deviations from Planned Analyses

If the analysis plan changes after seeing the data (e.g., adding a covariate, using a different test, excluding outliers), report the deviation explicitly.

Example: “We initially planned to use a t-test but observed severe skewness in the outcome. We therefore used a Mann–Whitney U test instead. Results from both tests are reported in the supplementary materials.”

Acknowledging Limitations

Every study has limitations. In small-sample studies, the common ones are limited power, wide confidence intervals, sensitivity to outliers or assumption violations, limited generalisability from narrow or non-probability samples, and inflated false-positive risk when many tests are conducted. These limitations should be connected to the interpretation: explain how they might affect the conclusion and what a future study would need to resolve.

Handling Multiple Comparisons in Small Samples

When conducting multiple statistical tests, the probability of at least one Type I error increases. With \(k\) independent tests at \(\alpha = 0.05\):

  • Family-wise error rate (FWER) \(\approx 1 - (1 - \alpha)^k\)
  • For 5 tests: roughly a 23% chance of at least one false positive
  • For 10 tests: roughly a 40% chance
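The family-wise error rate figures can be verified in a few lines:

```python
# FWER for k independent tests at alpha: 1 - (1 - alpha)^k
alpha = 0.05
for k in (1, 5, 10):
    fwer = 1 - (1 - alpha) ** k
    print(f"k = {k:2d}: FWER = {fwer:.2f}")
# k =  1: FWER = 0.05
# k =  5: FWER = 0.23
# k = 10: FWER = 0.40
```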

When to Correct

  • Multiple outcomes or subgroups
  • Post-hoc pairwise comparisons
  • Exploratory analyses with many variables

Common Methods

  1. Bonferroni: \(\alpha_\text{adjusted} = \alpha / k\) (most conservative)
  2. Holm–Bonferroni: Sequential step-down procedure
  3. Benjamini–Hochberg (FDR): Controls the false discovery rate

Table 17.2

Adjusted p-values for five exploratory tests

Test Raw p Bonferroni Holm Benjamini–Hochberg (FDR)
Test 1 0.010 0.050 0.050 0.050
Test 2 0.030 0.150 0.120 0.075
Test 3 0.080 0.400 0.240 0.133
Test 4 0.150 0.750 0.300 0.188
Test 5 0.250 1.000 0.300 0.250

Note. Bonferroni controls the family-wise error rate most conservatively; Holm is a step-down family-wise method; Benjamini–Hochberg controls the false discovery rate.
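The adjustments in Table 17.2 can be reproduced with a short script. The sketch below implements all three corrections directly (libraries such as statsmodels provide the same methods); the input p-values are the table's raw column.

```python
raw = [0.010, 0.030, 0.080, 0.150, 0.250]  # already sorted ascending
k = len(raw)

# Bonferroni: multiply every p by k, cap at 1.
bonferroni = [min(1.0, p * k) for p in raw]

# Holm: multiply the i-th smallest p by (k - i), then enforce
# monotonicity working up from the smallest p.
holm, running_max = [], 0.0
for i, p in enumerate(raw):
    running_max = max(running_max, min(1.0, p * (k - i)))
    holm.append(running_max)

# Benjamini-Hochberg: multiply by k / rank, then enforce
# monotonicity working down from the largest p.
bh, running_min = [0.0] * k, 1.0
for i in range(k - 1, -1, -1):
    running_min = min(running_min, raw[i] * k / (i + 1))
    bh[i] = running_min

print([round(p, 3) for p in bonferroni])  # [0.05, 0.15, 0.4, 0.75, 1.0]
print([round(p, 3) for p in holm])        # [0.05, 0.12, 0.24, 0.3, 0.3]
print([round(p, 3) for p in bh])          # [0.05, 0.075, 0.133, 0.188, 0.25]
```

The printed values match Table 17.2 and illustrate the ordering of stringency: Bonferroni is the most conservative, Holm is uniformly no more conservative than Bonferroni, and Benjamini–Hochberg is the most permissive because it controls a different quantity (the false discovery rate).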

Reporting Template

“We tested effects in three subgroups. After Holm–Bonferroni correction, only Group A showed a significant difference (adjusted p = 0.03).”

Small Sample Considerations

With limited power, strict corrections can remove all nominally significant findings. The practical response is to pre-specify primary outcomes, label exploratory outcomes clearly, report both corrected and uncorrected p-values where informative, and place greater weight on effect sizes and confidence intervals.

Key Takeaways

For multiple-comparison reporting, state how many tests were conducted, describe the correction method, and distinguish confirmatory from exploratory analyses. In small-sample work, adjusted p-values should usually be interpreted alongside confidence intervals because the interval shows the direction and precision of the estimate.

Pre-Registration for Small-Sample Studies

Pre-registration involves documenting your hypotheses, methods, and analysis plan before data collection begins or, at the latest, before the dataset is inspected. This is especially important for small samples because:

  • Limited power increases temptation for p-hacking
  • Results are more sensitive to analytic choices
  • Multiple testing is common (searching for effects)
  • Post-hoc storytelling is easier with small samples

What to Pre-Register

Minimum requirements:

  1. Research questions and hypotheses (primary versus secondary)
  2. Sample size with justification
  3. Statistical tests planned for each hypothesis
  4. Handling of outliers, missing data, and covariates
  5. Multiple comparison corrections (if applicable)

Pre-Registration Template

Use the template below as a planning table rather than as code to run. A preregistration should be specific enough that another analyst could reproduce the planned analysis without asking what you meant.

Section What to write before analysis
Study title Short descriptive title and date of preregistration
Primary question One confirmatory question stated in testable terms
Secondary questions Exploratory or supportive questions labelled as secondary
Hypotheses Directional or non-directional predictions, including the expected outcome metric
Design and sample Target sample size, stopping rule, recruitment source, inclusion and exclusion criteria
Variables Primary outcome, predictors, covariates and scoring rules
Primary analysis Statistical test or model, alpha level, effect size, confidence interval and software
Assumption checks Planned diagnostics and what will be done if assumptions are not met
Outliers and missing data Definitions, handling rules and sensitivity analyses
Multiplicity Which outcomes are primary, which are exploratory and how p-values will be adjusted
Decision rule What pattern of estimate, interval and p-value will be interpreted as support for the primary hypothesis

Where to Pre-Register

The Open Science Framework (osf.io) provides free, time-stamped registration for study protocols, analysis plans and materials. AsPredicted (aspredicted.org) provides a short nine-question template that is widely used for behavioural, psychology and management studies. Registered Reports are a journal submission format in which the research question and methods are reviewed before results are known, with in-principle acceptance if the protocol is judged sound.

Handling Deviations

Deviations are acceptable if reported transparently:

**Deviations from Pre-Registration:**
1. Sample size: Planned n = 40, achieved n = 36 due to [reason]
2. Primary test: Switched from t-test to Mann–Whitney due to severe skewness (skew = 2.4)
3. Additional analysis: Added baseline covariate per reviewer request (post-hoc, clearly labelled)

Benefits for Small Samples

  • Protects against p-hacking accusations
  • Separates confirmatory from exploratory analyses
  • Improves study design through upfront planning
  • Facilitates transparent reporting

Following Reporting Guidelines

Numerous reporting guidelines exist for different study designs: CONSORT for randomised trials, STROBE for observational studies, and PRISMA for systematic reviews and meta-analyses.

These guidelines provide checklists of items to report. Following them improves transparency and comparability across studies. Even if formal adherence is not required, consult the relevant guideline as a checklist.

For example, CONSORT asks randomised trials to report participant flow. In a small pilot RCT, this can be as simple as a table that states how many participants were assessed, randomised, analysed and excluded at each stage. If 40 people were screened, 30 were enrolled, and 28 were analysed, the report should make clear where the two losses occurred and whether they were related to group assignment or outcome.

Table 17.3

Example participant-flow summary for a small pilot RCT

Stage n Note
Assessed for eligibility 40 10 did not meet inclusion criteria or declined
Randomised 30 1:1 allocation
Allocated to intervention 15 14 analysed; 1 withdrew before post-test
Allocated to control 15 14 analysed; 1 missing post-test
Included in analysis 28 Primary analysis used available paired outcomes
Excluded after randomisation 2 Reasons reported by group

Note. This table illustrates the reporting logic of CONSORT Item 13a. A full trial report would usually include a flow diagram as well.


Key Takeaways

Transparent reporting allows readers to evaluate the quality and limits of small-sample evidence. The essential tasks are to document analytic choices in reproducible scripts, report sample characteristics, missing data and exclusions explicitly, disclose deviations from planned analyses, and present sensitivity analyses where decisions could affect the result. Relevant reporting guidelines such as CONSORT and STROBE should be used as checklists, while the limitations section should state clearly how sample size, precision, assumptions and generalisability affect the conclusion.


Self-Assessment Quiz

Test your understanding of transparent reporting from Chapter 17.

Question 1. Which should be reported when documenting a small-sample study?

Explanation.

Transparent reporting requires documenting planned analyses, exploratory analyses and sensitivity checks, not just significant findings. Selective reporting inflates Type I error across the literature and prevents readers from evaluating the quality of the evidence.

Question 2. A study planned to use a t-test but switched to Mann–Whitney after seeing skewed data. How should this be reported?

Explanation.

Deviations from plans should be documented with justification. Reporting both the planned and adapted analyses shows how much the conclusion depends on the analytic choice.

Question 3. What is “p-hacking”?

Explanation.

P-hacking involves exploring many analyses, such as different covariates, subgroups or outlier rules, until statistical significance appears and then selectively reporting that analysis. This inflates the false-positive rate.

Question 4. Pre-registration helps prevent:

Explanation.

Pre-registration documents hypotheses and analysis plans before the data are inspected. This reduces post-hoc decisions that capitalise on chance and inflate Type I error.

Question 5. A study with n = 15 per group finds p = 0.12. The limitation section should state:

Explanation.

Small samples have limited power. Non-significance may reflect insufficient power rather than absence of effect, so the limitation section should discuss precision, minimum detectable effects and the uncertainty around the estimate.