Chapter 18: Visualising Uncertainty and Presenting Results

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain why uncertainty visualisation is central to small-sample reporting.
  • Distinguish standard deviations, standard errors and confidence intervals.
  • Show individual observations alongside summaries.
  • Create clear ggplot2 figures with uncertainty intervals.
  • Identify misleading visual choices.
  • Design figures and tables that support estimation rather than binary significance claims.

The Role of Visualisation in Small-Sample Research

Visualisation serves multiple purposes:

  • Exploratory: Identify patterns, outliers, and distributional features during data screening.
  • Diagnostic: Assess assumptions (normality, linearity, homoscedasticity).
  • Inferential: Display estimates, confidence intervals, and group comparisons.
  • Communicative: Convey findings to diverse audiences in accessible formats.

With small samples, visualisation is particularly valuable because individual data points can be shown (unlike large datasets where summaries are necessary). Showing raw data alongside summaries makes the variability and structure of the data visible to the reader.

Visualising Point Estimates with Confidence Intervals

Error bars (standard errors or confidence intervals) convey uncertainty. Prefer 95% CIs for inferential plots, and always state which interval the bars show: SE bars are roughly half as wide as 95% CI bars, so unlabelled bars are easy to misread. A 95% CI for a difference that excludes zero corresponds to p < 0.05 in a two-sided test, but CIs around separate group means cannot be compared by overlap alone.

Best practices:

  • Label axes clearly with units.
  • Include a legend if multiple groups are compared.
  • Use colour or shape to distinguish groups.
  • Avoid 3D effects and unnecessary decoration (chart junk).
  • Use colour-blind-safe palettes such as viridis or carefully chosen ColorBrewer palettes. Do not rely on colour alone; combine colour with labels, shapes, or direct annotation where possible.

Example: Bar Plot with Error Bars

We compare mean satisfaction scores between two campaign types with 95% CI error bars.
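A minimal sketch of this figure, assuming a data frame `study_data` with a `campaign` factor and a numeric `satisfaction` column (both names are illustrative):

```r
library(dplyr)
library(ggplot2)

# Per-group mean and 95% t-based CI half-width
summary_df <- study_data |>
  group_by(campaign) |>
  summarise(
    n    = n(),
    mean = mean(satisfaction),
    se   = sd(satisfaction) / sqrt(n()),
    ci   = qt(0.975, df = n() - 1) * se
  )

ggplot(summary_df, aes(x = campaign, y = mean, fill = campaign)) +
  geom_col(width = 0.6, alpha = 0.8) +
  geom_errorbar(aes(ymin = mean - ci, ymax = mean + ci), width = 0.15) +
  scale_fill_viridis_d(end = 0.8) +
  labs(x = "Campaign", y = "Mean satisfaction (1-5 scale)") +
  theme_classic(base_size = 11) +
  theme(legend.position = "none")
```

The CI half-width uses the t distribution with n − 1 degrees of freedom within each group, matching the interpretation below.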

Figure 18.1: Mean customer satisfaction by campaign type with 95% confidence intervals.

Interpretation: The bars show mean satisfaction for each campaign, and the error bars show 95% confidence intervals based on the t distribution within each group. These intervals describe uncertainty around each group mean. Formal group comparisons should still be reported with the planned statistical test rather than judged only by overlap of the bars.

Showing Individual Data Points

With small samples (n < 50), individual data points can be overlaid on summary plots. This reveals the distribution, identifies outliers, and shows sample size directly.

Example: Dot Plot with Mean and CI

We create a dot plot showing individual satisfaction scores, overlaid with group means and CIs.
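One way to build this figure, again assuming the illustrative `study_data` frame (note that `mean_cl_normal` requires the Hmisc package to be installed):

```r
library(ggplot2)

ggplot(study_data, aes(x = campaign, y = satisfaction, colour = campaign)) +
  # Raw observations, jittered horizontally to reduce overlap
  geom_jitter(width = 0.08, alpha = 0.6, size = 1.8) +
  # 95% t-based CI around the group mean (mean_cl_normal wraps Hmisc)
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar",
               width = 0.12, colour = "black") +
  # Group mean as a white diamond, drawn on top
  stat_summary(fun = mean, geom = "point", shape = 23, size = 3,
               fill = "white", colour = "black") +
  scale_colour_viridis_d(end = 0.8) +
  labs(x = "Campaign", y = "Satisfaction (1-5 scale)") +
  theme_classic(base_size = 11) +
  theme(legend.position = "none")
```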

Figure 18.2: Individual satisfaction scores with group means and 95% confidence intervals.

Interpretation: Each dot represents one participant. The diamond shows the group mean, and the error bars show the 95% CI. Readers can see the distribution of individual scores, the central tendency, and the precision of the estimate simultaneously.

Box Plots for Distributional Comparison

Box plots display the median, quartiles, and outliers, providing a non-parametric summary of distribution. They are particularly useful for comparing groups when data are skewed or ordinal.

Example: Box Plot Comparison

We create a box plot comparing satisfaction scores between campaigns.
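A sketch of the box plot, using the same assumed `study_data` frame. The default outlier points are suppressed because every observation is drawn separately; otherwise outliers would be plotted twice.

```r
library(ggplot2)

ggplot(study_data, aes(x = campaign, y = satisfaction, fill = campaign)) +
  geom_boxplot(alpha = 0.6, outlier.shape = NA) +  # raw points shown below instead
  geom_jitter(width = 0.08, alpha = 0.6, size = 1.8) +
  scale_fill_viridis_d(end = 0.8) +
  labs(x = "Campaign", y = "Satisfaction (1-5 scale)") +
  theme_classic(base_size = 11) +
  theme(legend.position = "none")
```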

Figure 18.3: Box plot comparing customer satisfaction across campaign types.

Interpretation: The box shows the interquartile range (IQR) with the median as a line inside. Whiskers extend to the most extreme observations within 1.5 × IQR of the box, and points beyond them are flagged as potential outliers. Overlaying individual points shows sample size and exact values. This plot is ideal for nonparametric comparisons, such as the Mann–Whitney U test.

Visualising Regression Results

For regression models, plot predicted values with confidence bands, and overlay observed data. This shows model fit, uncertainty, and deviations.

Example: Scatterplot with Regression Line and CI Band

We fit a linear regression (performance ~ experience) and plot the results.
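A sketch of this figure, assuming a data frame `hr_data` with numeric columns `experience` and `performance` (all names illustrative):

```r
library(ggplot2)

ggplot(hr_data, aes(x = experience, y = performance)) +
  geom_point(alpha = 0.7, size = 2) +
  # OLS fit with a 95% confidence band for the mean prediction
  geom_smooth(method = "lm", se = TRUE, level = 0.95, colour = "blue") +
  labs(x = "Experience (years)", y = "Performance score") +
  theme_classic(base_size = 11)
```

Note that `geom_smooth(se = TRUE)` draws a confidence band for the mean response, not a prediction interval for new observations; the latter is wider and must be computed separately with `predict(..., interval = "prediction")`.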

Figure 18.4: Linear regression of performance on experience with 95% confidence band.

Interpretation: Each point is an observed case. The blue line is the fitted regression line. The shaded band is the 95% confidence interval for the predicted mean at each value of experience. The band widens at the extremes (where data are sparse), reflecting greater uncertainty. This visualisation shows model fit, precision, and individual deviations simultaneously.

Forest Plots for Several Estimates

When a report compares several estimates, a forest plot is often clearer than several separate tables. The plot should show the point estimate, the confidence interval, and a reference line such as zero for mean differences or one for ratios. This format works well for multiple outcomes, subgroup estimates, or sensitivity analyses.
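The layout above can be sketched with `geom_pointrange`. The estimates below are hypothetical placeholders chosen only to illustrate the format; replace them with the study's own results.

```r
library(ggplot2)

# Hypothetical mean differences with 95% CI limits (illustration only)
forest_df <- data.frame(
  outcome  = c("Knowledge score", "Satisfaction", "Retention", "Engagement"),
  estimate = c(0.62, 0.35, 0.10, 0.28),
  lower    = c(0.15, -0.05, -0.30, -0.02),
  upper    = c(1.09, 0.75, 0.50, 0.58)
)

ggplot(forest_df, aes(x = estimate, y = outcome)) +
  geom_vline(xintercept = 0, linetype = "dashed") +  # reference: no difference
  geom_pointrange(aes(xmin = lower, xmax = upper)) +
  labs(x = "Mean difference (95% CI)", y = NULL) +
  theme_classic(base_size = 11)
```

For ratio estimates (odds ratios, rate ratios), move the reference line to one and consider a log-scaled x-axis.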

Figure 18.5: Forest plot of four small-sample estimates with 95% confidence intervals.

Interpretation: The forest plot makes precision visible. The knowledge-score interval stays above zero, while the other intervals include zero or values close to it. This does not make the other estimates unimportant. It shows that the small sample leaves more uncertainty about their direction or practical importance.

Raincloud and Half-Eye Plots

Raincloud or half-eye plots combine the raw observations, a distribution summary and an interval display. They are useful when a small sample is large enough to show distributional shape but small enough that individual observations should remain visible. The ggdist package provides a compact implementation. Because it is an optional visualisation package, the example below is shown as a template.

library(ggplot2)
library(ggdist)

# Assumes a data frame `study_data` with a numeric `satisfaction` column
# and a `campaign` factor; adjust the names to your own data.
ggplot(study_data, aes(x = satisfaction, y = campaign, fill = campaign)) +
  # Half-eye: a density slab summarising each group's distribution
  ggdist::stat_halfeye(adjust = 0.8, width = 0.55, alpha = 0.6) +
  # Raw observations, jittered vertically only so x-values stay exact
  geom_jitter(height = 0.08, width = 0, alpha = 0.6, size = 1.8) +
  # Group median as a white diamond
  stat_summary(fun = median, geom = "point", shape = 23, size = 2.6,
               fill = "white", orientation = "y") +
  scale_fill_viridis_d(option = "C", end = 0.8) +
  labs(x = "Satisfaction (1-5 scale)", y = "Campaign") +
  theme_classic(base_size = 11) +
  theme(legend.position = "none")

Presenting Results in Tables

Tables complement figures by providing exact values. For small samples, consider showing:

  • Sample sizes (n per group).
  • Means and standard deviations (or medians and IQRs).
  • Effect sizes and confidence intervals.
  • Test statistics and p-values.

Use simple publication tables with clear labels, sample sizes and notes that explain the inferential method.

For publication tables, align text columns left and numeric columns right, use a consistent number of decimal places within a column, and put method details in notes rather than in crowded column headings. Report exact sample sizes, avoid unnecessary trailing precision, and state whether intervals are exact, bootstrap, model-based or rank-based. If a table mixes estimates with different scales, use separate sections or clear row labels so readers do not compare incompatible numbers.

Example: Results Summary Table

We create a summary table for the campaign comparison.
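The statistics in Table 18.1 can be produced directly from the assumed `study_data` frame:

```r
library(dplyr)

# Per-group summary statistics for the table body
study_data |>
  group_by(campaign) |>
  summarise(
    n      = n(),
    Mean   = round(mean(satisfaction), 2),
    SD     = round(sd(satisfaction), 2),
    Median = median(satisfaction)
  )

# Mann-Whitney (Wilcoxon rank-sum) test reported in the table note
wilcox.test(satisfaction ~ campaign, data = study_data)
```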

Table 18.1

Summary statistics and Mann–Whitney test for campaign satisfaction

Campaign   n    Mean   SD     Median
Email      15   3.87   0.99   4
Social     15   3.27   0.80   3

Note. Mann–Whitney rank-sum test: p = 0.057.

Interpretation: The table provides exact summary statistics for each group. Readers can see sample sizes, central tendency, and variability. The p-value from the Mann–Whitney test is reported in the table note. Tables and figures together provide a complete, accessible presentation of results.

Avoiding Misleading Visualisations

Common pitfalls:

  • Suppressed zero on the y-axis: Exaggerates differences. Use a zero baseline unless there is good reason not to (and explain the choice).
  • 3D effects and unnecessary decoration: Distract from data and can obscure values.
  • Dual axes with different scales: Misleading comparisons. Avoid or use with extreme caution.
  • Overplotting without jitter or transparency: Hides overlapping points. Use jitter, transparency, or both.

Key Takeaways

Small-sample figures should show uncertainty and, whenever feasible, the individual observations behind the summary. Point estimates should be paired with confidence intervals, box plots and dot plots should reveal distributional features, and regression figures should show both fitted trends and uncertainty bands. Tables remain necessary because they provide exact sample sizes, estimates, intervals and test results. The central principle is transparency: avoid suppressed axes, 3D effects, dual axes and other design choices that make modest evidence appear stronger than it is.

Self-Assessment Quiz

Question 1. Why is visualisation particularly valuable in small-sample research?

Explanation.

Small samples make it practical to show individual observations alongside summaries. This lets readers see variability, outliers and sample size directly rather than relying only on means or p-values.

Question 2. What do 95% confidence interval error bars represent?

Explanation.

Confidence intervals show uncertainty around an estimate, such as a mean, difference, rate or regression coefficient.

Question 3. What is the advantage of overlaying individual data points on summary plots (means or medians)?

Explanation.

Showing raw data reveals patterns that summary statistics alone can hide, including skewness, ties, clusters and influential observations.

Question 4. In a regression plot with a confidence band, why does the band typically widen at the extremes?

Explanation.

Confidence bands usually widen where there are fewer observations to constrain the model, often near the edges of the predictor range.

Question 5. What information does a box plot display?

Explanation.

A box plot shows the interquartile range, the median and potential outliers. It is a compact non-parametric summary of a distribution.

Question 6. Why should suppressed zero on the y-axis be avoided (or used with caution)?

Explanation.

Starting an axis above zero can make small differences appear larger than they are. If a truncated axis is necessary, the reason should be stated clearly.

Question 7. What is “chart junk”?

Explanation.

Chart junk refers to non-data elements that reduce clarity, such as unnecessary 3D effects, excessive colour or decorative elements.

Question 8. When creating results tables, what essential information should be included?

Explanation.

Complete tables report sample size, central tendency, variability, effect size, interval estimates and the planned test result so readers can assess both statistical and practical importance.

Question 9. What does jittering accomplish in plots with many overlapping points?

Explanation.

Jittering adds small random offsets to points so overlapping observations become visible.

Question 10. In the dot plot example showing satisfaction scores, what does the diamond symbol represent?

Explanation.

The diamond marks the group mean, while the smaller dots show individual observations. Different symbols help distinguish summaries from raw data.