Project 4. Evaluating a Reading Intervention in Small Classrooms (Education)

Background

A school piloted a reading intervention with 22 students across two grades and two teachers. Each student completed a pre-test and a post-test. The paired design is appropriate because each student serves as their own baseline, but with a single school, two teachers, and no comparison group, the results generalise poorly beyond the participating classrooms.

Descriptive Summary

Table P4.1

Reading-score summary for the classroom pilot

| Quantity | Value |
|---|---|
| Students | 22 |
| Mean pre-test score (points) | 65.6 |
| Mean post-test score (points) | 73.1 |
| Mean improvement (points) | 7.6 |
| Median improvement (points) | 8.0 |
| SD of improvement (points) | 3.7 |

Note. Improvement is post-test minus pre-test; positive values favour the intervention period.

Figure P4.1: Student reading scores before and after the classroom intervention.

Most student trajectories increase from pre-test to post-test. The connected-line figure is useful because it shows both the average gain and the variability in individual responses.

Primary Analysis

Table P4.2

Paired reading-score analysis

| Analysis | Estimate | 95% CI | p-value |
|---|---|---|---|
| Paired t-test | Mean improvement = 7.56 | 5.92 to 9.20 | <0.001 |
| Wilcoxon signed-rank | Pseudomedian improvement = 7.80 | 5.90 to 9.40 | <0.001 |
| Shapiro-Wilk on improvements | W = 0.96 | -- | 0.432 |
| Paired standardised mean change | dz = 2.04 | Not computed | -- |

Note. The signed-rank test is included as a sensitivity check. The Shapiro-Wilk test is reported descriptively because normality tests have limited power in small samples, so a non-significant result (p = 0.432) does not by itself confirm normality.

The estimated mean improvement is about 7.6 points, and both the t-test and the signed-rank sensitivity analysis support a positive gain. The paired standardised change (dz = 2.04, the mean improvement divided by the SD of the improvements) is large, but it should be interpreted as pilot evidence from a specific school context rather than as a stable population effect.
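The interval and standardised change in Table P4.2 can be checked from the reported summary statistics alone. The sketch below is a minimal illustration, not the analysis code used for the project: the function name `paired_summary` is hypothetical, the critical value 2.080 for 21 degrees of freedom is hard-coded from standard t tables, and the SD is the rounded value 3.7 from Table P4.1, so the t statistic it returns is only approximate.

```python
import math

# Summary values as reported in Tables P4.1 and P4.2.
N_STUDENTS = 22
MEAN_IMPROVEMENT = 7.56
SD_IMPROVEMENT = 3.7  # rounded to one decimal in the source table

# Two-sided 95% critical value of Student's t with 21 degrees of freedom.
# Assumption: taken from standard tables rather than computed here.
T_CRIT_DF21 = 2.080


def paired_summary(mean_diff, sd_diff, n, t_crit):
    """Return SE, 95% CI, t statistic and dz for paired differences."""
    se = sd_diff / math.sqrt(n)      # standard error of the mean difference
    margin = t_crit * se             # half-width of the confidence interval
    return {
        "se": se,
        "ci_low": mean_diff - margin,
        "ci_high": mean_diff + margin,
        "t_stat": mean_diff / se,    # approximate, because sd_diff is rounded
        "dz": mean_diff / sd_diff,   # paired standardised mean change
    }


result = paired_summary(MEAN_IMPROVEMENT, SD_IMPROVEMENT, N_STUDENTS, T_CRIT_DF21)
print(f"95% CI: {result['ci_low']:.2f} to {result['ci_high']:.2f}")
print(f"dz: {result['dz']:.2f}")
```

Under these assumptions the recomputed interval and dz agree with the tabled values to two decimals, which is a useful consistency check before the results are reported.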

Distribution and Classroom Context

Figure P4.2: Distribution of individual reading-score improvements.

Table P4.3

Descriptive improvements by grade and teacher

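| Grade | Teacher | n | Mean improvement (points) | Median improvement (points) |
|---|---|---|---|---|
| 3rd | Mr. Lee | 4 | 8.93 | 10.5 |
| 3rd | Ms. Johnson | 2 | 9.85 | 9.9 |
| 4th | Mr. Lee | 7 | 6.41 | 8.3 |
| 4th | Ms. Johnson | 9 | 7.34 | 6.9 |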

Note. These cells are too small for formal teacher or grade comparisons; they are included to support contextual interpretation.

The classroom summaries help readers judge whether the overall gain appears concentrated in one grade or teacher. Because the cells are small and not independently randomised, the table should be treated as implementation context rather than evidence of differential effectiveness.
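One quick consistency check on Table P4.3 is that the cell means, weighted by cell size, should reproduce the overall mean improvement. The sketch below does this with the tabled values only; the names `CELLS`, `pooled_mean`, and `pooled_by` are hypothetical helpers introduced for illustration, and the marginal means it produces are descriptive, not evidence of a grade or teacher effect.

```python
from collections import defaultdict

# (grade, teacher, n, mean improvement) as reported in Table P4.3.
CELLS = [
    ("3rd", "Mr. Lee", 4, 8.93),
    ("3rd", "Ms. Johnson", 2, 9.85),
    ("4th", "Mr. Lee", 7, 6.41),
    ("4th", "Ms. Johnson", 9, 7.34),
]


def pooled_mean(rows):
    """Return (total n, n-weighted mean) across the given cells."""
    total_n = sum(n for _, _, n, _ in rows)
    weighted_sum = sum(n * mean for _, _, n, mean in rows)
    return total_n, weighted_sum / total_n


def pooled_by(rows, key_index):
    """Pool cells by grade (key_index=0) or teacher (key_index=1)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    return {key: pooled_mean(group) for key, group in groups.items()}


total_n, overall = pooled_mean(CELLS)
print(f"All cells: n = {total_n}, weighted mean = {overall:.2f}")

for label, index in (("grade", 0), ("teacher", 1)):
    for key, (n, mean) in pooled_by(CELLS, index).items():
        print(f"By {label}: {key}, n = {n}, weighted mean = {mean:.2f}")
```

The weighted mean across all four cells matches the 7.56 points reported in Table P4.2, and the cell sizes sum to the 22 students in Table P4.1.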

Reporting Summary

Reading scores increased from pre-test to post-test by a mean of 7.56 points, 95% CI [5.92, 9.20], p < 0.001. The result is promising for the participating classrooms, but without a comparison group the gain cannot be attributed to the intervention alone. A stronger study would include a comparison group, prespecified fidelity checks, and a larger sample of classrooms.

Extension Task

Create a forest plot of mean improvement by grade or teacher using the descriptive summaries in Table P4.3. Add a note explaining why the plot is useful for implementation review but not sufficient for formal subgroup inference.
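One possible starting point for this task is sketched below. Table P4.3 reports cell means and sizes but no cell SDs, so per-cell confidence intervals cannot be drawn from the table alone; this sketch therefore plots point estimates only, sizes each marker by n, and shades the overall 95% CI from Table P4.2 as a reference band. The function names `forest_rows` and `draw_forest` and the default output file name are hypothetical, and matplotlib is assumed to be installed.

```python
# (grade, teacher, n, mean improvement) as reported in Table P4.3.
CELLS = [
    ("3rd", "Mr. Lee", 4, 8.93),
    ("3rd", "Ms. Johnson", 2, 9.85),
    ("4th", "Mr. Lee", 7, 6.41),
    ("4th", "Ms. Johnson", 9, 7.34),
]

# Overall estimate and 95% CI as reported in Table P4.2.
OVERALL_MEAN = 7.56
OVERALL_CI = (5.92, 9.20)


def forest_rows(cells):
    """Build (label, n, mean) rows, one per grade-by-teacher cell."""
    return [
        (f"{grade}, {teacher} (n={n})", n, mean)
        for grade, teacher, n, mean in cells
    ]


def draw_forest(rows, overall_mean=OVERALL_MEAN, overall_ci=OVERALL_CI,
                path="forest_p4.png"):
    """Draw a forest-style dot plot of cell means and save it to `path`."""
    # Imported here so the data helpers work without a plotting backend.
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    labels = [label for label, _, _ in rows]
    sizes = [30 * n for _, n, _ in rows]   # marker area scales with cell size
    means = [mean for _, _, mean in rows]
    y_pos = list(range(len(rows)))

    fig, ax = plt.subplots(figsize=(6, 3))
    ax.axvspan(overall_ci[0], overall_ci[1], alpha=0.15,
               label="Overall 95% CI")
    ax.axvline(overall_mean, linestyle="--", label="Overall mean")
    ax.scatter(means, y_pos, s=sizes, zorder=3)
    ax.set_yticks(y_pos)
    ax.set_yticklabels(labels)
    ax.invert_yaxis()                      # first table row at the top
    ax.set_xlabel("Mean improvement (points)")
    ax.legend(loc="lower right", fontsize="small")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `draw_forest(forest_rows(CELLS))` writes the figure to the assumed file name. If student-level improvements are available, each cell's own interval could be added as a horizontal error bar, although with cells of 2 to 9 students those intervals would be wide, which is itself the reason the plot supports implementation review rather than formal subgroup inference.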