Calculating Pre/Post Correlation from a Paired T-Test

In this post we will go over how to convert a t-statistic from a paired t-test into a pre/post correlation. This is useful when change score standard deviations are unknown. Pre/post correlations are necessary to obtain various types of effect sizes and standard errors, such as the repeated measures variant of Cohen’s $d$, also referred to as Cohen’s $d_{rm}$.

Author

Matthew B. Jané

Published

September 13, 2023

Step 1: Obtain the necessary statistics

To calculate the pre/post correlation ($r$), we need the standard deviation (SD) of pre-test scores ($SD_{pre}$), the SD of post-test scores ($SD_{post}$), the mean change ($M_{change}$), the paired t-statistic ($t$), and the sample size ($N$). In this blog post, we are assuming that the change score standard deviation ($SD_{change}$) is unavailable to us. If the pre-test SD is available but the post-test SD is not, you can approximate the post-test SD in two steps. First, take the average ratio of post-test SD to pre-test SD from $k$ studies in the current meta-analysis ($\overline{SD}_{ratio}$; see the blog post on 9/8/2023). Then multiply the pre-test SD by the average SD ratio,

$SD_{post} \approx \overline{SD}_{ratio} \times SD_{pre}$
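As a minimal sketch of this approximation, the snippet below uses made-up SDs from four hypothetical studies (the values and variable names are assumptions for illustration only). It takes the ratio as post-test SD divided by pre-test SD, which is what the formula above implies.

```r
# Hypothetical SDs from k = 4 other studies in the meta-analysis
# (illustrative values only, not real data)
SD_pre_k  <- c(8, 10, 12, 9)
SD_post_k <- c(10, 11, 15, 9)

# Average post/pre SD ratio across the k studies
SD_ratio_mean <- mean(SD_post_k / SD_pre_k)

# Approximate the missing post-test SD for the study of interest
SD_pre <- 9
SD_post_approx <- SD_ratio_mean * SD_pre

print(round(SD_post_approx, 2))
```

Here the average ratio is 1.15, so a pre-test SD of 9 gives an approximate post-test SD of about 10.35.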

A rougher approximation would be to simply set the pre-test SD and post-test SD to be equal.

If the study reports an F-statistic from a one-way repeated measures ANOVA with two time points (pre and post), the F-statistic is equal to the square of the paired t-statistic, so

$|t| = \sqrt{F}$

Since the F-statistic is always positive, make sure you apply the correct sign (negative or positive) to the t-statistic, matching the sign of the mean change.
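The F-to-t conversion can be sketched as follows. The reported values here are hypothetical, and taking the sign from the mean change (post minus pre) is an assumption about how the change score was defined in the study.

```r
# Hypothetical reported values (illustrative only)
F_stat   <- 6.25  # F from a two-time-point repeated measures ANOVA
M_change <- 3     # reported mean change (post minus pre)

# The magnitude of t is the square root of F;
# the sign is borrowed from the mean change
t_stat <- sign(M_change) * sqrt(F_stat)

print(t_stat)
```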

# Obtain values
M_change <- 3
SD_post <- 11
SD_pre <- 9
t <- 2.50
N <- 50

Step 2: Calculate the Pre/Post Correlation

Let's start by figuring out how to find the change score SD. The paired t-statistic is defined as the mean change score divided by the standard error of the change scores, such that,

$t = \frac{M_{change}}{SE_{change}}$

Since we need the change score SD, we can use the definition of the standard error of the mean to put the t-statistic in terms of $SD_{change}$:

$SE_{change} = \frac{SD_{change}}{\sqrt{N}}$

and therefore $t$ can be expressed as,

$t = \frac{M_{change}}{\left(\frac{SD_{change}}{\sqrt{N}}\right)}$

Then we just need to solve for $SD_{change}$:

$SD_{change} = \frac{M_{change} \times \sqrt{N}}{t}$
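As a quick check of this intermediate step, here is a small sketch that recovers the change score SD from the example values used in this post (mean change of 3, t of 2.50, N of 50).

```r
# Example values from Step 1
M_change <- 3
t <- 2.50
N <- 50

# Change score SD implied by the paired t-statistic
SD_change <- M_change * sqrt(N) / t

print(round(SD_change, 3))
```

With these values the implied change score SD is the square root of 72, or roughly 8.485.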

Now let us recall the definition of the change score SD from the blog post on 9/8/2023. In that post we discussed how to obtain the pre/post correlation from the change score SD. Now that we have converted $t$ to $SD_{change}$, we can solve for the correlation in a similar way. First things first, the change score SD can be defined as,

$SD_{change} = \sqrt{SD_{pre}^{2} + SD_{post}^{2} - 2 \times r \times SD_{pre} \times SD_{post}}$

We can re-arrange this to isolate the pre/post correlation ( $r$ ),

$r = \frac{SD_{pre}^{2} + SD_{post}^{2} - SD_{change}^{2}}{2 \times SD_{pre} \times SD_{post}}$

In our case, the study did not report the change score SD, so we can replace it with the $SD_{change}$ we derived from the paired t-test:

$r = \frac{SD_{pre}^{2} + SD_{post}^{2} - \left(\frac{M_{change} \times \sqrt{N}}{t}\right)^{2}}{2 \times SD_{pre} \times SD_{post}}$

Let's neaten this formulation up a tad by multiplying the numerator and denominator by $t^{2}$:

$r = \frac{t^{2} (SD_{pre}^{2} + SD_{post}^{2}) - N \times M_{change}^{2}}{2 \times t^{2} \times SD_{pre} \times SD_{post}}$

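Isn’t that just a beautiful thing? So there you have it: the full equation for the pre/post correlation from a paired t-test! Note that this is a direct conversion and not merely an approximation, although heavily rounded input statistics will carry their rounding error into $r$. Because $t$ and $M_{change}$ enter only as squares, the result does not depend on the sign of either.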

# Calculate pre/post correlation
r <- (t^2*(SD_pre^2 + SD_post^2) - N * M_change^2) / (2*t^2*SD_pre*SD_post)

# Print results
print(paste0('r = ',round(r,3)))

[1] "r = 0.657"
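If you need to apply the conversion to several studies, it can be wrapped in a small helper. This is only a sketch: the function name `r_from_paired_t` and its input checks are assumptions added for illustration, not part of the derivation above. The warning is there because rounded or inconsistent reported statistics can push the result outside the valid range for a correlation.

```r
# Hypothetical helper (name and checks are illustrative assumptions)
r_from_paired_t <- function(M_change, SD_pre, SD_post, t, N) {
  if (t == 0) stop("t must be non-zero to recover the change score SD")

  # Change score SD implied by the paired t-statistic
  SD_change <- M_change * sqrt(N) / t

  # Pre/post correlation from the pre, post, and change score SDs
  r <- (SD_pre^2 + SD_post^2 - SD_change^2) / (2 * SD_pre * SD_post)

  if (abs(r) > 1) warning("r is outside [-1, 1]; check the reported statistics")
  r
}

# Same example values as above
r_example <- r_from_paired_t(M_change = 3, SD_pre = 9, SD_post = 11, t = 2.50, N = 50)
print(round(r_example, 3))
```

This two-step route (t to change score SD, then change score SD to correlation) gives the same 0.657 as the single combined equation.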

Applying it to a simulated dataset

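We can simulate correlated pre/post scores from a bivariate Gaussian with known parameters. Setting empirical = TRUE forces the sample means and covariance matrix to match the specified values exactly, so the calculated correlation is exactly correct!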

# install.packages('MASS')
library(MASS)

# Define parameters
SD_pre <- 9
SD_post <- 11
r_true <- .70
M_pre <- 20
M_post <- 25
N <- 100

# Simulate correlated pre/post scores from bivariate gaussian
# (Sigma is passed as a 2 x 2 covariance matrix)
data <- mvrnorm(n=N,
               mu=c(M_pre,M_post),
               Sigma = matrix(c(SD_pre^2, r_true*SD_pre*SD_post,
                                r_true*SD_pre*SD_post, SD_post^2),
                              nrow = 2),
               empirical = TRUE)

# Obtain simulated scores
x_pre <- data[,1] # Pre-test scores
x_post <- data[,2] # Post-test scores
x_change <- x_post - x_pre # Calculate change scores

# Calculate standard deviations, t-stats, and mean change
SD_pre <- sd(x_pre)
SD_post <- sd(x_post)
t <- mean(x_change) / (sd(x_change)/sqrt(N))
M_change <- mean(x_change)

# Calculate pre/post correlation
r <- (t^2*(SD_pre^2 + SD_post^2) - N * M_change^2) / (2*t^2*SD_pre*SD_post)

# print results
print(paste0('r = ',r))

[1] "r = 0.7"
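As a further check, the sketch below repeats the simulation without forcing the sample moments to match the population values (empirical = FALSE) and compares the converted value against the sample correlation from `cor()`. The seed is arbitrary, and the variable names `scores`, `t_stat`, and `r_from_t` are assumptions for this example. The paired t-statistic is taken from base R's `t.test()` to mimic what a study would report.

```r
library(MASS)

set.seed(1)  # arbitrary seed, for reproducibility only
N <- 100
Sigma <- matrix(c(9^2, .70 * 9 * 11,
                  .70 * 9 * 11, 11^2), nrow = 2)

# Sample moments now differ from the population values
scores <- mvrnorm(n = N, mu = c(20, 25), Sigma = Sigma, empirical = FALSE)
x_pre <- scores[, 1]
x_post <- scores[, 2]
x_change <- x_post - x_pre

# Paired t-statistic as a study would report it
t_stat <- unname(t.test(x_post, x_pre, paired = TRUE)$statistic)

# Convert the t-statistic to the pre/post correlation
r_from_t <- (t_stat^2 * (sd(x_pre)^2 + sd(x_post)^2) - N * mean(x_change)^2) /
  (2 * t_stat^2 * sd(x_pre) * sd(x_post))

# Compare against the directly computed sample correlation
print(all.equal(r_from_t, cor(x_pre, x_post)))
```

The converted value should agree with the sample correlation up to floating-point error, even though the sample correlation itself is no longer exactly .70.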