Calculating Pre/Post Correlation from a Paired T-Test

In this post we will go over how to convert a t-statistic from a paired t-test into a pre/post correlation. This is useful when the change score standard deviation is not reported. Pre/post correlations are needed to compute various effect sizes and their standard errors, such as the repeated measures variant of Cohen's d, also referred to as Cohen's $d_{rm}$.

Author: Matthew B. Jané

Published: September 13, 2023

Step 1: Obtain the necessary statistics

To calculate the pre/post correlation ($r$), we need the standard deviation (SD) of pre-test scores ($SD_{pre}$), the SD of post-test scores ($SD_{post}$), the mean change ($M_{change}$), the paired t-statistic ($t$), and the sample size ($N$). In this post, we assume that the change score standard deviation ($SD_{change}$) is unavailable. If the pre-test SD is available but the post-test SD is not, you can approximate the post-test SD by first taking the average ratio of the post-test SD to the pre-test SD across $k$ studies in the current meta-analysis ($\overline{SD}_{ratio}$; see the blog post on 9/8/2023) and then multiplying the pre-test SD by that average ratio,

$$SD_{post} \approx \overline{SD}_{ratio} \times SD_{pre}$$
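As a minimal sketch of this approximation, suppose we have pre-test and post-test SDs from a handful of other studies in the meta-analysis. The values and variable names below are hypothetical and only illustrate the arithmetic. The ratio is assumed to be post-test SD over pre-test SD, so that multiplying it by a pre-test SD returns a post-test SD.

```r
# Hypothetical SDs from k = 4 other studies in the meta-analysis (illustrative values)
SD_pre_k  <- c(8, 10, 12, 9)
SD_post_k <- c(10, 11, 15, 9)

# Average ratio of post-test SD to pre-test SD across the k studies
SD_ratio_mean <- mean(SD_post_k / SD_pre_k)

# Approximate the missing post-test SD for a study that reports only its pre-test SD
SD_pre_target <- 9
SD_post_approx <- SD_ratio_mean * SD_pre_target

print(round(SD_post_approx, 2))
```

With these made-up inputs the average ratio is 1.15, so a pre-test SD of 9 maps to an approximate post-test SD of 10.35.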

A rougher approximation would be to simply set the pre-test SD and post-test SD to be equal. If the study reports an F-statistic from a one-way repeated measures ANOVA, the F-statistic is equal to the square of the t-statistic,

$$t = \sqrt{F}$$

Ensure that you apply the correct sign (negative or positive) to the t-statistic, since the F-statistic is always positive.
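A small sketch of this conversion, assuming the ANOVA has only two time points (pre and post) so that F equals the squared t-statistic. The values are hypothetical, and the sign of t is taken from the sign of the mean change.

```r
# Hypothetical reported values (illustrative only)
F_stat   <- 6.25  # F-statistic from a repeated measures ANOVA with two time points
M_change <- 3     # mean change (post minus pre); its sign gives the direction

# Magnitude of t comes from sqrt(F); the sign comes from the mean change
t_stat <- sign(M_change) * sqrt(F_stat)

print(t_stat)
```

If the mean change were negative, the same code would return a negative t-statistic of the same magnitude.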

# Obtain values
M_change <- 3
SD_post <- 11
SD_pre <- 9
t <- 2.50
N <- 50

Step 2: Calculate the Pre/Post Correlation

Let's start by figuring out how to find the change score SD. The paired t-statistic is defined as the mean change score divided by the standard error of the change scores, such that,

$$t = \frac{M_{change}}{SE_{change}}$$

Since we need the change score SD, we can use the definition of the standard error of the mean to put the t-statistic in terms of $SD_{change}$:

$$SE_{change} = \frac{SD_{change}}{\sqrt{N}}$$

and therefore $t$ can be expressed as,

$$t = \frac{M_{change}}{\left(\frac{SD_{change}}{\sqrt{N}}\right)}$$

Then we just need to solve for $SD_{change}$:

$$SD_{change} = \frac{M_{change} \times \sqrt{N}}{t}$$
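As a quick numerical check of this step, the sketch below backs out the change score SD from the example values used in this post. The name `t_stat` is used here in place of `t` only to avoid masking R's transpose function.

```r
# Example values from Step 1
M_change <- 3
t_stat   <- 2.50
N        <- 50

# Change score SD implied by the paired t-statistic
SD_change <- M_change * sqrt(N) / t_stat

print(round(SD_change, 3))
```

This gives a change score SD of roughly 8.485, or equivalently a change score variance of 72.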

Okay, so now let us recall the definition of the change score SD from the blog post on 9/8/2023. In that post we discussed how to obtain the pre/post correlation from the change score SD. Now that we have converted $t$ to $SD_{change}$, we can solve for the correlation in a similar way. First things first, the change score SD can be defined as,

$$SD_{change} = \sqrt{SD_{pre}^2 + SD_{post}^2 - 2 \times r \times SD_{pre} \times SD_{post}}$$

We can re-arrange this to isolate the pre/post correlation (r),

$$r = \frac{SD_{pre}^2 + SD_{post}^2 - SD_{change}^2}{2 \times SD_{pre} \times SD_{post}}$$

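In our case, the study did not report the change score SD, so we can replace it with the $SD_{change}$ derived from the paired t-test:

$$r = \frac{SD_{pre}^2 + SD_{post}^2 - \left(\frac{M_{change} \times \sqrt{N}}{t}\right)^2}{2 \times SD_{pre} \times SD_{post}}$$

Let's neaten this formulation up a tad by multiplying the numerator and denominator by $t^2$:

$$r = \frac{t^2 \left(SD_{pre}^2 + SD_{post}^2\right) - N \times M_{change}^2}{2 \times t^2 \times SD_{pre} \times SD_{post}}$$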

Isn't that just a beautiful thing? So there you have it: the full equation for the pre/post correlation from a paired t-test! Note that this is a direct conversion and not merely an approximation, as long as the inputs themselves are exact rather than rounded or approximated.

# Calculate pre/post correlation
r <- (t^2*(SD_pre^2 + SD_post^2) - N * M_change^2) / (2*t^2*SD_pre*SD_post)

# Print results
print(paste0('r = ',round(r,3)))
[1] "r = 0.657"
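For repeated use across many studies, the formula can be wrapped in a small function. This is only a sketch: `r_from_paired_t` is a hypothetical helper name, not a function from any package, and the range check is an added safeguard for cases where rounded or approximated inputs push the result outside the valid range for a correlation.

```r
# Hypothetical helper (not from any package): pre/post correlation from a paired t-test
r_from_paired_t <- function(t_stat, M_change, SD_pre, SD_post, N) {
  if (t_stat == 0) stop("t_stat must be non-zero")
  r <- (t_stat^2 * (SD_pre^2 + SD_post^2) - N * M_change^2) /
    (2 * t_stat^2 * SD_pre * SD_post)
  # Rounded or approximated inputs can yield an impossible correlation
  if (abs(r) > 1) {
    warning("Computed r is outside [-1, 1]; check the reported statistics")
  }
  r
}

# Same example values as above
r_example <- r_from_paired_t(t_stat = 2.50, M_change = 3,
                             SD_pre = 9, SD_post = 11, N = 50)
print(round(r_example, 3))
```

Called with the example values, the function reproduces the correlation of 0.657 obtained above.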

Applying it to a simulated dataset

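We can simulate correlated pre/post scores from a bivariate Gaussian with known parameters. Because `empirical = TRUE` forces the sample means and covariance matrix to match the specified parameters, the correlation recovered from the t-statistic should equal the true value exactly!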

# install.packages('MASS')
library(MASS)

# Define parameters
SD_pre <- 9
SD_post <- 11
r_true <- .70
M_pre <- 20
M_post <- 25
N <- 100

# Simulate correlated pre/post scores from bivariate gaussian
# Sigma is supplied as a 2 x 2 covariance matrix
data <- mvrnorm(n = N,
                mu = c(M_pre, M_post),
                Sigma = matrix(c(SD_pre^2, r_true*SD_pre*SD_post,
                                 r_true*SD_pre*SD_post, SD_post^2),
                               nrow = 2),
                empirical = TRUE)

# Obtain simulated scores
x_pre <- data[,1] # Pre-test scores
x_post <- data[,2] # Post-test scores
x_change <- x_post - x_pre # Calculate change scores

# Calculate standard deviations, t-stats, and mean change
SD_pre <- sd(x_pre)
SD_post <- sd(x_post)
t <- mean(x_change) / (sd(x_change)/sqrt(N))
M_change <- mean(x_change)

# Calculate pre/post correlation
r <- (t^2*(SD_pre^2 + SD_post^2) - N * M_change^2) / (2*t^2*SD_pre*SD_post)

# print results
print(paste0('r = ',r))
[1] "r = 0.7"
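As a further check that does not rely on `empirical = TRUE`, the sketch below generates ordinary random pre/post scores with base R, takes the t-statistic from `t.test()`, and compares the recovered correlation with `cor()`. The seed and data-generating values are arbitrary choices for illustration.

```r
set.seed(123)  # arbitrary seed so the sketch is reproducible
N <- 100

# Correlated pre/post scores built with base R only (illustrative parameters)
x_pre  <- rnorm(N, mean = 20, sd = 9)
x_post <- 25 + 0.8 * (x_pre - 20) + rnorm(N, mean = 0, sd = 7)

# Statistics a study would typically report
t_stat   <- unname(t.test(x_post, x_pre, paired = TRUE)$statistic)
M_change <- mean(x_post - x_pre)
SD_pre   <- sd(x_pre)
SD_post  <- sd(x_post)

# Correlation recovered from the paired t-test versus the directly computed one
r_recovered <- (t_stat^2 * (SD_pre^2 + SD_post^2) - N * M_change^2) /
  (2 * t_stat^2 * SD_pre * SD_post)
r_direct <- cor(x_pre, x_post)

print(all.equal(r_recovered, r_direct))
```

The two values agree up to floating point error, which is what we expect from a direct conversion applied to unrounded sample statistics.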