Monday, February 21, 2011

Does Spending on Education Improve SAT scores?

In case you have been living under a rock recently (as I seemingly have, given my lack of posting), a big teachers' union battle is playing out daily in Wisconsin. The Republican-controlled government wants teachers to contribute more toward their benefits packages, and the unions and many teachers are fighting hard against this.

For years, we in the private sector have had to contribute to our retirement and benefits packages, but this seems novel to many of these teachers. In their defense, though, many independent sources argue that spending more money on education would improve American academic performance, which is very clearly an issue. But do the data support this?

My interest was piqued upon coming across a post on this blog, which highlights state spending and state SAT scores. While the source does show SAT scores and spending, the author clearly underestimates the effect of participation rate on the individual states' scores: where only a small share of students take the SAT, the test-takers tend to be a self-selected group, which leads to sample bias. Even the biggest defender of educational spending cuts would not argue that Kentucky has a better system than Massachusetts or Vermont.

So I took on the task of modeling each individual section score, and the aggregate total of all three, as dependent on each state's spending and participation rate. In order to do this, I created this spreadsheet.
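For readers who want to follow along, here is a minimal sketch of how the transformed columns used in the models could be built in R. The data frame name `sat` and the three rows of numbers are made up for illustration and are not the spreadsheet's contents. Participation rate is assumed to be stored as a fraction between 0 and 1, which is what the size of the fitted coefficients suggests.

```r
# Hypothetical stand-in for the spreadsheet: three invented rows, not real state data.
sat <- data.frame(
  state    = c("A", "B", "C"),
  partrate = c(0.05, 0.40, 0.85),    # assumed: share of students taking the SAT, 0 to 1
  spend    = c(8000, 10000, 14000),  # assumed: per-pupil spending in dollars
  math     = c(570, 520, 500),
  stringsAsFactors = FALSE
)

# Derived regressors, named the way they appear in the model formulas.
sat$partrate_sq <- sat$partrate^2
sat$partrate_ln <- log(sat$partrate)  # log() is the natural logarithm in R
sat$spend_sq    <- sat$spend^2
sat$spend_ln    <- log(sat$spend)

print(sat)
```

The console commands in this post refer to the columns by bare name, which would work after something like `attach(sat)` or by passing `data = sat` to `lm()`.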

I started with the subject dear to my heart (and one we should be focusing more on in this country), mathematics, using a simple linear model:

> lmMath <- lm(math ~ partrate + spend)
> summary(lmMath)

Call:
lm(formula = math ~ partrate + spend)

Residuals:
Min 1Q Median 3Q Max
-64.453 -15.305 -2.256 13.458 39.785

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.565e+02 1.583e+01 35.15 <2e-16 ***


At first glance, math seems (minimally) correlated with educational spending, but far more correlated with participation rate. To look for potentially better variables, we can fit models with quadratic versions of participation rate and spend, and then with natural-log transformations:

> summary(lm(math ~ partrate + partrate_sq + spend + spend_sq))

Call:
lm(formula = math ~ partrate + partrate_sq + spend + spend_sq)

Residuals:
Min 1Q Median 3Q Max
-57.029 -8.593 -1.308 12.901 30.822

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.238e+02 6.494e+01 8.065 2.37e-10 ***
partrate -2.343e+02 4.543e+01 -5.158 5.16e-06 ***
partrate_sq 1.454e+02 5.662e+01 2.569 0.0135 *
spend 1.286e-02 1.235e-02 1.041 0.3031
spend_sq -5.762e-07 5.865e-07 -0.982 0.3310
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 19.83 on 46 degrees of freedom
Multiple R-squared: 0.7871, Adjusted R-squared: 0.7686
F-statistic: 42.53 on 4 and 46 DF, p-value: 6.712e-15

> summary(lm(math ~ partrate + partrate_ln + spend + spend_ln))

Call:
lm(formula = math ~ partrate + partrate_ln + spend + spend_ln)

Residuals:
Min 1Q Median 3Q Max
-49.374 -8.838 -0.935 13.856 34.057

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -463.83562 909.66724 -0.510 0.61256
partrate -5.60859 32.95449 -0.170 0.86561
partrate_ln -29.31128 8.09637 -3.620 0.00073 ***
spend -0.01099 0.01116 -0.985 0.32971
spend_ln 116.34990 111.24720 1.046 0.30109
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 18.67 on 46 degrees of freedom
Multiple R-squared: 0.8114, Adjusted R-squared: 0.795
F-statistic: 49.47 on 4 and 46 DF, p-value: 4.278e-16

> summary(lm(math ~ partrate_sq + spend_sq))

Call:
lm(formula = math ~ partrate_sq + spend_sq)

Residuals:
Min 1Q Median 3Q Max
-68.990 -18.437 -2.103 19.048 48.723

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.536e+02 9.767e+00 56.687 <2e-16 ***

> summary(lm(math ~ partrate_ln + spend_ln))

Call:
lm(formula = math ~ partrate_ln + spend_ln)

Residuals:
Min 1Q Median 3Q Max
-48.2752 -9.7688 -0.1039 12.6402 35.6967

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 428.039 127.745 3.351 0.00158 **
partrate_ln -31.017 2.218 -13.984 <2e-16 ***


It seems like every variable except spend_ln showed some correlation in at least one of the models, so let's throw all six variables into one model and see what happens:

> summary(lm(math ~ partrate + partrate_ln + partrate_sq + spend + spend_ln + spend_sq))

Call:
lm(formula = math ~ partrate + partrate_ln + partrate_sq + spend +
spend_ln + spend_sq)

Residuals:
Min 1Q Median 3Q Max
-47.376 -9.845 -1.071 11.698 33.868

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.978e+03 4.881e+03 -1.020 0.3133
partrate 2.468e+01 1.288e+02 0.192 0.8489
partrate_ln -3.316e+01 1.471e+01 -2.254 0.0292 *
partrate_sq -1.863e+01 9.837e+01 -0.189 0.8507
spend -1.312e-01 1.287e-01 -1.020 0.3134
spend_ln 7.033e+02 6.358e+02 1.106 0.2746
spend_sq 2.974e-06 3.164e-06 0.940 0.3524
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 18.87 on 44 degrees of freedom
Multiple R-squared: 0.8158, Adjusted R-squared: 0.7907
F-statistic: 32.48 on 6 and 44 DF, p-value: 1.287e-14


Yikes, I muddied the waters: with six closely related variables in one model, only partrate_ln remains significant. Let's look at just the log of participation rate:
> summary(lm(math ~ partrate_ln))

Call:
lm(formula = math ~ partrate_ln)

Residuals:
Min 1Q Median 3Q Max
-47.6408 -11.2827 -0.2747 13.3237 35.7253

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 491.382 4.198 117.06 <2e-16 ***


This simple model fits the best of any we have seen. Is there any way to incorporate spend into this model?
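As a rough sketch of how candidate models like these can be compared side by side, the snippet below fits three of them and pulls out the adjusted R-squared of each. The data are simulated, not the state data from the spreadsheet: 51 rows (matching the degrees of freedom in the output above), with math scores generated from the log of participation rate plus noise, so spend has no effect by construction. The name `sim` and all the simulation settings are assumptions for illustration only.

```r
set.seed(1)
n <- 51

# Simulated inputs (assumptions, not real state data).
partrate <- runif(n, 0.03, 0.95)   # fraction of students taking the SAT
spend    <- runif(n, 6000, 16000)  # per-pupil spending in dollars
math     <- 490 - 30 * log(partrate) + rnorm(n, sd = 15)

sim <- data.frame(
  math, partrate, spend,
  partrate_ln = log(partrate),
  spend_ln    = log(spend)
)

# Three candidate specifications, named for readability.
models <- list(
  linear   = lm(math ~ partrate + spend, data = sim),
  log_both = lm(math ~ partrate_ln + spend_ln, data = sim),
  log_part = lm(math ~ partrate_ln, data = sim)
)

# Adjusted R-squared penalizes extra regressors, so it is a fairer
# yardstick than plain R-squared when models differ in size.
adj_r2 <- sapply(models, function(m) summary(m)$adj.r.squared)
print(round(adj_r2, 4))
```

Swapping `sim` for the real data frame would give the same kind of comparison I did by eye from the summaries.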

No matter what, I couldn't find a way to wedge spend into SAT Math scores. The amount of spending on education seemed to have minimal effect, if any, on performance, and one can estimate the state's average SAT score rather well by the formula:

SAT_math = 491.382 - 30.795*ln(participation_rate).
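To get a feel for what this formula implies, here is a small sketch that evaluates it at a few participation rates. The function name `predict_math` is made up for illustration, and participation rate is again assumed to be a fraction between 0 and 1.

```r
# Fitted relationship quoted above; partrate is assumed to be a fraction in (0, 1].
predict_math <- function(partrate) {
  491.382 - 30.795 * log(partrate)
}

rates <- c(0.05, 0.25, 0.50, 1.00)
predictions <- data.frame(
  partrate       = rates,
  predicted_math = round(predict_math(rates), 1)
)
print(predictions)
```

Because ln(participation_rate) is negative for any rate below 100%, the model predicts higher averages for low-participation states: roughly 584 at 5% participation versus about 491 when everyone takes the test.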

But does this hold true for Reading and Writing?

The good news for the add-spending crowd is that, for Reading, spend comes out as a positive, significant variable when paired with the untransformed participation rate, as does the log-transformed version of spend when regressed with the square of participation rate.

The bad news is that the log transformation of participation rate indicates a much stronger relationship than either of those models, with an adjusted R-squared of 87.02%.

SAT_reading = 485.670 - 31.665*ln(participation_rate).

Writing shares the same characteristic: spending may improve test scores, though the sample bias in the states' scores makes this tough to decipher.

SAT_writing = 475.065 - 30.059*ln(participation_rate).

This still holds true for total SAT scores, where the log of participation rate trumps all other variables in terms of statistical significance, and the model takes the form:

SAT_total = 1452.116 - 92.519*ln(participation_rate).
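As a quick sanity check, the total-score fit should be the sum of the three section fits. Least-squares coefficients are linear in the response, so if the total column is exactly math + reading + writing (an assumption about the spreadsheet) and all four models use the same single regressor, the intercepts and slopes simply add up. A sketch using only the coefficients quoted above:

```r
# Coefficients copied from the three section formulas above.
intercepts <- c(math = 491.382, reading = 485.670, writing = 475.065)
slopes     <- c(math = -30.795, reading = -31.665, writing = -30.059)

total_intercept <- sum(intercepts)  # compare with 1452.116
total_slope     <- sum(slopes)      # compare with -92.519

print(c(intercept = total_intercept, slope = total_slope))
```

The sums agree with the total-score formula up to rounding in the third decimal place of the intercept.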

For full disclosure, though, leaving participation_rate untransformed does provide some evidence that spending improves SAT scores:

> summary(lm(total ~ partrate + spend_ln))

Call:
lm(formula = total ~ partrate + spend_ln)

Residuals:
Min 1Q Median 3Q Max
-167.0151 -37.5985 0.6148 32.2768 111.0931

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 825.01 393.85 2.095 0.0415 *
partrate -367.49 27.34 -13.442 <2e-16 ***
spend_ln 98.63 43.20 2.283 0.0269 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 55.8 on 48 degrees of freedom
Multiple R-squared: 0.7943, Adjusted R-squared: 0.7858
F-statistic: 92.7 on 2 and 48 DF, p-value: < 2.2e-16


This suggests a relationship of SAT_total = 825.01 - 367.49*participation_rate + 98.63*ln(spend). Given this, we can use SAT participation rate and per-pupil spending to derive a ranking of the best-performing states, as shown below.
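One plausible way to turn this relationship into a ranking is to compare each state's actual total with what the formula expects given its participation rate and spending, then sort by the difference. The post does not spell out the exact method, so treat the sketch below as an assumption. The function name `expected_total`, the state labels, and the scores are all made up for illustration.

```r
# Expected total from the fitted relationship quoted above.
expected_total <- function(partrate, spend) {
  825.01 - 367.49 * partrate + 98.63 * log(spend)
}

# Hypothetical states with invented numbers, not real data.
states <- data.frame(
  state    = c("A", "B", "C"),
  partrate = c(0.05, 0.40, 0.85),
  spend    = c(8000, 10000, 14000),
  total    = c(1700, 1580, 1500),
  stringsAsFactors = FALSE
)

states$expected <- expected_total(states$partrate, states$spend)
states$residual <- states$total - states$expected  # positive = better than expected

# Best over-performers first.
ranking <- states[order(-states$residual), ]
print(ranking[, c("state", "total", "expected", "residual")])
```

Ranking on the residual rather than the raw score keeps a low-participation state from topping the list simply because only its strongest students sat the test.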



So, it seems like both sides can lay claim to evidence that backs their opinions. Readers, what do you think?