Scale Back Linear Regression Coefficients in R from Scaled and Centered Data

Scale back linear regression coefficients in R from scaled and centered data

If I understand your description (which is unfortunately code-free at the moment), you are getting standardized regression coefficients for Y ~ As + Bs*Xs, where all of the "s" items are scaled variables. The coefficients are then the predicted change, on a standard-deviation scale of Y, associated with a change in X of one standard deviation of X. The scale function will have recorded the means and standard deviations as attributes of the scaled object; if not, you will have those estimates somewhere in your console log. The estimated change dY for a change dX in X should satisfy: dY*(1/sdY) = Bs*dX*(1/sdX). Predictions should be something along these lines:

Yest = As*(sdY) + Ymn + Bs*(Xs)*(sdY)

You probably should not have needed to standardize the Y values, and I'm hoping that you didn't, because leaving Y unscaled makes dealing with the adjustment for the means of the X's easier. Put some code and example data in the question if you want implemented and checked answers. I think @DanielGerlance is correct in saying to multiply rather than divide by the SD's.
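For concreteness, here is a minimal sketch of that back-transformation with made-up data (all object names are placeholders, since the question itself is code-free):

set.seed(101)
X <- rnorm(50)
Y <- 2 + 3 * X + rnorm(50)
Ys <- scale(Y)   ## standardized response; mean/sd stored as attributes
Xs <- scale(X)   ## standardized predictor
fit <- lm(drop(Ys) ~ drop(Xs))     ## As = coef(fit)[1], Bs = coef(fit)[2]
Ymn <- attr(Ys, "scaled:center")   ## mean of Y recorded by scale()
sdY <- attr(Ys, "scaled:scale")    ## sd of Y recorded by scale()
Yest <- c((coef(fit)[1] + coef(fit)[2] * Xs) * sdY + Ymn)  ## back on the raw Y scale
all.equal(Yest, unname(fitted(lm(Y ~ X))))                 ## TRUE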

Back-transform coefficients from glmer with scaled independent variables for prediction

The problem with your approach is that it only "unscales" based on the wt variable, whereas you scaled all of the variables in your regression model. One approach that works is to adjust all of the variables in your new (prediction) data frame using the centering/scaling values that were used on the original data frame:

## scale variable x using center/scale attributes
## of variable y
scfun <- function(x, y) {
  scale(x,
        center = attr(y, "scaled:center"),
        scale = attr(y, "scaled:scale"))
}
## scale prediction frame
xweight_sc <- transform(xweight,
                        wt = scfun(wt, database$wt),
                        am = scfun(am, database$am))
## predict
p_unsc <- predict(model.1,
                  newdata = xweight_sc,
                  type = "response", re.form = NA)

Comparing this p_unsc to your predictions from the unscaled model (b in your code), i.e. all.equal(b,p_unsc), gives TRUE.

Another reasonable approach would be to

  • unscale/uncenter all of your parameters using the "unscaling" approaches presented in one of the linked questions (such as this one), generating a coefficient vector beta_unsc
  • construct the appropriate model matrix from your prediction frame:
X <- model.matrix(formula(model, fixed.only = TRUE),
                  data = pred_frame)
  • compute the linear predictor and back-transform:
pred <- plogis(X %*% beta_unsc)
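There is no self-contained glmer example here, but the same model-matrix recipe can be sanity-checked with an ordinary logistic glm (a minimal sketch with made-up data; fit, pred_frame, etc. are hypothetical names):

set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, 1, plogis(d$x))          ## simulate a binary response
fit <- glm(y ~ x, family = binomial, data = d)
pred_frame <- data.frame(x = seq(-2, 2, length.out = 5))
X <- model.matrix(~ x, data = pred_frame)   ## one-sided formula: no response needed
pred <- plogis(X %*% coef(fit))             ## linear predictor, then inverse link
all.equal(c(pred),
          unname(predict(fit, newdata = pred_frame, type = "response")))  ## TRUE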

How to unscale the coefficients from an lmer()-model fitted with a scaled response

Updated: generalized to allow for scaling of the response as well as the predictors.

Here's a fairly crude implementation.

If our original (unscaled) regression is

Y0 = b0 + b1*x1 + b2*x2 + ...

Then our scaled regression is

(Y0 - mu0)/s0 = b0' + b1'*(x1 - mu1)/s1 + b2'*(x2 - mu2)/s2 + ...

This is equivalent to

Y0 = mu0 + s0*((b0' - b1'*mu1/s1 - b2'*mu2/s2 - ...) + (b1'/s1)*x1 + (b2'/s2)*x2 + ...)

So bi = s0*bi'/si for i>0 and

b0 = s0*b0'+mu0-sum(bi*mui)

Implement this:

rescale.coefs <- function(beta, mu, sigma) {
  beta2 <- beta   ## inherit names etc.
  beta2[-1] <- sigma[1] * beta[-1] / sigma[-1]
  beta2[1] <- sigma[1] * beta[1] + mu[1] - sum(beta2[-1] * mu[-1])
  beta2
}

Try it out for a linear model:

m1 <- lm(Illiteracy ~ ., as.data.frame(state.x77))
b1 <- coef(m1)

Make a scaled version of the data:

ss <- scale(state.x77)

Scaled coefficients:

m1S <- update(m1, data = as.data.frame(ss))
b1S <- coef(m1S)

Now try out rescaling:

icol <- which(colnames(state.x77) == "Illiteracy")
p.order <- c(icol, (1:ncol(state.x77))[-icol])
m <- colMeans(state.x77)[p.order]
s <- apply(state.x77, 2, sd)[p.order]
all.equal(b1, rescale.coefs(b1S, m, s))  ## TRUE

This assumes that both the response and the predictors are scaled.

  • If you scale only the response and not the predictors, then you should submit c(mean(response), rep(0, ...)) for m and c(sd(response), rep(1, ...)) for s (i.e., m and s are the values by which the variables were shifted and scaled); a worked check of this case follows this list.
  • If you scale only the predictors and not the response, then submit c(0, colMeans(predictors)) for m and c(1, apply(predictors, 2, sd)) for s.
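For example, here is a quick check of the response-only case, continuing the state.x77 example from above:

d0 <- as.data.frame(state.x77)
d1 <- transform(d0, Illiteracy = drop(scale(Illiteracy)))  ## scale only the response
m1Y <- update(m1, data = d1)
k <- length(coef(m1Y))                       ## intercept + 7 predictors
mY <- c(mean(d0$Illiteracy), rep(0, k - 1))  ## only the response was shifted
sY <- c(sd(d0$Illiteracy), rep(1, k - 1))    ## only the response was scaled
all.equal(b1, rescale.coefs(coef(m1Y), mY, sY))  ## TRUE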

Unscale coefficient of scaled continuous variable in negative binomial regression

Set up data:

set.seed(1)
dep <- dnbinom(1:150, size = 150, prob = 0.75)
ind.1 <- ifelse(sign(rnorm(150))==-1,0,1)
ind.2 <- rnorm(150, 10, 1.7)
df <- data.frame(dep, ind.1, ind.2)
sc <- sd(df$ind.2)

Fit unscaled and scaled models:

m_unsc <- MASS::glm.nb(dep ~ ind.1 + ind.2, data = df)
m_sc <- update(m_unsc, data = transform(df, ind.2 = drop(scale(df$ind.2))))

Compare coefficients:

cbind(coef(m_unsc), coef(m_sc))
                   [,1]        [,2]
(Intercept) -5.50449624 -5.13543854
ind.1        0.24445805  0.24445805
ind.2        0.03662308  0.06366992

Check equivalence: we divide the scaled coefficient by the scaling factor (sc = sd(ind.2)) to get back the unscaled coefficient:

all.equal(coef(m_sc)["ind.2"]/sc, coef(m_unsc)["ind.2"])

The negative binomial model uses a log link, not a logit link, so if you want to back-transform the coefficient to get proportional or "fold" changes per unit of ind.2:

exp(coef(m_sc)["ind.2"]/sc)

this gives 1.0373, i.e. about a 3.7% increase in the response per unit change in ind.2 (you can confirm that it's the same as exponentiating the unscaled coefficient, as shown below).
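For example, using the unscaled model directly:

exp(coef(m_unsc)["ind.2"])  ## also 1.0373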

Note that 2/3 of the answers in the linked question, including the currently accepted answer, are wrong: you should be dividing the scaled coefficient by the scaling factor, not multiplying ...

Unscale predictor coefficients from an lmer() model fit with an unscaled response

The scale function does a z-transform of the data, which means it takes the original values, subtracts the mean, and then divides by the standard deviation.

to_scale <- 1:10
using_scale <- scale(to_scale, center = TRUE, scale = TRUE)
by_hand <- (to_scale - mean(to_scale))/sd(to_scale)
identical(as.numeric(using_scale), by_hand)
[1] TRUE

Therefore, to reverse the scaling of a model coefficient, all you need to do is divide the coefficient by the standard deviation of the covariate; the stored mean only matters if you want to back-transform data values or adjust the intercept. The scale function holds onto the mean and sd for you. So, if we assume that your covariate values are the using_scale vector for the scale.t6 regression coefficient, we can write a function to do the work for us.

get_real <- function(coef, scaled_covariate){

  # collect mean and standard deviation from scaled covariate
  mean_sd <- unlist(attributes(scaled_covariate)[-1])

  # reverse the z-transformation: a slope on the scaled covariate
  # corresponds to coef/sd on the original covariate
  answer <- coef / mean_sd[2]

  # this value will have a name, remove it
  names(answer) <- NULL

  # return unscaled coef
  return(answer)
}

get_real(-0.3440, using_scale)
[1] -0.1136195

In other words, this recovers the slope you would have obtained by fitting the model to the unscaled predictor, because the z-transform is a linear transformation of the covariate.
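A quick toy check of this, with made-up x and y:

set.seed(42)
x <- rnorm(100, 50, 10)
y <- 3 + 0.5 * x + rnorm(100)
m_raw <- lm(y ~ x)          ## slope on the original scale of x
m_std <- lm(y ~ scale(x))   ## slope per one standard deviation of x
all.equal(unname(coef(m_raw)["x"]),
          get_real(coef(m_std)[2], scale(x)))  ## TRUE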

Standardizing regression coefficients changed significance

tldr; The p-values characterising the statistical significance of parameters in a linear model may change after scaling (standardising) the variables.


As an example, I will work with the mtcars dataset, and regress mpg on disp and drat; or in R's formula language mpg ~ disp + drat.

1. Three linear models

We implement three different (OLS) linear models, the difference being different scaling strategies of the variables.

  1. To start, we don't do any scaling.

    m1 <- lm(mpg ~ disp + drat, data = mtcars)
  2. Next, we scale values using scale which by default does two things: (1) It centers values at 0 by subtracting the mean, and (2) it scales values to have unit variance by dividing the (centered) values by their standard deviation.

    m2 <- lm(mpg ~ disp + drat, data = as.data.frame(scale(mtcars)))

    Note that we can apply scale to the data.frame directly, which will scale values by column. scale returns a matrix so we need to transform the resulting object back to a data.frame.

  3. Finally, we scale values using scale without centering. Note that in this case scale divides the (uncentered) values by their root mean square rather than by their standard deviation (see ?scale); either way, each variable is rescaled by a constant factor.

    m3 <- lm(mpg ~ disp + drat, data = as.data.frame(scale(mtcars, center = F)))

2. Comparison of parameter estimates and statistical significance

Let's inspect the parameter estimates for m1

summary(m1)$coef
#               Estimate  Std. Error   t value     Pr(>|t|)
#(Intercept) 21.84487993 6.747971087  3.237252 3.016655e-03
#disp        -0.03569388 0.006652672 -5.365345 9.191388e-06
#drat         1.80202739 1.542091386  1.168561 2.520974e-01

We get the t values from the ratio of the parameter estimates and standard errors; the p-values then follow from the area under the pdf of the t-distribution with df = nrow(mtcars) - 3 degrees of freedom (as we estimate 3 parameters) where x > |t| (corresponding to a two-sided t-test). So for example, for disp we confirm the t value

summary(m1)$coef["disp", "Estimate"] / summary(m1)$coef["disp", "Std. Error"]
#[1] -5.365345

and the p-value

2 * pt(summary(m1)$coef["disp", "Estimate"] / summary(m1)$coef["disp", "Std. Error"], nrow(mtcars) - 3)
#[1] 9.191388e-06

(This works as written because the t value is negative here; in general, use 2 * pt(-abs(t), df).)

Let's take a look at results from m2:

summary(m2)$coef
#                 Estimate Std. Error       t value     Pr(>|t|)
#(Intercept) -1.306994e-17 0.09479281 -1.378790e-16 1.000000e+00
#disp        -7.340121e-01 0.13680614 -5.365345e+00 9.191388e-06
#drat         1.598663e-01 0.13680614  1.168561e+00 2.520974e-01

Notice how the estimates and standard errors have changed, and that the t value (and hence the p-value) of the intercept is now different from m1: centering forces the fitted intercept to zero (up to floating-point error), so its p-value becomes 1. The t values of the slope coefficients, by contrast, are identical to those of m1.

If however, we don't center values and only scale them to have unit variance

summary(m3)$coef
#              Estimate Std. Error   t value     Pr(>|t|)
#(Intercept) 1.0263872 0.31705513  3.237252 3.016655e-03
#disp       -0.4446985 0.08288348 -5.365345 9.191388e-06
#drat        0.3126834 0.26757994  1.168561 2.520974e-01

we can see that while the estimates and standard errors differ from the (unscaled) results of m1, their respective ratios (i.e. the t values) are identical for all terms, including the intercept. So (default) scale(...) changes the statistical significance of the intercept (through centering), while scale(..., center = FALSE) leaves all p-values unchanged.

It's easy to see why rescaling values by a constant factor does not change the ratio of an OLS parameter estimate to its standard error if you take a look at the closed forms of the OLS parameter estimate and its standard error; see e.g. here.
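As a quick empirical check (the factor of 10 is arbitrary):

## rescale one covariate by a constant: the estimate and its standard
## error change by the same factor, so the t value is unchanged
m1b <- update(m1, data = transform(mtcars, disp = 10 * disp))
all.equal(summary(m1)$coef["disp", "t value"],
          summary(m1b)$coef["disp", "t value"])  ## TRUE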


