How to Unscale the Coefficients from an Lmer()-Model Fitted with a Scaled Response

How to unscale the coefficients from an lmer()-model fitted with a scaled response

Updated: generalized to allow for scaling of the response as well as the predictors.

Here's a fairly crude implementation.

If our original (unscaled) regression is

Y = b0 + b1*x1 + b2*x2 ...

Then our scaled regression is

(Y0-mu0)/s0 = b0' + (b1'*(1/s1*(x1-mu1))) + b2'*(1/s2*(x2-mu2))+ ...

This is equivalent to

Y0 = mu0 + s0((b0'-b1'/s1*mu1-b2'/s2*mu2 + ...) + b1'/s1*x1 + b2'/s2*x2 + ...)

So bi = s0*bi'/si for i>0 and

b0 = s0*b0'+mu0-sum(bi*mui)

Implement this:

 rescale.coefs <- function(beta,mu,sigma) {
    beta2 <- beta ## inherit names etc.
    beta2[-1] <- sigma[1]*beta[-1]/sigma[-1]
    beta2[1]  <- sigma[1]*beta[1]+mu[1]-sum(beta2[-1]*mu[-1])
    beta2
 }

Try it out for a linear model:

m1 <- lm(Illiteracy~.,as.data.frame(state.x77))
b1 <- coef(m1)

Make a scaled version of the data:

ss <- scale(state.x77)

Scaled coefficients:

m1S <- update(m1,data=as.data.frame(ss))
b1S <- coef(m1S)

Now try out rescaling:

icol <- which(colnames(state.x77)=="Illiteracy")
p.order <- c(icol,(1:ncol(state.x77))[-icol])
m <- colMeans(state.x77)[p.order]
s <- apply(state.x77,2,sd)[p.order]
all.equal(b1,rescale.coefs(b1S,m,s))  ## TRUE

This assumes that both the response and the predictors are scaled.

If you scale only the response and not the predictors, then you should submit (c(mean(response),rep(0,...)) for m and c(sd(response),rep(1,...)) for s (i.e., m and s are the values by which the variables were shifted and scaled).
If you scale only the predictors and not the response, then submit c(0,mean(predictors)) for m and c(1,sd(predictors)) for s.

unscale predictor coefficients lmer model fit with an unscaled response

The scale function does a z-transform of the data, which means it takes the original values, subtracts the mean, and then divides by the standard deviation.

to_scale <- 1:10
using_scale <- scale(to_scale, center = TRUE, scale = TRUE)
by_hand <- (to_scale - mean(to_scale))/sd(to_scale)
identical(as.numeric(using_scale), by_hand)
[1] TRUE

Therefore, to reverse the model coefficients all you need to do is multiply the coefficient by the standard deviation of the covariate and add the mean. The scale function holds onto the mean and sd for you. So, if we assume that your covariate values are the using_scale vector for the scale.t6 regression coefficient we can write a function to do the work for us.

get_real <- function(coef, scaled_covariate){

            # collect mean and standard deviation from scaled covariate
            mean_sd <- unlist(attributes(scaled_covariate)[-1])

            # reverse the z-transformation
             answer <- (coef * mean_sd[2]) + mean_sd[1]

            # this value will have a name, remove it
             names(answer) <- NULL

            # return unscaled coef
              return(answer)
}

get_real(-0.3440, using_scale)
[1] 4.458488

In other words, it is the same thing as unscaling your predictor variables because it is a monotonic transformation.

Back-transform coefficients from glmer with scaled independent variables for prediction

The problem with your approach is that it only "unscales" based on the wt variable, whereas you scaled all of the variables in your regression model. One approach that works is to adjust all of the variables in your new (prediction) data frame using the centering/scaling values that were used on the original data frame:

## scale variable x using center/scale attributes
## of variable y
scfun <- function(x,y) {
    scale(x, 
          center=attr(y,"scaled:center"), 
          scale=attr(y,"scaled:scale"))
}
## scale prediction frame
xweight_sc <- transform(xweight,
                        wt = scfun(wt, database$wt),
                        am = scfun(am, database$am))
## predict
p_unsc <- predict(model.1, 
                  newdata=xweight_sc, 
                  type="response", re.form=NA)

Comparing this p_unsc to your predictions from the unscaled model (b in your code), i.e. all.equal(b,p_unsc), gives TRUE.

Another reasonable approach would be to

unscale/uncenter all of your parameters using the "unscaling" approaches presented in one of the linked question (such as this one), generating a coefficient vector beta_unsc
construct the appropriate model matrix from your prediction frame:

X <- model.matrix(formula(model,fixed.only=TRUE), 
         newdata=pred_frame)

compute the linear predictor and back-transform:

pred <- plogis(X %*% beta_unsc)

unscale and uncenter glmer parameters

Read in data:

source("SO_unscale.txt")

Separate unscaled and scaled variables (Transmitter.depth doesn't appear to have a scaled variant)

unsc.vars <- subset(dd,select=c(Transmitter.depth,
                       receiver.depth,water.temperature,
                       wind.speed,distance))
sc.vars <- subset(dd,select=c(Transmitter.depth,
                     Receiver.depth,Water.temperature,
                     Wind.speed,Distance))

I noticed that the means and standard deviations of the scaled variables were not exactly 0/1, perhaps because what's here is a subset of the data. In any case, we will need the means and standard deviations of the original data in order to unscale.

colMeans(sc.vars)
apply(sc.vars,2,sd)
cm <- colMeans(unsc.vars)
csd <- apply(unsc.vars,2,sd)

It is possible to 'unscale' even if the new variables are not exactly centered/scaled (one would just need to enter the actual amount of the shift/scaling done), but it's marginally more complicated, so I'm just going to go ahead and fit with precisely centered/scaled variables.

## changed data name to dd
library(lme4)
cs. <- function(x) scale(x,center=TRUE,scale=TRUE)
m1 <- glmer(Valid.detections ~ Transmitter.depth +
            receiver.depth + water.temperature + 
            wind.speed + distance + (distance | SUR.ID),
            data=dd, family = poisson,
            control=glmerControl(optimizer=c("bobyqa","Nelder_Mead")))
## FAILS with bobyqa alone
m1.sc <- glmer(Valid.detections ~ cs.(Transmitter.depth) +
               cs.(receiver.depth) + cs.(water.temperature) + 
               cs.(wind.speed) + cs.(distance) + (cs.(distance) | SUR.ID),
               data=dd, family = poisson,
               control=glmerControl(optimizer=c("bobyqa","Nelder_Mead")))

An important point is that in this case the very different scaling doesn't seem to do any harm; the scaled and unscaled model get essentially the same goodness of fit (if it were important, we would expect the scaled fit to do better)

logLik(m1)-logLik(m1.sc)  ## 1e-7

Here is the rescaling function given in a previous answer:

rescale.coefs <- function(beta,mu,sigma) {
    beta2 <- beta ## inherit names etc.
    beta2[-1] <- sigma[1]*beta[-1]/sigma[-1]
    beta2[1]  <- sigma[1]*beta[1]+mu[1]-sum(beta2[-1]*mu[-1])
    beta2
}

The parameters do indeed match very closely. (The shifting/scaling vectors include possible scaling/shifting of the response variable, so we start with 0/1 since the response is not scaled [it would rarely make sense to scale a response variable for a GLMM, but this function can be useful for LMMs too].)

(cc <- rescale.coefs(fixef(m1.sc),mu=c(0,cm),sigma=c(1,csd)))
##            (Intercept) cs.(Transmitter.depth)    cs.(receiver.depth) 
##            3.865879406            0.011158402           -0.554392645 
## cs.(water.temperature)        cs.(wind.speed)          cs.(distance) 
##           -0.050833325           -0.042188495           -0.007231021 

fixef(m1)
##  (Intercept) Transmitter.depth    receiver.depth water.temperature 
##  3.865816422       0.011180213      -0.554498582      -0.050830611 
##   wind.speed          distance 
## -0.042179333      -0.007231004

Since they're the same (since the unscaled model does fit OK), we could use either set for this calculation.

ddist <- 1:1000
vals <- cbind(`(Intercept)`=1,Transmitter.depth=0.6067926,
          Receiver.depth=-0.1610828,Water.temperature=-0.1128282,
          Wind.speed=-0.2959290,distance=ddist)
pred.obs <- exp(cc %*% t(vals))
max(ddist[pred.obs>1])

Now suppose you want to do similar scaling/unscaling for a model with interactions or other complexities (i.e. the predictor variables, the columns of the fixed-effect model matrix, are not the same as the input variables, which are the variables that appear in the formula)

m2 <- update(m1,. ~ . + wind.speed:distance)
m2.sc <- update(m1.sc,. ~ . + I(cs.(wind.speed*distance)))
logLik(m2)-logLik(m2.sc)

Calculate mean/sd of model matrix, dropping the first (intercept) value:

X <- getME(m2,"X")                                        
cm2 <- colMeans(X)[-1]
csd2 <- apply(X,2,sd)[-1]                                            
(cc2 <- rescale.coefs(fixef(m2.sc),mu=c(0,cm2),sigma=c(1,csd2)))
all.equal(unname(cc2),unname(fixef(m2)),tol=1e-3)  ## TRUE

You don't actually have to fit the full unscaled model just to get the scaling parameters: you could use model.matrix([formula],data) to derive the model matrix. That is, if you haven't already fitted m2 and you want to get X to get the column means and standard deviations, i.e.

X <- model.matrix(Valid.detections ~ Transmitter.depth + receiver.depth +
                      water.temperature + 
                      wind.speed + distance + 
                      wind.speed:distance,
                  data=dd)

If you have a LMM/have scaled the response variable, you should also multiply all of the standard deviations (including the residual error, sigma(fitted_model)) by the original SD of the response variable.

Scale back linear regression coefficients in R from scaled and centered data

If I understand your description (that is unfortunately at the moment code-free), you are getting standardized regression coefficients for Y ~ As + Bs*Xs where all those "s" items are scaled variables. The coefficients then are the predicted change on a std deviation scale of Y associated with a change in X of one standard deviation of X. The scale function would have recorded the means and standard deviations in attributes for hte scaled object. If not, then you will have those estimates somewhere in your console log. The estimated change in dY for a change dX in X should be: dY*(1/sdY) = Bs*dX*(1/sdX). Predictions should be something along these lines:

Yest = As*(sdX) + Xmn + Bs*(Xs)*(sdX)

You probably should not have needed to standardize the Y values, and I'm hoping that you didn't because it makes dealing with the adjustment for the means of the X's easier. Put some code and example data in if you want implemented and checked answers. I think @DanielGerlance is correct in saying to multiply rather than divide by the SD's.

How to back-transform with a continuous variable

When you print out the model by typing m1, this part:

    Fixed Effects:
   (Intercept)            sexM       BirthDate  
      -0.08661         0.20718         0.43022

tells you the slopes, i.e. how much the result will change based on a change in the inputs. In particular, if you increase BirthDate by one (and keep everything else the same), the predicted activity score will increase by 0.43022.

You do not provide any data so I cannot work directly with your data and your model. Instead, I will illustrate with some data built into R, the iris data.

## Build a linear model
Mod1 = lm(Petal.Length ~ ., data=iris[,1:4])

Now we could just type Mod1, but that gives more than I care to see. We can restrict our attention to the interesting part using

Mod1$coefficients
 (Intercept) Sepal.Length  Sepal.Width  Petal.Width 
  -0.2627112    0.7291384   -0.6460124    1.4467934

This gives the slope for each of the predictor variables (and the intercept).
I want to illustrate how the response Petal.Length varies with the inputs.
I will just take some point and change one predictor and look at the result.

NewPoint = iris[30,1:4]
NewPoint[,1] = NewPoint[,1]+1
iris[30, 1:4]
   Sepal.Length Sepal.Width Petal.Length Petal.Width
30          4.7         3.2          1.6         0.2
NewPoint
   Sepal.Length Sepal.Width Petal.Length Petal.Width
30          5.7         3.2          1.6         0.2

You can see that NewPoint is the same as the original point iris[30,1:4]
except that Sepal.Length has been increased by 1. How does that affect the prediction?

predict(Mod1, newdata=iris[30,1:4])
      30 
1.386358 
predict(Mod1, newdata=NewPoint)
      30 
2.115497 
predict(Mod1, newdata=NewPoint) - predict(Mod1, newdata=iris[30,1:4])
       30 
0.7291384

The difference in the predicted values is 0.7291384, which is the coefficient for Sepal.Length shown above.

How to Unscale the Coefficients from an Lmer()-Model Fitted with a Scaled Response