Perform Operation on Each Imputed Dataset in R's Mice

Perform operation on each imputed dataset in R's MICE

Another option is to calculate the variables before the imputation and place restrictions on them.

library(mice)

# Create the additional variable; it is missing wherever chl is missing
nhanes$extra <- nhanes$chl / 2

# Change the imputation method for extra so that it always equals chl / 2
# Change the predictor matrix so that only chl predicts extra
ini <- mice(nhanes, maxit = 0, printFlag = FALSE)

meth <- ini$method
meth["extra"] <- "~I(chl / 2)"

pred <- ini$predictorMatrix
pred[, "extra"] <- 0      # extra isn't used to predict other variables
pred["extra", ] <- 0
pred["extra", "chl"] <- 1 # only chl predicts extra

# Imputations
imput <- mice(nhanes, seed = 1, predictorMatrix = pred, method = meth,
              printFlag = FALSE)

There are more examples of this approach (passive imputation) in the paper "mice: Multivariate Imputation by Chained Equations in R".
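To check that the restriction actually holds, one option is to loop over the completed datasets and compare extra with chl / 2. This is only a sketch: it works on a copy called dat (a name introduced here, not in the answer) and repeats the setup so that it runs on its own.

```r
library(mice)

# Work on a copy so the packaged nhanes data stay untouched
# (dat is a name chosen for this sketch)
dat <- nhanes
dat$extra <- dat$chl / 2

# Same setup as in the answer: passive imputation for extra
ini <- mice(dat, maxit = 0, printFlag = FALSE)
meth <- ini$method
meth["extra"] <- "~I(chl / 2)"
pred <- ini$predictorMatrix
pred[, "extra"] <- 0
pred["extra", "chl"] <- 1

imput <- mice(dat, seed = 1, predictorMatrix = pred, method = meth,
              printFlag = FALSE)

# For every completed dataset, check that extra equals chl / 2 in all rows
relation_holds <- sapply(seq_len(imput$m), function(i) {
  completed <- mice::complete(imput, i)
  isTRUE(all.equal(as.numeric(completed$extra), completed$chl / 2))
})
relation_holds
```

If any element of relation_holds is FALSE, the method or predictor matrix for extra is probably not set as intended.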

Use lapply function on imputed datasets (MICE)

I am not sure this is exactly what you were trying to do, but here are a few suggestions:

  • tempData is not a data frame (tempData$data is, and it holds the original data with its missing values), so you cannot subset it directly.
  • I use reformulate() to build the formula that is passed to lm().
  • Instead of looping over column values in lapply(), I loop over column names, which also makes it easy to construct the formula.

So try:

variables_subset <- c("Ozone", "Solar.R", "Temp")
lapply(variables_subset, function(x)
  lm(reformulate("Wind", x), data = tempData$data))

#[[1]]

#Call:
#lm(formula = reformulate("Wind", x), data = tempData$data)

#Coefficients:
#(Intercept) Wind
# 99.166 -5.782

#[[2]]

#Call:
#lm(formula = reformulate("Wind", x), data = tempData$data)

#Coefficients:
#(Intercept) Wind
# 189.5896 -0.3649

#[[3]]

#Call:
#lm(formula = reformulate("Wind", x), data = tempData$data)

#Coefficients:
#(Intercept) Wind
# 89.982 -1.142
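Note that tempData$data is the original, incomplete data, so the fits above silently drop the rows with missing values. A minimal sketch of running the same loop on one completed dataset follows. The mice() call is an assumption (the thread does not show how tempData was created), and the names completed_1 and fits_1 are introduced here for illustration.

```r
library(mice)

# Assumption: tempData came from a call like this one (not shown in the thread)
tempData <- mice(airquality, m = 5, seed = 1, printFlag = FALSE)

# The stored data still contain the original missing values
n_missing_original <- sum(is.na(tempData$data$Ozone))

# A completed dataset has the missing values filled in
completed_1 <- mice::complete(tempData, 1)
n_missing_completed <- sum(is.na(completed_1$Ozone))

# Same loop as above, but on the first completed dataset
variables_subset <- c("Ozone", "Solar.R", "Temp")
fits_1 <- lapply(variables_subset, function(x) {
  lm(reformulate("Wind", response = x), data = completed_1)
})
names(fits_1) <- variables_subset
```

Comparing n_missing_original with n_missing_completed shows the difference between the two data frames.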

To get a nested list of models, with one inner list per imputed dataset, you can try:

dat <- mice::complete(tempData, "long", include = TRUE)

# include = TRUE also keeps the original data, stored as .imp == 0
model_list <- lapply(split(dat, dat$.imp), function(x) {
  lapply(variables_subset, function(y)
    lm(reformulate("Wind", y), data = x))
})
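If the goal is a single set of estimates rather than a list of separate fits, mice's with() and pool() may be a simpler route. This is a sketch under assumptions: the mice() call stands in for however tempData was really created, and Ozone ~ Wind is just one of the models from the loop above.

```r
library(mice)

# Assumption: tempData came from a call like this one (not shown in the thread)
tempData <- mice(airquality, m = 5, seed = 1, printFlag = FALSE)

# with() evaluates the expression once per imputed dataset
fit <- with(tempData, lm(Ozone ~ Wind))

# pool() combines the per-dataset estimates using Rubin's rules
pooled <- pool(fit)
summary(pooled)
```

The individual fits remain available in fit$analyses, so nothing is lost compared with the nested list.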

