Linear Models in R with Different Combinations of Variables

Here's one way to get all of the combinations of variables using the combn function. It's a bit messy, and uses a loop (perhaps someone can improve on this with mapply):

vars <- c("price", "model", "size", "year", "color")
N <- 1:4
# lapply() always returns a list; sapply() may simplify to a matrix
COMB <- lapply(N, function(m) combn(x = vars[2:5], m))
COMB2 <- list()
k <- 0
for (i in seq_along(COMB)) {
  tmp <- COMB[[i]]
  for (j in seq_len(ncol(tmp))) {
    k <- k + 1
    # one formula per column of tmp: price ~ var1 + var2 + ...
    COMB2[[k]] <- formula(paste("price", "~", paste(tmp[, j], collapse = " + ")))
  }
}

Then, you can call these formulas and store the model objects using a list or possibly give unique names with the assign function:

# `data` must be a data frame containing the columns named in `vars`
res <- vector(mode = "list", length = length(COMB2))
for (i in seq_along(COMB2)) {
  res[[i]] <- lm(COMB2[[i]], data = data)
}
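The nested loop can be shortened, as the answer invites. The following is only a sketch: the original `data` object is not shown, so `cars_df` is a hypothetical stand-in with made-up numeric columns. It relies on `combn()` accepting a `FUN` argument, so `reformulate()` can build each formula directly.

```r
# Hypothetical toy data standing in for the asker's data frame
set.seed(1)
cars_df <- data.frame(
  price = rnorm(30), model = rnorm(30), size = rnorm(30),
  year = rnorm(30), color = rnorm(30)
)
predictors <- setdiff(names(cars_df), "price")

# For each subset size m, combn() passes every subset to reformulate(),
# which returns price ~ <subset joined by +>
forms <- do.call(c, lapply(seq_along(predictors), function(m) {
  combn(predictors, m, FUN = reformulate, simplify = FALSE, response = "price")
}))

# Fit one model per formula and keep them in a list
fits <- lapply(forms, lm, data = cars_df)
length(fits)
```

With four predictors this produces 4 + 6 + 4 + 1 = 15 models, the same set the loop builds.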

How to run lm models using all possible combinations of several variables and a factor

I suspect the dredge function in the MuMIn package would help you. You specify a "full" model with all parameters you want to include and then run dredge(fullmodel) to get all combinations nested within the full model.

You should then be able to get the coefficients and AIC values from the results of this.

Something like:

library(MuMIn)
data(iris)

# dredge() stops with an error unless na.action is set to "na.fail"
options(na.action = "na.fail")

globalmodel <- lm(Sepal.Length ~ Petal.Length + Petal.Width + Species, data = iris)

combinations <- dredge(globalmodel)

print(combinations)

To get the parameter estimates for all models (a bit messy), you can then use

coefTable(combinations)

Or, to get the coefficients for a particular model, index the result by that model's row number in the dredge object, e.g.

coefTable(combinations)[1]

This returns the coefficients of the model in row 1, including those for factor levels.

See the MuMIn helpfile for more details and ways to extract information.
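For readers without MuMIn installed, the idea can be approximated in base R. This is a rough sketch, not MuMIn's implementation: it fits every subset of the three predictors (plus the intercept-only model) and ranks the fits by plain `AIC()`, whereas `dredge()` ranks by AICc by default. The names `terms_all`, `subsets` and `ranking` are illustrative.

```r
terms_all <- c("Petal.Length", "Petal.Width", "Species")

# All subsets of the predictors, plus the empty subset (intercept only)
subsets <- c(
  list(character(0)),
  do.call(c, lapply(seq_along(terms_all),
                    function(m) combn(terms_all, m, simplify = FALSE)))
)

fits <- lapply(subsets, function(s) {
  rhs <- if (length(s) > 0) s else "1"
  lm(reformulate(rhs, response = "Sepal.Length"), data = iris)
})

# One row per model: formula, number of estimated parameters, AIC
ranking <- data.frame(
  model = vapply(fits, function(f) paste(deparse(formula(f)), collapse = " "),
                 character(1)),
  df    = vapply(fits, function(f) attr(logLik(f), "df"), numeric(1)),
  AIC   = vapply(fits, AIC, numeric(1))
)
ranking <- ranking[order(ranking$AIC), ]
ranking$delta <- ranking$AIC - min(ranking$AIC)
print(ranking, row.names = FALSE)
```

Coefficients for any single model are then available with `coef()` on the corresponding element of `fits`.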

Hope that helps.

Different possible combinations of variables for a generalized linear model

This is called dredging:

library(MuMIn)
data(Cement)
options(na.action = "na.fail")  # required by dredge()
fm1 <- lm(y ~ ., data = Cement)
dd <- dredge(fm1)
dd

Global model call: lm(formula = y ~ ., data = Cement)
---
Model selection table
   (Intrc)    X1      X2      X3      X4 df  logLik  AICc delta weight
4    52.58 1.468  0.6623                  4 -28.156  69.3  0.00  0.566
12   71.65 1.452  0.4161         -0.2365  5 -26.933  72.4  3.13  0.119
8    48.19 1.696  0.6569  0.2500          5 -26.952  72.5  3.16  0.116
10  103.10 1.440                 -0.6140  4 -29.817  72.6  3.32  0.107
14  111.70 1.052         -0.4100 -0.6428  5 -27.310  73.2  3.88  0.081
15  203.60       -0.9234 -1.4480 -1.5570  5 -29.734  78.0  8.73  0.007
16   62.41 1.551  0.5102  0.1019 -0.1441  6 -26.918  79.8 10.52  0.003
13  131.30               -1.2000 -0.7246  4 -35.372  83.7 14.43  0.000
7    72.07        0.7313 -1.0080          4 -40.965  94.9 25.62  0.000
9   117.60                       -0.7382  3 -45.872 100.4 31.10  0.000
3    57.42        0.7891                  3 -46.035 100.7 31.42  0.000
11   94.16        0.3109         -0.4569  4 -45.761 104.5 35.21  0.000
2    81.48 1.869                          3 -48.206 105.1 35.77  0.000
6    72.35 2.312          0.4945          4 -48.005 109.0 39.70  0.000
5   110.20               -1.2560          3 -50.980 110.6 41.31  0.000
1    95.42                                2 -53.168 111.5 42.22  0.000
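The AICc, delta and weight columns can be reproduced from the logLik and df columns. The sketch below copies those two columns from the table and assumes the usual small-sample formula AICc = -2 logLik + 2k + 2k(k+1)/(n - k - 1), with n = 13 observations in the Cement data; the data frame name `sel` is illustrative.

```r
n <- 13  # number of observations in the Cement data (assumed here)

# Model index, df and log-likelihood copied from the selection table
sel <- data.frame(
  model  = c(4, 12, 8, 10, 14, 15, 16, 13, 7, 9, 3, 11, 2, 6, 5, 1),
  df     = c(4, 5, 5, 4, 5, 5, 6, 4, 4, 3, 3, 4, 3, 4, 3, 2),
  logLik = c(-28.156, -26.933, -26.952, -29.817, -27.310, -29.734, -26.918,
             -35.372, -40.965, -45.872, -46.035, -45.761, -48.206, -48.005,
             -50.980, -53.168)
)

k <- sel$df
# Small-sample corrected AIC
sel$AICc <- -2 * sel$logLik + 2 * k + 2 * k * (k + 1) / (n - k - 1)
# Difference from the best (lowest) AICc
sel$delta <- sel$AICc - min(sel$AICc)
# Akaike weights: relative likelihoods normalised to sum to 1
sel$weight <- exp(-sel$delta / 2) / sum(exp(-sel$delta / 2))

print(round(sel, 3))
```

For the top model this gives 56.312 + 8 + 5 = 69.312, which matches the 69.3 shown in the table after rounding.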

Linear models in R with different combinations of data frame variables

If I understand correctly, your formula is wrong: your predictors (independent variables) should be the 7 columns you mention. I'm not sure that "all possible combinations" is exactly what you want; maybe you only want second-order interactions (and no intercept, because of the -1)? In that case, you can probably do the following:

fit <- lm(sprintf("cbind(%s) ~ . ^ 2 - 1",
                  toString(paste("act1", 1:144, sep = "_"))),
          data = DVType)
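To see what `. ^ 2 - 1` expands to, it can help to inspect the terms on a small example. This is a sketch with a hypothetical data frame `toy` (three predictors `a`, `b`, `c`), not the asker's `DVType` data.

```r
set.seed(1)
toy <- data.frame(y = rnorm(10), a = rnorm(10), b = rnorm(10), c = rnorm(10))

# `.` expands to every column not on the left-hand side;
# `^ 2` adds all two-way interactions; `- 1` drops the intercept
tt <- terms(y ~ . ^ 2 - 1, data = toy)
labels_2way <- attr(tt, "term.labels")
has_intercept <- attr(tt, "intercept")
print(labels_2way)
print(has_intercept)

# The sprintf()/toString() idiom builds the multi-response left-hand side
lhs_demo <- sprintf("cbind(%s) ~ . ^ 2 - 1",
                    toString(paste("act1", 1:3, sep = "_")))
print(lhs_demo)
```

The term labels are the three main effects followed by the three pairwise interactions, and the intercept flag is 0.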

How to run all possible combinations in multiple linear regression model in R

Generate example data:

dat <- data.frame(
Y = rnorm(100),
X_1 = rnorm(100),
X_2 = rnorm(100),
X_3 = rnorm(100),
X_4 = rnorm(100),
X_5 = rnorm(100),
X_6 = rnorm(100),
X_7 = rnorm(100)
)

Find all combinations of 1 through 7 variables and paste each into a formula with Y as the dependent variable:

variables <- colnames(dat)[2:ncol(dat)]
formulas <- list()
for (i in seq_along(variables)) {
  tmp <- combn(variables, i)
  tmp <- apply(tmp, 2, paste, collapse = "+")
  tmp <- paste0("Y~", tmp)
  formulas[[i]] <- tmp
}
formulas <- unlist(formulas)
formulas <- sapply(formulas, as.formula)

Estimate all 127 (2^7 - 1) regression models:

models <- lapply(formulas, lm, data=dat)
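Once the models are in a list, a summary statistic can be pulled from each one to compare them. The sketch below regenerates comparable random data so it runs on its own, and uses adjusted R-squared as one possible criterion (an assumption; the answer does not say how the models should be compared). Because the data are pure noise, the "best" model here is meaningless beyond illustrating the mechanics.

```r
set.seed(42)
dat <- as.data.frame(matrix(rnorm(800), ncol = 8))
names(dat) <- c("Y", paste0("X_", 1:7))

variables <- names(dat)[-1]
# Right-hand sides for every non-empty subset of the 7 predictors
rhs <- unlist(lapply(seq_along(variables),
                     function(i) combn(variables, i, paste, collapse = "+")))
formulas <- paste("Y ~", rhs)

models <- lapply(formulas, function(f) lm(as.formula(f), data = dat))

# Adjusted R-squared of each model, then the formula that maximises it
adj_r2 <- vapply(models, function(m) summary(m)$adj.r.squared, numeric(1))
best <- formulas[which.max(adj_r2)]
print(best)
```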

How can I perform and store linear regression models between all continuous variables in a data frame?

Assuming you need pairwise comparisons between all columns of mtcars, you can use the combn() function to find all pairs of columns (m = 2) and fit a linear model in both directions for each pair:

combinations <- combn(colnames(mtcars), 2)

forward <- list()
reverse <- list()

for (i in seq_len(ncol(combinations))) {
  a <- combinations[1, i]
  b <- combinations[2, i]
  forward[[i]] <- lm(formula(paste0(a, "~", b)), data = mtcars)
  reverse[[i]] <- lm(formula(paste0(b, "~", a)), data = mtcars)
}

# named all_models rather than `all`, which would mask base::all()
all_models <- c(forward, reverse)

all_models will be your list with all of the linear models together, with both forward and reverse directions of association between the two variables.

If you want combinations between three variables, you can do combn(colnames(mtcars), 3), and so on.
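A list of fitted models is often easier to work with once it is flattened into a table. The following sketch fits both directions for each pair and collects the slopes and R-squared into one data frame; the names `pair_summary`, `fwd` and `bwd` are illustrative. For simple regression the R-squared is the same in both directions, so it is stored once.

```r
pairs <- combn(colnames(mtcars), 2)

pair_summary <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(i) {
  a <- pairs[1, i]
  b <- pairs[2, i]
  fwd <- lm(reformulate(b, response = a), data = mtcars)  # a ~ b
  bwd <- lm(reformulate(a, response = b), data = mtcars)  # b ~ a
  data.frame(
    var1 = a, var2 = b,
    slope_fwd = unname(coef(fwd)[2]),
    slope_bwd = unname(coef(bwd)[2]),
    r_squared = summary(fwd)$r.squared,
    stringsAsFactors = FALSE
  )
}))

# Strongest pairwise linear associations first
head(pair_summary[order(-pair_summary$r_squared), ])
```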

Run all possible combinations of linear regression with 2 independent variables

As mentioned in the comments under the question, check whether you need y or Y. Having addressed that, we can use any of these. There is no need to rename the columns. We use the built-in mtcars data set as an example since no test data was provided in the question. (Please always provide that in the future.)

1) ExhaustiveSearch: This runs quite fast, so you might be able to try combinations higher than 2 as well.

library(ExhaustiveSearch)
ExhaustiveSearch(mpg ~., mtcars, combsUpTo = 2)

2) combn: Use the lmfun function defined below with combn.

dep <- "mpg"  # name of dependent variable
nms <- setdiff(names(mtcars), dep) # names of indep variables
lmfun <- function(x, dep) do.call("lm", list(reformulate(x, dep), quote(mtcars)))

lms <- combn(nms, 2, lmfun, dep = dep, simplify = FALSE)
names(lms) <- lapply(lms, formula)

3) listcompr: Using lmfun from above and listcompr, we can use the following. Note that we need version 0.1.1 or later of listcompr, which is not yet on CRAN, so we get it from GitHub.

# remotes::install_github("patrickroocks/listcompr")
library(listcompr)
packageVersion("listcompr") # need version 0.1.1 or later

dep <- "mpg" # name of dependent variable
nms <- setdiff(names(mtcars), dep) # names of indep variables

lms2 <- gen.named.list("{nm1}.{nm2}", lmfun(c(nm1, nm2), dep),
                       nm1 = nms, nm2 = nms, nm1 < nm2)
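Whichever approach builds the list, the two-predictor models can then be ranked. The sketch below reuses the combn approach from 2) and sorts the models by `AIC()`; the choice of AIC is an assumption for illustration, since the answer does not prescribe a criterion.

```r
dep <- "mpg"                        # name of dependent variable
nms <- setdiff(names(mtcars), dep)  # names of independent variables
lmfun <- function(x, dep) do.call("lm", list(reformulate(x, dep), quote(mtcars)))

# One model per pair of predictors, named by its formula
lms <- combn(nms, 2, lmfun, dep = dep, simplify = FALSE)
names(lms) <- vapply(lms, function(m) deparse(formula(m)), character(1))

# AIC of every model, lowest (best) first
aic <- sort(vapply(lms, AIC, numeric(1)))
head(aic, 3)
```

The best-ranked fit can be retrieved by name with `lms[[names(aic)[1]]]`.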

Write a function to list all possible combinations of models

I am not aware of any packages that allow one to automate this. So, let's try a brute force approach. The idea is to generate all possible combinations by hand and iterate over them.

vars <- names(mtcars)[-1]

models <- list()

# up to 5 predictors per model; use seq_along(vars) for every subset size
for (i in 1:5) {
  vc <- combn(vars, i)
  for (j in seq_len(ncol(vc))) {
    model <- as.formula(paste0("mpg ~", paste0(vc[, j], collapse = "+")))
    models <- c(models, model)
  }
}

You can use these formulas to run the linear models.

lapply(models, function(x) lm(x, data = mtcars))
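Before fitting, it is worth checking how many models a brute-force search will produce, since the count grows quickly with the number of predictors. A small sketch, using the same `vars` as above (the names `n_up_to_5`, `n_all` and `forms` are illustrative):

```r
vars <- names(mtcars)[-1]  # the 10 candidate predictors

# Number of models with 1 to 5 predictors
n_up_to_5 <- sum(choose(length(vars), 1:5))
# Number of models over every non-empty subset
n_all <- 2^length(vars) - 1
print(c(up_to_5 = n_up_to_5, all_subsets = n_all))

# Cross-check by actually building the formulas for sizes 1 to 5
forms <- do.call(c, lapply(1:5, function(i) {
  combn(vars, i, reformulate, simplify = FALSE, response = "mpg")
}))
length(forms) == n_up_to_5
```

Here the counts are 10 + 45 + 120 + 210 + 252 = 637 models for up to five predictors, and 1023 for all subsets.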

