Linear models in R with different combinations of variables
Here's one way to get all of the combinations of variables using the combn function. It's a bit messy, and uses a loop (perhaps someone can improve on this with mapply):
vars <- c("price", "model", "size", "year", "color")
N <- 1:4
# all combinations of 1 to 4 predictors (every column except price)
COMB <- lapply(N, function(m) combn(x = vars[2:5], m))
COMB2 <- list()
k <- 0
for (i in seq_along(COMB)) {
  tmp <- COMB[[i]]
  for (j in seq_len(ncol(tmp))) {
    k <- k + 1
    COMB2[[k]] <- formula(paste("price", "~", paste(tmp[, j], collapse = " + ")))
  }
}
Then, you can call these formulas and store the model objects using a list, or possibly give them unique names with the assign function:
# `data` stands for your own data frame containing the columns named in vars
res <- vector(mode = "list", length = length(COMB2))
for (i in seq_along(COMB2)) {
  res[[i]] <- lm(COMB2[[i]], data = data)
}
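As a possible tidier variant of the loop above, combn can apply a function to each combination directly, and the fitted models can be kept in a named list instead of using assign. This is only a sketch: the data frame cars_df is simulated here purely for illustration (it is not from the question), and reformulate is used in place of paste to build each formula.

```r
# Hypothetical toy data with the same column names as in the answer above
set.seed(1)
cars_df <- data.frame(
  price = rnorm(50, mean = 20000, sd = 3000),
  model = factor(sample(c("a", "b", "c"), 50, replace = TRUE)),
  size  = rnorm(50),
  year  = sample(2000:2010, 50, replace = TRUE),
  color = factor(sample(c("red", "blue"), 50, replace = TRUE))
)

predictors <- c("model", "size", "year", "color")

# For each subset size m, combn() applies FUN to every combination;
# simplify = FALSE keeps the result as a list of formulas
forms <- do.call(c, lapply(seq_along(predictors), function(m) {
  combn(predictors, m,
        FUN = function(v) reformulate(v, response = "price"),
        simplify = FALSE)
}))

# Fit every model and label each fit with its own formula text
fits <- lapply(forms, lm, data = cars_df)
names(fits) <- vapply(forms, function(f) paste(deparse(f), collapse = " "),
                      character(1))

length(fits)
```

A single model can then be pulled out by name, for example fits[["price ~ size + year"]].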
How to run lm models using all possible combinations of several variables and a factor
I suspect the dredge function in the MuMIn package would help you. You specify a "full" model with all parameters you want to include and then run dredge(fullmodel) to get all combinations nested within the full model.
You should then be able to get the coefficients and AIC values from the results of this.
Something like:
library(MuMIn)
data(iris)
# dredge() refuses to run unless missing values are set to fail
options(na.action = "na.fail")
globalmodel <- lm(Sepal.Length ~ Petal.Length + Petal.Width + Species, data = iris)
combinations <- dredge(globalmodel)
print(combinations)
To get the parameter estimates for all models (a bit messy), you can then use
coefTable(combinations)
or, to get the coefficients for a particular model, index that result using the row number in the dredge object, e.g.
coefTable(combinations)[1]
for the coefficients of the model at row 1. This should also print coefficients for factor levels.
See the MuMIn helpfile for more details and ways to extract information.
Hope that helps.
Different possible combinations of variables for a generalized linear model
This is called dredging:
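library(MuMIn)
data(Cement)
# dredge() requires na.action = "na.fail" so all models use the same rows
options(na.action = "na.fail")
fm1 <- lm(y ~ ., data = Cement)
dd <- dredge(fm1)
dd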
Global model call: lm(formula = y ~ ., data = Cement)
---
Model selection table
   (Intrc)     X1      X2      X3      X4 df  logLik  AICc delta weight
4    52.58  1.468  0.6623                  4 -28.156  69.3  0.00  0.566
12   71.65  1.452  0.4161         -0.2365  5 -26.933  72.4  3.13  0.119
8    48.19  1.696  0.6569  0.2500          5 -26.952  72.5  3.16  0.116
10  103.10  1.440                 -0.6140  4 -29.817  72.6  3.32  0.107
14  111.70  1.052         -0.4100 -0.6428  5 -27.310  73.2  3.88  0.081
15  203.60        -0.9234 -1.4480 -1.5570  5 -29.734  78.0  8.73  0.007
16   62.41  1.551  0.5102  0.1019 -0.1441  6 -26.918  79.8 10.52  0.003
13  131.30                -1.2000 -0.7246  4 -35.372  83.7 14.43  0.000
7    72.07         0.7313 -1.0080          4 -40.965  94.9 25.62  0.000
9   117.60                        -0.7382  3 -45.872 100.4 31.10  0.000
3    57.42         0.7891                  3 -46.035 100.7 31.42  0.000
11   94.16         0.3109         -0.4569  4 -45.761 104.5 35.21  0.000
2    81.48  1.869                          3 -48.206 105.1 35.77  0.000
6    72.35  2.312          0.4945          4 -48.005 109.0 39.70  0.000
5   110.20                 -1.2560         3 -50.980 110.6 41.31  0.000
1    95.42                                 2 -53.168 111.5 42.22  0.000
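The delta and weight columns in the table can be reproduced from the AICc column alone. The sketch below assumes the usual Akaike-weight definition (delta is the difference from the smallest AICc, and weight is exp(-delta / 2) normalised to sum to 1). The AICc values are typed in from the printed table, so they are rounded to one decimal and the results only match the table approximately.

```r
# AICc column copied (rounded) from the model selection table above
aicc <- c(69.3, 72.4, 72.5, 72.6, 73.2, 78.0, 79.8, 83.7,
          94.9, 100.4, 100.7, 104.5, 105.1, 109.0, 110.6, 111.5)

# delta: distance of each model from the best (lowest) AICc
delta <- aicc - min(aicc)

# weight: relative likelihood exp(-delta / 2), normalised to sum to 1
rel_lik <- exp(-delta / 2)
weight  <- rel_lik / sum(rel_lik)

round(data.frame(AICc = aicc, delta = delta, weight = weight), 3)
```

Read this way, the top model carries a bit more than half of the total weight, and the four models within about 4 AICc units of it share most of the rest.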
Linear models in R with different combinations of data frame variables
If I understand correctly, your formula is wrong: your predictors (independents) should be the 7 columns you mention.
I'm not sure if "all possible combinations" is exactly what you want; maybe you only want second-order interactions (and no intercept due to the -1)?
In that case, you can probably do the following (see also this question):
fit <- lm(sprintf("cbind(%s) ~ . ^ 2 - 1",
toString(paste("act1", 1:144, sep = "_"))),
data = DVType)
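To see what the right-hand side . ^ 2 - 1 expands to, the terms of such a formula can be inspected before fitting anything. This is a small sketch on a three-predictor subset of mtcars, used only as a stand-in because the DVType data is not available here.

```r
# Small illustrative subset of mtcars (stand-in for the asker's predictors)
d <- mtcars[, c("mpg", "wt", "hp", "qsec")]

# Expand the dot against the data, then inspect the resulting terms
tt <- terms(mpg ~ .^2 - 1, data = d)

attr(tt, "term.labels")  # main effects plus all pairwise interactions
attr(tt, "intercept")    # 0 means the intercept was removed by "- 1"
```

With three predictors this gives three main effects and three two-way interactions, and no higher-order terms.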
How to run all possible combinations in multiple linear regression model in R
Generate example data:
dat <- data.frame(
Y = rnorm(100),
X_1 = rnorm(100),
X_2 = rnorm(100),
X_3 = rnorm(100),
X_4 = rnorm(100),
X_5 = rnorm(100),
X_6 = rnorm(100),
X_7 = rnorm(100)
)
Find all 1 through 7 combinations of variables and paste them into a formula with Y as dependent variable:
variables <- colnames(dat)[2:ncol(dat)]
formulas <- list()
for (i in seq_along(variables)) {
  tmp <- combn(variables, i)
  tmp <- apply(tmp, 2, paste, collapse="+")
  tmp <- paste0("Y~", tmp)
  formulas[[i]] <- tmp
}
formulas <- unlist(formulas)
formulas <- sapply(formulas, as.formula)
Estimate 127 regression models:
models <- lapply(formulas, lm, data=dat)
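Once every model is fitted, a natural next step is to compare them. The sketch below ranks all-subsets fits by AIC and also reports adjusted R-squared. To keep it self-contained it uses a smaller simulated data set (dat_small, with three predictors and a response that really depends on X_1); those names and the simulated relationship are illustrative assumptions, not part of the answer above.

```r
set.seed(42)
# Hypothetical data: 3 predictors give 2^3 - 1 = 7 candidate models
dat_small <- data.frame(X_1 = rnorm(100), X_2 = rnorm(100), X_3 = rnorm(100))
dat_small$Y <- 2 * dat_small$X_1 + rnorm(100)

predictors <- c("X_1", "X_2", "X_3")

# Build one formula string per non-empty subset of predictors
formula_strings <- unlist(lapply(seq_along(predictors), function(i) {
  combn(predictors, i,
        FUN = function(v) paste("Y ~", paste(v, collapse = " + ")))
}))

fits <- lapply(formula_strings, function(f) lm(as.formula(f), data = dat_small))

# One row per model, sorted so the lowest AIC comes first
ranking <- data.frame(
  formula = formula_strings,
  AIC     = vapply(fits, AIC, numeric(1)),
  adj_r2  = vapply(fits, function(m) summary(m)$adj.r.squared, numeric(1))
)
ranking <- ranking[order(ranking$AIC), ]
ranking
```

The same data frame construction should work unchanged on the 127 models above by passing the formulas and models lists instead.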
How can I perform and store linear regression models between all continuous variables in a data frame?
Assuming you need pairwise comparisons between all columns of mtcars, you can use the combn() function to find all pairs of columns (combinations of size 2), and fit all linear models with:
combinations <- combn(colnames(mtcars), 2)
forward <- list()
reverse <- list()
for (i in seq_len(ncol(combinations))) {
  forward[[i]] <- lm(formula(paste0(combinations[1, i], "~", combinations[2, i])), data = mtcars)
  reverse[[i]] <- lm(formula(paste0(combinations[2, i], "~", combinations[1, i])), data = mtcars)
}
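all_models <- c(forward, reverse)
all_models will be your list with all of the linear models together, with both forward and reverse directions of association between the two variables. (The name all_models is used rather than all to avoid masking the base R function all().)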
If you want combinations between three variables, you can do combn(colnames(mtcars), 3), and so on, adapting the formula construction to decide which variable is the response.
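A list of model objects is awkward to scan, so it can help to collect one summary row per pair. The following sketch stores the slope and R-squared of each pairwise fit in a data frame; the name pair_summary and the choice of statistics are illustrative. Only one direction per pair is fitted here, since for simple regression the R-squared equals the squared correlation and is therefore the same in both directions.

```r
pairs <- combn(colnames(mtcars), 2)

# One row per pair: response, predictor, slope and R-squared
pair_summary <- data.frame(
  y         = pairs[1, ],
  x         = pairs[2, ],
  slope     = NA_real_,
  r_squared = NA_real_
)

for (i in seq_len(ncol(pairs))) {
  fit <- lm(reformulate(pairs[2, i], response = pairs[1, i]), data = mtcars)
  pair_summary$slope[i]     <- coef(fit)[[2]]
  pair_summary$r_squared[i] <- summary(fit)$r.squared
}

# Strongest pairwise associations first
head(pair_summary[order(-pair_summary$r_squared), ])
```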
Run all possible combination of linear regression with 2 independent variables
As mentioned in the comments under the question, check whether you need y or Y. Having addressed that, we can use any of these. There is no need to rename the columns. We use the built-in mtcars data set as an example since no test data was provided in the question. (Please always provide that in the future.)
1) ExhaustiveSearch This runs quite fast so you might be able to try combinations higher than 2 as well.
library(ExhaustiveSearch)
ExhaustiveSearch(mpg ~., mtcars, combsUpTo = 2)
2) combn Use the lmfun function defined below with combn.
dep <- "mpg" # name of dependent variable
nms <- setdiff(names(mtcars), dep) # names of indep variables
lmfun <- function(x, dep) do.call("lm", list(reformulate(x, dep), quote(mtcars)))
lms <- combn(nms, 2, lmfun, dep = dep, simplify = FALSE)
names(lms) <- lapply(lms, formula)
3) listcompr Using lmfun from above and listcompr we can use the following. Note that we need version 0.1.1 or later of listcompr which is not yet on CRAN so we get it from github.
# remotes::install_github("patrickroocks/listcompr")
library(listcompr)
packageVersion("listcompr") # need version 0.1.1 or later
dep <- "mpg" # name of dependent variable
nms <- setdiff(names(mtcars), dep) # names of indep variables
lms2 <- gen.named.list("{nm1}.{nm2}", lmfun(c(nm1, nm2), dep),
nm1 = nms, nm2 = nms, nm1 < nm2)
Write a function to list all possible combinations of models
I am not aware of any packages that allow one to automate this. So, let's try a brute force approach. The idea is to generate all possible combinations by hand and iterate over them.
vars <- names(mtcars)[-1]
models <- list()
for (i in 1:5) {
  vc <- combn(vars, i)
  for (j in seq_len(ncol(vc))) {
    model <- as.formula(paste0("mpg ~", paste0(vc[, j], collapse = "+")))
    models <- c(models, model)
  }
}
You can use these formulas to run the linear models.
lapply(models, function(x) lm(x, data = mtcars))
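Since the question asks for a function, the brute-force idea can also be wrapped up as one. The sketch below takes a different route from the nested loop: it enumerates every non-empty subset with an inclusion mask built by expand.grid, so it is not capped at five predictors. The helper name all_formulas is made up for this example.

```r
# Hypothetical helper: one formula per non-empty subset of `predictors`
all_formulas <- function(response, predictors) {
  # Each row of `mask` flags which predictors belong to that subset
  mask <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), length(predictors))))
  mask <- mask[rowSums(mask) > 0, , drop = FALSE]  # drop the empty subset
  lapply(seq_len(nrow(mask)), function(i) {
    reformulate(predictors[mask[i, ]], response = response)
  })
}

forms <- all_formulas("mpg", c("wt", "hp", "qsec"))
length(forms)  # 2^3 - 1 non-empty subsets

fits <- lapply(forms, lm, data = mtcars)
```

The number of formulas doubles with each extra predictor (2^p - 1 in total), so with all ten mtcars predictors this would already produce 1023 models.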