Extract Regression Coefficient Values

A summary.lm object stores these values in a matrix called 'coefficients'. So the value you are after can be accessed with:

a2Pval <- summary(mg)$coefficients[2, 4]

Or, more generally and readably, coef(summary(mg))["a2", "Pr(>|t|)"]. Indexing by name is preferable because it does not depend on the position of the term in the coefficient matrix.

How to extract the regression coefficient from statsmodels.api?

You can use the params property of a fitted model to get the coefficients.

For example, the following code:

import statsmodels.api as sm
import numpy as np
np.random.seed(1)
X = sm.add_constant(np.arange(100))
y = np.dot(X, [1,2]) + np.random.normal(size=100)
result = sm.OLS(y, X).fit()
print(result.params)

prints a NumPy array, [ 0.89516052 2.00334187], containing the estimates of the intercept and slope, respectively.

For more information, call result.summary(), which returns an object containing three detailed tables that describe the model.
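As a rough sketch of what else the fitted result exposes: when the model is fit through the formula interface on a DataFrame, params, bse and pvalues come back as pandas Series indexed by term name, so individual values can be looked up by name rather than by position. The data here is simulated, and the column names x and y are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data (illustrative only): y = 1 + 2*x + noise
rng = np.random.default_rng(1)
df = pd.DataFrame({"x": np.arange(100)})
df["y"] = 1 + 2 * df["x"] + rng.normal(size=100)

result = smf.ols("y ~ x", data=df).fit()

# With the formula interface, results are pandas Series keyed by term name
slope = result.params["x"]              # coefficient estimate
slope_se = result.bse["x"]              # standard error
slope_p = result.pvalues["x"]           # p-value of the t-test
intercept = result.params["Intercept"]  # intercept estimate

print(slope, slope_se, slope_p)
print(result.rsquared)                  # R-squared of the model
```

If the model is fit on plain NumPy arrays instead, as in the example above, the same attributes exist but are returned as arrays and must be indexed by position.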

How to extract the coefficients of a linear model and store in a variable in R?


df <- mtcars
fit <- lm(mpg~., data = df)

beta_0 = fit$coefficients[1]

#base R approach
coef_base <- coef(fit)
coef_base
#> (Intercept)         cyl        disp          hp        drat          wt
#> 12.30337416 -0.11144048  0.01333524 -0.02148212  0.78711097 -3.71530393
#>        qsec          vs          am        gear        carb
#>  0.82104075  0.31776281  2.52022689  0.65541302 -0.19941925


#tidyverse approach with the broom package
coef_tidy <- broom::tidy(fit)
coef_tidy
#> # A tibble: 11 x 5
#>    term        estimate std.error statistic p.value
#>    <chr>          <dbl>     <dbl>     <dbl>   <dbl>
#>  1 (Intercept)  12.3      18.7        0.657  0.518
#>  2 cyl          -0.111     1.05      -0.107  0.916
#>  3 disp          0.0133    0.0179     0.747  0.463
#>  4 hp           -0.0215    0.0218    -0.987  0.335
#>  5 drat          0.787     1.64       0.481  0.635
#>  6 wt           -3.72      1.89      -1.96   0.0633
#>  7 qsec          0.821     0.731      1.12   0.274
#>  8 vs            0.318     2.10       0.151  0.881
#>  9 am            2.52      2.06       1.23   0.234
#> 10 gear          0.655     1.49       0.439  0.665
#> 11 carb         -0.199     0.829     -0.241  0.812

for (i in coef_base) {
  # do work on i
  print(i)
}
#> [1] 12.30337
#> [1] -0.1114405
#> [1] 0.01333524
#> [1] -0.02148212
#> [1] 0.787111
#> [1] -3.715304
#> [1] 0.8210407
#> [1] 0.3177628
#> [1] 2.520227
#> [1] 0.655413
#> [1] -0.1994193

Pull out p-values and r-squared from a linear regression

r-squared: You can return the r-squared value directly from the summary object summary(fit)$r.squared. See names(summary(fit)) for a list of all the items you can extract directly.

Model p-value: To obtain the p-value of the overall regression model, the following function, taken from a blog post, returns it:

lmp <- function(modelobject) {
  # inherits() is safer than comparing class() to a string,
  # because class() can return more than one value
  if (!inherits(modelobject, "lm")) stop("Not an object of class 'lm' ")
  f <- summary(modelobject)$fstatistic
  p <- pf(f[1], f[2], f[3], lower.tail = FALSE)
  attributes(p) <- NULL
  return(p)
}

> lmp(fit)
[1] 1.622665e-05

In the case of a simple regression with one predictor, the model p-value and the p-value for the coefficient will be the same.
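To see why the two p-values agree, here is a small sketch in plain Python, using made-up data. With a single predictor, the model F-statistic equals the square of the slope's t-statistic, and an F distribution with (1, n - 2) degrees of freedom is the distribution of a squared t with n - 2 degrees of freedom, so the two tests give the same p-value.

```python
import math

# Hypothetical toy data: one predictor x and a response y
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Ordinary least squares estimates for a single predictor
slope = sxy / sxx
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
ess = sum((fi - y_bar) ** 2 for fi in fitted)           # explained sum of squares

df_resid = n - 2
sigma2 = rss / df_resid                    # residual variance estimate
t_slope = slope / math.sqrt(sigma2 / sxx)  # t-statistic of the slope
f_model = (ess / 1) / sigma2               # F-statistic of the model (1 predictor)

print(t_slope ** 2, f_model)  # the two values agree up to rounding error
```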

Coefficient p-values: If you have more than one predictor, then the above will return the model p-value, and the p-value for coefficients can be extracted using:

summary(fit)$coefficients[,4]  

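Alternatively, you can take p-values from the anova(fit) object in a similar fashion, for example anova(fit)$"Pr(>F)". Note that anova() reports sequential F-tests for each term, so with several predictors these are generally not the same as the coefficient t-test p-values from summary().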

Extracting coefficients from a regression in R

You can use names() to list the components that can be extracted from the summary object:

data(mtcars)
fit <- lm(mpg ~ wt, mtcars)

names(summary(fit))
[1] "call" "terms" "residuals" "coefficients" "aliased" "sigma" "df" "r.squared"
[9] "adj.r.squared" "fstatistic" "cov.unscaled"

Then extract the values you need from the coefficients matrix.

Intercept:

summary(fit)$coefficients[1,1]

Slope:

summary(fit)$coefficients[2,1]

Extract regression coefficients out of large list in R

You can get the std error, p-values, etc. with the following modifications:

condlm <- function(i) {
  # Skip columns that contain only NAs
  if (all(is.na(df2012[, i]))) {
    return(NULL)
  }
  lm.model <- lm(df2013[, i] ~ df2012[, i])
  summary(lm.model)
}

# Fit one model per column and keep the full summary of each
lms <- lapply(seq_len(ncol(df2013)), condlm)
lms

However, please note that with the data structured as in your example, there is not enough data to obtain numeric values for the standard errors and related statistics: each model estimates two parameters from two observations, which leaves no residual degrees of freedom.

For example, with your sample data we get the following (partial output):

> lms
[[1]]
NULL

[[2]]

Call:
lm(formula = df2013[, i] ~ df2012[, i])

Residuals:
ALL 2 residuals are 0: no residual degrees of freedom!

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.5455         NA      NA       NA
df2012[, i]   0.1818         NA      NA       NA

Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 1 and 0 DF, p-value: NA
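The zero-degrees-of-freedom situation can be reproduced by hand. A minimal sketch in plain Python, using two made-up points rather than the data from the question: a straight line through two observations fits them exactly, so the residuals are zero and there is nothing left over to estimate the residual variance from.

```python
import math

# Two made-up observations (hypothetical, not the data from the question)
x = [1.0, 3.0]
y = [2.0, 6.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
rss = sum(r ** 2 for r in residuals)
df_resid = n - 2  # observations minus estimated parameters

# The residual variance is RSS / df_resid, which is undefined when df_resid is 0
sigma2 = rss / df_resid if df_resid > 0 else math.nan

print(residuals)  # the fitted line passes through both points
print(df_resid)
print(sigma2)
```

Standard errors, t-values and p-values all depend on that residual variance, which is why R reports them as NA here.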

Extract only variables and coefficients with Signif. less 0.05 in R

I think broom makes this easier:

library(tidyverse)
fit <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am,
          data = mtcars)
coef_tbl <- broom::tidy(fit)  # named coef_tbl to avoid masking coef()
coef_tbl %>% filter(p.value < 0.05)

# or, in base R

subset(coef_tbl, p.value < 0.05)

Using python to extract regression coefficients

Without your code, it's hard to say why you are seeing that behaviour.

Here's a complete sample that works.

import numpy as np
import pandas as pd

import statsmodels.api as sm
import statsmodels.formula.api as smf


df = pd.DataFrame(np.random.randint(100, size=(50,2)))
df.rename(columns={0:'X1', 1:'X2'}, inplace=True)

# GLM Model

model = smf.glm("X2 ~ X1", data=df, family=sm.families.Poisson()).fit()

print(model.summary())
print(model.params)


# Poisson Model

poisson = smf.poisson("X2 ~ X1", data=df).fit()
print(poisson.summary())
print(poisson.params)

Extract lists of p-values for each regression coefficients (1104 linear regressions) with R

Here's a tidyverse solution in several parts, which hopefully makes it easier to read. I used mtcars as a toy dataset, with mpg as the fixed independent variable.

library(dplyr)
library(purrr)
library(broom)
library(tibble)

# first key change is to let `broom::tidy` do the hard work

test2 <- lapply(2:10, function(i) broom::tidy(lm(mtcars[, i] ~ mtcars[, "mpg"])))
names(test2) <- names(mtcars[2:10])
basic_information <-
  map2_df(test2,
          names(test2),
          ~ mutate(.x, which_dependent = .y)) %>%
  mutate(term = ifelse(term == "(Intercept)", "Intercept", "mpg")) %>%
  select(which_dependent, everything())

basic_information
#> # A tibble: 18 x 6
#>    which_dependent term      estimate std.error statistic p.value
#>    <chr>           <chr>     <dbl>    <dbl>     <dbl>     <dbl>
#>  1 cyl             Intercept  11.3    0.593      19.0     2.87e-18
#>  2 cyl             mpg       -0.253   0.0283    -8.92     6.11e-10
#>  3 disp            Intercept  581.    41.7       13.9     1.26e-14
#>  4 disp            mpg       -17.4    1.99      -8.75     9.38e-10
#>  5 hp              Intercept  324.    27.4       11.8     8.25e-13
#>  6 hp              mpg       -8.83    1.31      -6.74     1.79e- 7
#>  7 drat            Intercept  2.38    0.248      9.59     1.20e-10
#>  8 drat            mpg        0.0604  0.0119     5.10     1.78e- 5
#>  9 wt              Intercept  6.05    0.309      19.6     1.20e-18
#> 10 wt              mpg       -0.141   0.0147    -9.56     1.29e-10
#> 11 qsec            Intercept  15.4    1.03       14.9     2.05e-15
#> 12 qsec            mpg        0.124   0.0492     2.53     1.71e- 2
#> 13 vs              Intercept -0.678   0.239     -2.84     8.11e- 3
#> 14 vs              mpg        0.0555  0.0114     4.86     3.42e- 5
#> 15 am              Intercept -0.591   0.253     -2.33     2.64e- 2
#> 16 am              mpg        0.0497  0.0121     4.11     2.85e- 4
#> 17 gear            Intercept  2.51    0.411      6.10     1.05e- 6
#> 18 gear            mpg        0.0588  0.0196     3.00     5.40e- 3

Just to change things up a bit, we'll use map to construct the formulas and fit the models again:

y <- 'mpg'                # the fixed independent variable (right-hand side)
x <- names(mtcars[2:10])  # each dependent variable in turn (left-hand side)

models <- map(setNames(x, x),
              ~ lm(as.formula(paste(.x, y, sep = "~")),
                   data = mtcars))

# R-squared and residual standard error for each model
model_stats <-
  data.frame(rsquared = unlist(map(models, ~ summary(.)$r.squared)),
             RSE = unlist(map(models, ~ summary(.)$sigma))) %>%
  rownames_to_column(var = "which_dependent")

results <- full_join(basic_information, model_stats)

#> Joining, by = "which_dependent"
results
#> # A tibble: 18 x 8
#>    which_dependent term      estimate std.error statistic p.value  rsquared RSE
#>    <chr>           <chr>     <dbl>    <dbl>     <dbl>     <dbl>    <dbl>    <dbl>
#>  1 cyl             Intercept  11.3    0.593      19.0     2.87e-18 0.726    0.950
#>  2 cyl             mpg       -0.253   0.0283    -8.92     6.11e-10 0.726    0.950
#>  3 disp            Intercept  581.    41.7       13.9     1.26e-14 0.718    66.9
#>  4 disp            mpg       -17.4    1.99      -8.75     9.38e-10 0.718    66.9
#>  5 hp              Intercept  324.    27.4       11.8     8.25e-13 0.602    43.9
#>  6 hp              mpg       -8.83    1.31      -6.74     1.79e- 7 0.602    43.9
#>  7 drat            Intercept  2.38    0.248      9.59     1.20e-10 0.464    0.398
#>  8 drat            mpg        0.0604  0.0119     5.10     1.78e- 5 0.464    0.398
#>  9 wt              Intercept  6.05    0.309      19.6     1.20e-18 0.753    0.494
#> 10 wt              mpg       -0.141   0.0147    -9.56     1.29e-10 0.753    0.494
#> 11 qsec            Intercept  15.4    1.03       14.9     2.05e-15 0.175    1.65
#> 12 qsec            mpg        0.124   0.0492     2.53     1.71e- 2 0.175    1.65
#> 13 vs              Intercept -0.678   0.239     -2.84     8.11e- 3 0.441    0.383
#> 14 vs              mpg        0.0555  0.0114     4.86     3.42e- 5 0.441    0.383
#> 15 am              Intercept -0.591   0.253     -2.33     2.64e- 2 0.360    0.406
#> 16 am              mpg        0.0497  0.0121     4.11     2.85e- 4 0.360    0.406
#> 17 gear            Intercept  2.51    0.411      6.10     1.05e- 6 0.231    0.658
#> 18 gear            mpg        0.0588  0.0196     3.00     5.40e- 3 0.231    0.658

