Linear regression with constraints on the coefficients
solve.QP can be passed arbitrary linear constraints, so it can certainly be used to model your constraints a + c >= 0 and c >= 0.
First, we can add a column of 1's to X to capture the intercept term, and then we can replicate standard linear regression with solve.QP:
X2 <- cbind(X, 1)
library(quadprog)
solve.QP(t(X2) %*% X2, t(Y) %*% X2, matrix(0, 3, 0), c())$solution
# [1] 0.08614041 0.21433372 -0.13267403
With the sample data from the question, neither constraint is met using standard linear regression.
By modifying both the Amat and bvec parameters, we can add our two constraints:
solve.QP(t(X2) %*% X2, t(Y) %*% X2, cbind(c(1, 0, 1), c(0, 0, 1)), c(0, 0))$solution
# [1] 0.0000000 0.1422207 0.0000000
Subject to these constraints, the squared residuals are minimized by setting the a and c coefficients to both equal 0.
You can handle missing values in Y or X2 just as the lm function does, by removing the offending observations. You might do something like the following as a pre-processing step:
has.missing <- rowSums(is.na(cbind(Y, X2))) > 0
Y <- Y[!has.missing]
X2 <- X2[!has.missing,]
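If you need the same constrained fit outside R, the quadratic program above can be sketched in Python with scipy.optimize.minimize. This is an illustration on made-up data, not part of the original answer; the two inequality constraints mirror the Amat columns c(1, 0, 1) and c(0, 0, 1) used above.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up data standing in for the X and Y of the question
rng = np.random.default_rng(42)
X = rng.standard_normal((20, 2))
Y = rng.standard_normal(20)
X2 = np.column_stack([X, np.ones(len(X))])  # append intercept column, as in R

# Objective: sum of squared residuals
obj = lambda b: np.sum((Y - X2 @ b) ** 2)

# Inequality constraints in ">= 0" form: b[0] + b[2] >= 0 and b[2] >= 0
cons = [{"type": "ineq", "fun": lambda b: b[0] + b[2]},
        {"type": "ineq", "fun": lambda b: b[2]}]

res = minimize(obj, x0=np.zeros(3), constraints=cons)
print(res.x)
```

With constraints supplied and no method specified, minimize falls back to SLSQP, which handles general linear (and nonlinear) inequality constraints.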
Constrained Linear Regression in Python
Recent versions of scipy include a bounded-variable least-squares solver, scipy.optimize.lsq_linear:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.lsq_linear.html#scipy.optimize.lsq_linear
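A minimal sketch of lsq_linear on synthetic data (the data here are invented for illustration): it solves least squares subject to box constraints on the coefficients.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Synthetic system with one truly negative coefficient
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))
true_coef = np.array([0.5, -0.2, 0.3])
b = A @ true_coef + 0.01 * rng.standard_normal(50)

# Constrain all coefficients to be non-negative;
# the negative one gets clamped toward zero
res = lsq_linear(A, b, bounds=(0, np.inf))
print(res.x)
```

Note that lsq_linear only supports elementwise lower/upper bounds; constraints that couple coefficients (like a + c >= 0 above) need a general QP or minimize-based approach.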
Impose Constraint on Intercept in Linear Regression Using R
First, an unconstrained linear model:
m0 <- lm(wt ~ qsec + hp + disp, data = mtcars)
m0
#
# Call:
# lm(formula = wt ~ qsec + hp + disp, data = mtcars)
#
# Coefficients:
# (Intercept) qsec hp disp
# -2.450047 0.201713 0.003466 0.006755
Force the intercept to be zero.
m1 <- lm(wt ~ qsec + hp + disp - 1, data = mtcars)
m1
#
# Call:
# lm(formula = wt ~ qsec + hp + disp - 1, data = mtcars)
#
# Coefficients:
# qsec hp disp
# 0.0842281 0.0002622 0.0072967
You can use nls to apply limits to the parameters (in this case a lower limit on the intercept):
m1n <- nls(wt ~ a + b1 * qsec + b2 * hp + b3 * disp,
data = mtcars,
start = list(a = 1, b1 = 1, b2 = 1, b3 = 1),
lower = c(0, -Inf, -Inf, -Inf), algorithm = "port")
m1n
# Nonlinear regression model
# model: wt ~ a + b1 * qsec + b2 * hp + b3 * disp
# data: mtcars
# a b1 b2 b3
# 0.0000000 0.0842281 0.0002622 0.0072967
# residual sum-of-squares: 4.926
#
# Algorithm "port", convergence message: relative convergence (4)
See here for other example solutions.
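For a Python counterpart to the nls lower-bound approach, scipy.optimize.curve_fit accepts parameter bounds. This is a hedged sketch on synthetic data loosely shaped like the mtcars columns, not part of the original answer.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic stand-ins for mtcars$qsec, hp, disp, wt
rng = np.random.default_rng(3)
qsec = rng.uniform(14, 23, 32)
hp = rng.uniform(50, 335, 32)
disp = rng.uniform(70, 472, 32)
wt = 0.08 * qsec + 0.0003 * hp + 0.007 * disp + 0.1 * rng.standard_normal(32)

def model(X, a, b1, b2, b3):
    qsec, hp, disp = X
    return a + b1 * qsec + b2 * hp + b3 * disp

# Lower bound of 0 on the intercept a; other parameters unbounded
popt, _ = curve_fit(model, (qsec, hp, disp), wt,
                    bounds=([0, -np.inf, -np.inf, -np.inf],
                            [np.inf, np.inf, np.inf, np.inf]))
print(popt)
```

When bounds are supplied, curve_fit switches to a trust-region solver ('trf'), analogous to nls with algorithm = "port".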
Simple linear regression with constraint
This works nicely:
import numpy as np
from scipy.optimize import minimize
# Define the Model
model = lambda b, X: b[0] * X[:,0] + b[1] * X[:,1] + b[2] * X[:,2]
# The objective Function to minimize (least-squares regression)
obj = lambda b, Y, X: np.sum(np.abs(Y-model(b, X))**2)
# Bounds: b[0], b[1], b[2] >= 0
bnds = [(0, None), (0, None), (0, None)]
# Constraint: b[0] + b[1] + b[2] - 1 = 0
cons = [{"type": "eq", "fun": lambda b: b[0]+b[1]+b[2] - 1}]
# Initial guess for b[0], b[1], b[2]:
xinit = np.array([0, 0, 1])
res = minimize(obj, args=(Y, X), x0=xinit, bounds=bnds, constraints=cons)
print(f"b1={res.x[0]}, b2={res.x[1]}, b3={res.x[2]}")
#Save the coefficients for further analysis on goodness of fit
beta1 = res.x[0]
beta2 = res.x[1]
beta3 = res.x[2]
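As a sketch of the goodness-of-fit follow-up mentioned in the comment above, R-squared can be computed directly from the fitted coefficients. The data and beta here are hypothetical stand-ins for the Y, X, and res.x of the snippet above.

```python
import numpy as np

# Hypothetical data and fitted coefficients (stand-ins for Y, X, res.x)
rng = np.random.default_rng(1)
X = rng.random((30, 3))
beta = np.array([0.2, 0.3, 0.5])  # sums to 1, as the equality constraint requires
Y = X @ beta + 0.05 * rng.standard_normal(30)

# R-squared: 1 minus residual sum of squares over total sum of squares
Y_hat = X @ beta
ss_res = np.sum((Y - Y_hat) ** 2)
ss_tot = np.sum((Y - Y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(r_squared)
```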