Constrained Optimization in R

Constrained optimization in R

Setting up the function was trivial:

fr <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  -(log(x1) + x1^2/x2^2)  # need negative since constrOptim is a minimization routine
}

Setting up the constraint matrix was problematic because the documentation is sparse, so I resorted to experimentation. The help page says "The feasible region is defined by ui %*% theta - ci >= 0", so I checked a candidate starting value against that condition and it seemed to "work":

> rbind(c(-1,-1), c(1,0), c(0,1)) %*% c(0.99, 0.001) - c(-1, 0, 0)
      [,1]
[1,] 0.009
[2,] 0.990
[3,] 0.001

So I put in a row for each constraint/boundary:

constrOptim(c(0.99, 0.001), fr, NULL,
            ui = rbind(c(-1, -1),   # the -x - y >= -1, i.e. x + y <= 1
                       c(1, 0),     # the x >= 0
                       c(0, 1)),    # the y >= 0
            ci = c(-1, 0, 0))       # the thresholds

For this problem there is a potential difficulty: for every value of x the objective goes to Inf as y -> 0. I do get a maximum around x = 0.95 and y = 0 even when I push the starting values out to the "corner", but I'm somewhat suspicious that this is not the true maximum, which I would have guessed was in the "corner".
EDIT:
Pursuing this I reasoned that the gradient might provide additional "direction" and added a gradient function:

grr <- function(x) {  ## gradient of 'fr'
  x1 <- x[1]
  x2 <- x[2]
  c(-(1/x1 + 2 * x1/x2^2),
    2 * x1^2 / x2^3)
}

This did "steer" the optimization a bit closer to the c(.999..., 0) corner, instead of moving away from it, as it did for some starting values. I remain somewhat disappointed that the process seems to "head for the cliff" when the starting values are close to the center of the feasible region:

constrOptim(c(0.99, 0.001), fr, grr,
            ui = rbind(c(-1, -1),   # the -x - y >= -1
                       c(1, 0),     # the x >= 0
                       c(0, 1)),    # the y >= 0
            ci = c(-1, 0, 0))       # the thresholds
$par
[1] 9.900007e-01 -3.542673e-16

$value
[1] -7.80924e+30

$counts
function gradient
    2001       37

$convergence
[1] 11

$message
[1] "Objective function increased at outer iteration 2"

$outer.iterations
[1] 2

$barrier.value
[1] NaN

Note: Hans Werner Borchers posted a better example on R-Help that succeeded in getting the corner values by setting the constraint slightly away from the edge:

> constrOptim(c(0.25, 0.25), fr, NULL,
+             ui = rbind(c(-1, -1), c(1, 0), c(0, 1)),
+             ci = c(-1, 0.0001, 0.0001))
$par
[1] 0.9999 0.0001

Optimization in R with constraint on the sum and type of optimization parameters

1) Define a function proj such that for any input vector x
the output vector y satisfies sum(y) = k. Then we have the following.

Note that this is a relaxation of the original problem in which the integer constraint has been dropped; however, if the solution of the relaxed problem happens to satisfy the integer constraint, then it must be a solution of the original problem as well.

proj <- function(x, k = 3) k * x / sum(x)   # rescale x so its components sum to k

# my_function, q and m are as defined in the question
obj <- function(x, ...) my_function(proj(x), ...)
out <- nlminb(c(1, 1, 1), obj, q = q, m = m, lower = 0)
str(out)
## List of 6
## $ par : num [1:3] 0 0 5.05
## $ objective : num -12.1
## $ convergence: int 0
## $ iterations : int 4
## $ evaluations: Named int [1:2] 5 12
## ..- attr(*, "names")= chr [1:2] "function" "gradient"
## $ message : chr "both X-convergence and relative convergence (5)"

proj(out$par) # solution
## [1] 0 0 3

2) Another approach is to use integer programming. This one does explicitly impose the integer constraint.

library(lpSolve)
res <- lp("min", q-m, t(rep(1, 3)), "=", 3, all.int = TRUE)
str(res)

giving the following (res$solution is the solution).

List of 28
$ direction : int 0
$ x.count : int 3
$ objective : num [1:3] 0.6 -2.36 -4.02
$ const.count : int 1
$ constraints : num [1:5, 1] 1 1 1 3 3
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:5] "" "" "" "const.dir.num" ...
.. ..$ : NULL
$ int.count : int 3
$ int.vec : int [1:3] 1 2 3
$ bin.count : int 0
$ binary.vec : int 0
$ num.bin.solns : int 1
$ objval : num -12.1
$ solution : num [1:3] 0 0 3
$ presolve : int 0
$ compute.sens : int 0
$ sens.coef.from : num 0
$ sens.coef.to : num 0
$ duals : num 0
$ duals.from : num 0
$ duals.to : num 0
$ scale : int 196
$ use.dense : int 0
$ dense.col : int 0
$ dense.val : num 0
$ dense.const.nrow: int 0
$ dense.ctr : num 0
$ use.rw : int 0
$ tmp : chr "Nobody will ever look at this"
$ status : int 0
- attr(*, "class")= chr "lp"

Constrained optimization in R: another example

constrOptim() uses linear inequality constraints and defines the feasible region by ui %*% param - ci >= 0. If the constraint is 3 * d1 + d2 <= 280, ui is c(-3, -1) and ci is -280.


constrOptim(); inequality constraint is: 3 * d1 + d2 <= 280


Fd <- function(betas) {
  b1 <- betas[1]
  b2 <- betas[2]
  224 * b1 + 84 * b2 + b1 * b2 - 2 * b1^2 - b2^2
}

theta <- c(59.999, 100)  # chosen because we need " ui %*% initial_par - ci > 0 "
ui <- c(-3, -1)
ci <- -280               # these ui & ci mean " -3*par[1] - 1*par[2] + 280 >= 0 "

constrOptim(theta, Fd, NULL, ui = ui, ci = ci, control = list(fnscale = -1))
# $par
# [1] 69.00002 72.99993


[Edited]

If you need equality rather than inequality constraints, it is better to use the Rsolnp or alabama package. Both can handle inequality and/or equality constraints (see Constrained Optimization library for equality and inequality constraints).

solnp(); auglag(); equality constraint is: 3 * d1 + d2 = 280

library(Rsolnp); library(alabama)

Fd2 <- function(betas) {  # -1 * Fd, since solnp() and auglag() minimise
  b1 <- betas[1]
  b2 <- betas[2]
  -1 * (224 * b1 + 84 * b2 + b1 * b2 - 2 * b1^2 - b2^2)
}

eqFd <- function(betas) {  # the equality constraint, written so that eqFd(betas) == 0
  b1 <- betas[1]
  b2 <- betas[2]
  3 * b1 + b2 - 280
}

solnp(pars = c(60, 100), fun = Fd2, eqfun = eqFd, eqB = 0)
auglag(par = c(60, 100), fn = Fd2, heq = eqFd)

R optimization with equality and inequality constraints

On this occasion optim obviously will not work because you have equality constraints. constrOptim will not work either, for the same reason (I tried converting the equality into two inequalities, i.e. greater than and less than 15, but this didn't work with constrOptim).

However, there is a package dedicated to this kind of problem: Rsolnp.

You use it the following way:

# Rsolnp provides solnp()
library(Rsolnp)

# specify your function
opt_func <- function(x) {
  10 - 5*x[1] + 2*x[2] - x[3]
}

# specify the equality function. The value 15 (to which the function is equal)
# is supplied as a separate argument (eqB)
equal <- function(x) {
  x[1] + x[2] + x[3]
}

# the optimiser - minimises by default
solnp(c(5, 5, 5),             # starting values (random - obviously need to be positive and sum to 15)
      opt_func,               # function to optimise
      eqfun = equal,          # equality function
      eqB = 15,               # the equality constraint
      LB = c(0, 0, 0),        # lower bound for parameters i.e. greater than zero
      UB = c(100, 100, 100))  # upper bound for parameters (I just chose 100 randomly)

Output:

> solnp(c(5,5,5),
+ opt_func,
+ eqfun=equal,
+ eqB=15,
+ LB=c(0,0,0),
+ UB=c(100,100,100))

Iter: 1 fn: -65.0000 Pars: 14.99999993134 0.00000002235 0.00000004632
Iter: 2 fn: -65.0000 Pars: 14.999999973563 0.000000005745 0.000000020692
solnp--> Completed in 2 iterations
$pars
[1] 1.500000e+01 5.745236e-09 2.069192e-08

$convergence
[1] 0

$values
[1] -10 -65 -65

$lagrange
     [,1]
[1,]   -5

$hessian
          [,1]      [,2]      [,3]
[1,] 121313076 121313076 121313076
[2,] 121313076 121313076 121313076
[3,] 121313076 121313076 121313076

$ineqx0
NULL

$nfuneval
[1] 126

$outer.iter
[1] 2

$elapsed
Time difference of 0.1770101 secs

$vscale
[1] 6.5e+01 1.0e-08 1.0e+00 1.0e+00 1.0e+00

So the resulting optimal values are:

$pars
[1] 1.500000e+01 5.745236e-09 2.069192e-08

which means that the first parameter is 15 and the other two are zero. This is indeed the global minimum of your function: x2 enters with a positive coefficient, and the -5*x1 term pulls the objective down far more strongly than -x3 does, so all of the available 15 should go into x1. Hence c(15, 0, 0) is the solution and the global minimum under the constraints.

The function worked great!

Nonlinear constrained optimization with optimx

1) We can incorporate the constraints within the objective function by returning a large number if any constraint is violated.

For most methods (but not Nelder-Mead) the objective function needs to be continuous and differentiable, and the starting value must lie in the interior of the feasible region, not on the boundary. These requirements are not satisfied for f below, but we will try it anyway.

library(optimx)

f <- function(z, x = z[1], y = z[2]) {
  if (2*x^2 + 3*y^2 <= 100 && x <= 3 && -x <= 0 && -y <= -3) 2*x + 3*y else 1e10
}

optimx(c(0, 3), f, method = c("Nelder", "CG", "L-BFGS-B", "spg", "nlm"))
##             p1 p2 value fevals gevals niter convcode  kkt1  kkt2 xtime
## Nelder-Mead  0  3     9    187     NA    NA        0 FALSE FALSE  0.00
## CG           0  3     9     41      1    NA        0 FALSE FALSE  0.00
## L-BFGS-B     0  3     9     21     21    NA       52 FALSE FALSE  0.00
## spg          0  3     9   1077     NA     1        0 FALSE FALSE  0.05
## nlm          0  3     9     NA     NA     1        0 FALSE FALSE  0.00

1a) This also works with optim, where Nelder-Mead is the default (or you could try constrOptim, which explicitly supports linear inequality constraints; a rough sketch is given after the output below).

optim(c(0, 3), f)
## $par
## [1] 0 3
##
## $value
## [1] 9
##
## $counts
## function gradient
##      187       NA
##
## $convergence
## [1] 0
##
## $message
## NULL

2) Above we notice that the 2x^2 + 3y^2 <= 100 constraint is not active, so we can drop it. Since the objective function is increasing in both x and y independently, it is clear that we want to set both of them to their lower bounds, so c(0, 3) is the answer.

If we want to use optimx anyway, we just use the lower= and upper= arguments for those methods that support them.

f2 <- function(z, x = z[1], y = z[2]) 2*x + 3*y
optimx(c(0, 3), f2, lower = c(0, 3), upper = c(3, Inf),
       method = c("L-BFGS-B", "spg", "nlm"))
##          p1 p2 value fevals gevals niter convcode  kkt1 kkt2 xtime
## L-BFGS-B  0  3     9      1      1    NA        0 FALSE   NA  0.00
## spg       0  3     9      1     NA     0        0 FALSE   NA  0.01
## nlminb    0  3     9      1      2     1        0 FALSE   NA  0.00
## Warning message:
## In BB::spg(par = par, fn = ufn, gr = ugr, lower = lower, upper = upper,  :
##   convergence tolerance satisified at intial parameter values.

