I have to consider optimization problem in simulation study. An instance is given below:
library(mvtnorm)
library(alabama)
n = 200
q = 0.5
X <- matrix(0,
This answer is an ADDENDUM to the first answer, especially targeting your second question about significantly speeding up the whole process.
To make the run time estimation reproducible, we will fix the seed; all other definitions are yours.
set.seed(4789)
n = 200
q = 0.5
X <- mvtnorm::rmvnorm(n = n, mean = c(0,0),
sigma = matrix(c(1,1,1,4), ncol = 2))
x0 = matrix(c(X[1,1:2]), nrow = 1)
y0 = x0 - 0.5 * log(n) * (colMeans(X) - x0)
X = rbind(X, y0)
x01 = y0[1]; x02 = y0[2]
x1 = X[,1]; x2 = X[,2]
pInit = matrix(rep(1/(n + 1), n + 1), nrow = n + 1)
First, let's do it with augmented Lagrangian and optim()
as inner solver.
f1 <- function(p) sum(sqrt(pmax(0, p)))
heq1 <- function(p) c(sum(x1 * p) - x01, sum(x2 * p) - x02, sum(p) - 1)
hin1 <- function(p) p - 1e-06
system.time( sol <- alabama::auglag(pInit, fn = function(p) -f1(p),
heq = heq1, hin = hin1) )
## user system elapsed
## 24.631 0.054 12.324
-1 * sol$value; heq1(sol$par)
## [1] 7.741285
## [1] 1.386921e-09 3.431108e-10 4.793488e-10
This problem is convex with linear constraints. Therefore we can apply an efficient convex solver such as ECOS. For modeling we will make use of the CVXR package.
# install.packages(c("ECOSolveR", "CVXR"))
library(CVXR)
p <- Variable(201)
obj <- Maximize(sum(sqrt(p)))
cons <- list(p >= 0, sum(p) == 1,
sum(x1*p)==x01, sum(x2*p)==x02)
prbl <- Problem(obj, cons)
system.time( sol <- solve(prbl, solver="ECOS") )
## user system elapsed
## 0.044 0.000 0.044
ps <- sol$getValue(p)
cat("The maximum value is:", sum(sqrt(pmax(0, ps))))
## The maximum value is: 7.74226
c(sum(ps), sum(x1*ps) - x01, sum(x2*ps) - x02)
## [1] 1.000000e+00 -1.018896e-11 9.167819e-12
We see that the convex solver is about 500 times faster (!) than the first approach with a standard nonlinear solver. IMPORTANT: We do not need a starting value because a convex problem has only one optimum.
As said before, see Maximizing nonlinear constraints problem using r package nloptr:
You have to prevent the solver to run into an area where your objective function is not defined, here this means p_i >= 0
for each index i
. And if it does, let the objective function return some finite value. Simplifying your function (for q = 0.5
) it looks, for instance, like
f1 <- function(p) sum(sqrt(pmax(0, p)))
Better also to provide an inequality constraint for p_i > 0
as
heq1 <- function(p) c(sum(x1 * p) - x01, sum(x2 * p) - x02, sum(p) - 1)
hin1 <- function(p) p - 1e-06
Now the solver returns a plausible result:
sol <- alabama::auglag(pInit, fn = function(p) -f1(p),
heq = heq1, hin = hin1)
-1 * sol$value
## [1] 11.47805
and the equality conditions are all satisfied:
heq1(sol$par)
## [1] -4.969690e-09 5.906888e-09 1.808652e-08
All this can be done 'programmatically', with a bit of care, naturally.