Performance

Eunseop Kim

All the tests were done on an Arch Linux x86_64 machine with an Intel(R) Core(TM) i7 CPU (1.90GHz). We first load the necessary packages.

library(melt)
library(microbenchmark)
library(ggplot2)

Empirical likelihood computation

We show the performance of computing empirical likelihood with el_mean(). We test the computation speed with simulated data sets in two different settings: 1) the number of observations increases with the number of parameters fixed, and 2) the number of parameters increases with the number of observations fixed.

Increasing the number of observations

We fix the number of parameters at \(p = 10\), and simulate the parameter value and \(n \times p\) matrices using rnorm(). In order to ensure convergence with a large \(n\), we set a large threshold value using el_control().

set.seed(3175775)
p <- 10
par <- rnorm(p, sd = 0.1)
ctrl <- el_control(th = 1e+10)
result <- microbenchmark(
  n1e2 = el_mean(matrix(rnorm(100 * p), ncol = p), par = par, control = ctrl),
  n1e3 = el_mean(matrix(rnorm(1000 * p), ncol = p), par = par, control = ctrl),
  n1e4 = el_mean(matrix(rnorm(10000 * p), ncol = p), par = par, control = ctrl),
  n1e5 = el_mean(matrix(rnorm(100000 * p), ncol = p), par = par, control = ctrl)
)

Below are the results:

result
#> Unit: microseconds
#>  expr        min         lq        mean     median          uq        max neval
#>  n1e2    496.721    600.000    751.9864    661.144    818.4215   1671.521   100
#>  n1e3   1431.147   1764.893   2199.2031   1978.675   2381.7740   3931.695   100
#>  n1e4  14558.944  19130.122  22514.8889  22234.505  24996.2615  35005.867   100
#>  n1e5 300308.387 377692.838 434658.5113 403606.535 497906.4695 792190.949   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result)

Increasing the number of parameters

This time we fix the number of observations at \(n = 1000\), and evaluate empirical likelihood at zero vectors of different sizes.

n <- 1000
result2 <- microbenchmark(
  p5 = el_mean(matrix(rnorm(n * 5), ncol = 5),
    par = rep(0, 5),
    control = ctrl
  ),
  p25 = el_mean(matrix(rnorm(n * 25), ncol = 25),
    par = rep(0, 25),
    control = ctrl
  ),
  p100 = el_mean(matrix(rnorm(n * 100), ncol = 100),
    par = rep(0, 100),
    control = ctrl
  ),
  p400 = el_mean(matrix(rnorm(n * 400), ncol = 400),
    par = rep(0, 400),
    control = ctrl
  )
)
result2
#> Unit: microseconds
#>  expr        min         lq       mean     median         uq       max neval
#>    p5    745.675    887.169   1067.269    936.594   1077.803   2160.35   100
#>   p25   2956.290   3315.738   4075.956   3656.311   4361.149  13528.18   100
#>  p100  24379.049  29547.458  34768.130  33255.675  38310.068  60739.20   100
#>  p400 300339.926 369978.383 425060.884 393062.214 433277.034 863053.36   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result2)

On average, evaluating empirical likelihood with a 100000×10 or 1000×400 matrix at a parameter value satisfying the convex hull constraint takes less than a second.