Augmented Dynamic Adaptive Model

Ivan Svetunkov

2021-04-15

This vignette briefly explains how to use the function adam() and the related auto.adam() from the smooth package. It does not aim to cover all aspects of the function, but focuses on the main ones.

ADAM stands for Augmented Dynamic Adaptive Model. It is a model that underlies ETS, ARIMA and regression, connecting them in a unified framework. The underlying model for ADAM is a Single Source of Error state space model, which is explained in detail in a separate online textbook.

The main philosophy of the adam() function is to be agnostic of the provided data. This means that it will work with ts, msts, zoo, xts, data.frame, numeric and other classes of data. The seasonality of the model is specified via a separate lags parameter, so you are not obliged to transform the existing data into something specific and can use it as is. If you provide a matrix, a data.frame, a data.table or any other multivariate structure, then the function will use the first column as the response variable and the others as explanatory ones (a small sketch of this is shown after loading the packages below). One thing that the function currently assumes is that the data is measured at a regular frequency. If this is not the case, you will need to introduce missing values manually.

In order to run the experiments in this vignette, we need to load the following packages:

require(Mcomp)
require(greybox)
require(smooth)
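
To illustrate the data-agnostic point above, here is a minimal sketch with artificial data in a data.frame (the names and values are made up purely for illustration): the first column is treated as the response, the second as an explanatory variable.

# Artificial data: the first column is the response, the second is an explanatory variable
xregData <- data.frame(y=rnorm(120,100,10), price=rnorm(120,50,5))
testModel <- adam(xregData, "ANN")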

ADAM ETS

First and foremost, ADAM implements the ETS model, although in a more flexible way than (Hyndman et al. 2008): it supports different distributions for the error term, which are regulated via the distribution parameter. By default, the additive error model relies on the Normal distribution, while the multiplicative error one assumes the Inverse Gaussian. If you want to reproduce the classical ETS, you need to specify distribution="dnorm". Here is an example of ADAM ETS(MMM) with the Normal distribution on the N2568 series from the M3 competition (if you provide an Mcomp object, adam() will automatically set the training and test sets, the forecast horizon and even the needed lags):

testModel <- adam(M3[[2568]], "MMM", lags=c(1,12), distribution="dnorm")
summary(testModel)
#> 
#> Model estimated using adam() function: ETS(MMM)
#> Response variable: M3..2568..
#> Distribution used in the estimation: Normal
#> Loss function type: likelihood; Loss function value: 869.8367
#> Coefficients:
#>              Estimate Std. Error Lower 2.5% Upper 97.5%  
#> alpha          0.1092     0.0349     0.0400      0.1782 *
#> beta           0.0288     0.0198     0.0000      0.0680  
#> gamma          0.0022     0.0546     0.0000      0.1102  
#> level       4587.3904   175.1692  4239.8168   4933.8682 *
#> trend          1.0038     0.0019     1.0001      1.0075 *
#> seasonal_1     1.1785     0.0204     1.1526      1.2301 *
#> seasonal_2     0.8163     0.0143     0.7904      0.8679 *
#> seasonal_3     0.8234     0.0144     0.7975      0.8750 *
#> seasonal_4     1.5721     0.0261     1.5461      1.6237 *
#> seasonal_5     0.7448     0.0131     0.7189      0.7964 *
#> seasonal_6     1.2687     0.0219     1.2428      1.3203 *
#> seasonal_7     0.8923     0.0153     0.8664      0.9439 *
#> seasonal_8     0.9121     0.0160     0.8862      0.9637 *
#> seasonal_9     1.2291     0.0225     1.2032      1.2807 *
#> seasonal_10    0.8835     0.0163     0.8575      0.9351 *
#> seasonal_11    0.8383     0.0155     0.8124      0.8899 *
#> 
#> Sample size: 116
#> Number of estimated parameters: 17
#> Number of degrees of freedom: 99
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 1773.674 1779.918 1820.485 1835.327
plot(forecast(testModel,h=18,interval="prediction"))

You might notice that the summary contains more than what is reported by other smooth functions. It also produces standard errors for the estimated parameters, based on the Fisher Information calculation. Note that this is computationally expensive, so if you have a model with more than 30 variables, the calculation of standard errors may take a considerable amount of time. As for the default print() method, it produces a shorter summary of the model, without the standard errors (similar to what es() does):

testModel
#> Time elapsed: 0.12 seconds
#> Model estimated using adam() function: ETS(MMM)
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 869.8367
#> Persistence vector g:
#>  alpha   beta  gamma 
#> 0.1092 0.0288 0.0022 
#> 
#> Sample size: 116
#> Number of estimated parameters: 17
#> Number of degrees of freedom: 99
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 1773.674 1779.918 1820.485 1835.327 
#> 
#> Forecast errors:
#> ME: 586.333; MAE: 797.299; RMSE: 995.959
#> sCE: 144.981%; sMAE: 10.953%; sMSE: 1.872%
#> MASE: 0.325; RMSSE: 0.314; rMAE: 0.352; rRMSE: 0.328

Also, note that prediction intervals in the case of multiplicative error models are approximate. It is advisable to use simulations instead (which is slower, but more accurate):

plot(forecast(testModel,h=18,interval="simulated"))

If you want to do residual diagnostics, then it is recommended to use the plot() function, something like this (you can select which of the plots to produce):

par(mfcol=c(3,4))
plot(testModel,which=c(1:11))
par(mfcol=c(1,1))
plot(testModel,which=12)

By default, ADAM estimates models by maximising the likelihood function. But there is also a loss parameter, which allows selecting from a list of already implemented loss functions (again, see the documentation for adam() for the full list) or using a function written by the user. Here is how to do the latter on the example of another M3 series:

lossFunction <- function(actual, fitted, B){
  return(sum(abs(actual-fitted)^3))
}
testModel <- adam(M3[[1234]], "AAN", silent=FALSE, loss=lossFunction)
testModel
#> Time elapsed: 0.02 seconds
#> Model estimated using adam() function: ETS(AAN)
#> Distribution assumed in the model: Normal
#> Loss function type: custom; Loss function value: 23993012
#> Persistence vector g:
#>  alpha   beta 
#> 0.6316 0.2494 
#> 
#> Sample size: 45
#> Number of estimated parameters: 4
#> Number of degrees of freedom: 41
#> Information criteria are unavailable for the chosen loss & distribution.
#> 
#> Forecast errors:
#> ME: -346.9; MAE: 346.9; RMSE: 395.39
#> sCE: -34.086%; sMAE: 4.261%; sMSE: 0.236%
#> MASE: 4.8; RMSSE: 4.416; rMAE: 3.942; rRMSE: 3.567

Note that the function needs to have the parameters actual, fitted and B, which correspond to the vector of actual values, the vector of fitted values on each iteration and the vector of the optimised parameters respectively.
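
To make the role of B more tangible, here is a hedged, purely illustrative sketch of a loss that also uses the optimised parameters: a sum of absolute errors plus a penalty on the first two elements of B (assumed here to be the smoothing parameters of the ETS(AAN)):

# Illustrative loss: absolute errors plus a penalty on the smoothing parameters
# (assuming that the first two elements of B are alpha and beta)
penalisedLoss <- function(actual, fitted, B){
  return(sum(abs(actual-fitted)) + 100*sum(B[1:2]^2))
}
testModel <- adam(M3[[1234]], "AAN", silent=TRUE, loss=penalisedLoss)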

The loss and distribution parameters are independent, so in the examples above we have assumed that the error term follows the Normal distribution, but we have estimated its parameters using a non-conventional loss, because we can. Some of the distributions assume an additional parameter, which can either be estimated or provided by the user. These include the Asymmetric Laplace (distribution="dalaplace") with alpha, the Generalised Normal and Log Generalised Normal (distribution=c("dgnorm","dlgnorm")) with shape and Student's t (distribution="dt") with nu:

testModel <- adam(M3[[1234]], "MMN", silent=FALSE, distribution="dgnorm", shape=3)

The model selection in ADAM ETS relies on information criteria and works correctly only for loss="likelihood". There are several options for how to select the model; see them in the description of the function (?adam). The default one uses a branch-and-bound algorithm, similar to the one used in es(), but only considers additive trend models (the multiplicative trend ones are less stable and need more attention from a forecaster):

testModel <- adam(M3[[2568]], "ZXZ", lags=c(1,12), silent=FALSE)
#> Forming the pool of models based on... ANN , ANA , MNM , MAM , Estimation progress:    71 %86 %100 %... Done!
testModel
#> Time elapsed: 0.51 seconds
#> Model estimated using adam() function: ETS(MAM)
#> Distribution assumed in the model: Inverse Gaussian
#> Loss function type: likelihood; Loss function value: 866.1911
#> Persistence vector g:
#> alpha  beta gamma 
#> 0.089 0.010 0.000 
#> 
#> Sample size: 116
#> Number of estimated parameters: 17
#> Number of degrees of freedom: 99
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 1766.382 1772.627 1813.193 1828.036 
#> 
#> Forecast errors:
#> ME: 721.028; MAE: 854.919; RMSE: 1093.439
#> sCE: 178.287%; sMAE: 11.744%; sMSE: 2.256%
#> MASE: 0.348; RMSSE: 0.345; rMAE: 0.377; rRMSE: 0.36

Note that the function produces point forecasts if h>0, but it won't generate prediction intervals. This is why you need to use the forecast() method (as shown in the first example in this vignette).

Similarly to es(), the function supports the combination of models, but it saves all the tested models in the output for potential reuse. Here is how it works:

testModel <- adam(M3[[2568]], "CXC", lags=c(1,12))
testForecast <- forecast(testModel,h=18,interval="semiparametric", level=c(0.9,0.95))
testForecast
#>          Point forecast Lower bound (5%) Lower bound (2.5%) Upper bound (95%)
#> Sep 1992      10876.730         9293.571           9024.007          12612.51
#> Oct 1992       7802.241         2561.501           1725.632          13924.59
#> Nov 1992       7413.786         2344.140           1527.390          13289.66
#> Dec 1992      10154.774         4284.003           3360.298          17067.69
#> Jan 1993      10478.467         4591.845           3662.168          17385.83
#> Feb 1993       7229.466         2497.431           1715.201          12594.62
#> Mar 1993       7388.296         2744.065           1971.807          12625.43
#> Apr 1993      13906.034         7742.906           6756.561          21037.45
#> May 1993       6602.249         2667.946           1993.692          10926.78
#> Jun 1993      11362.923         6626.879           5836.821          16667.13
#> Jul 1993       7973.517         4535.692           3941.214          11716.26
#> Aug 1993       8158.772         5425.978           4948.087          11103.85
#> Sep 1993      11049.915         9386.483           9104.914          12881.92
#> Oct 1993       7925.625         2427.700           1542.051          14300.94
#> Nov 1993       7530.313         2192.050           1324.967          13680.45
#> Dec 1993      10314.034         4169.417           3194.764          17508.41
#> Jan 1994      10641.652         4476.378           3495.371          17838.43
#> Feb 1994       7340.869         2337.805           1505.908          12989.27
#>          Upper bound (97.5%)
#> Sep 1992            12983.72
#> Oct 1992            15348.76
#> Nov 1992            14644.33
#> Dec 1992            18686.60
#> Jan 1993            18996.32
#> Feb 1993            13799.01
#> Mar 1993            13792.97
#> Apr 1993            22669.18
#> May 1993            11861.08
#> Jun 1993            17835.74
#> Jul 1993            12513.86
#> Aug 1993            11723.14
#> Sep 1993            13275.77
#> Oct 1993            15772.03
#> Nov 1993            15089.21
#> Dec 1993            19183.34
#> Jan 1994            19507.44
#> Feb 1994            14251.48
plot(testForecast)

Yes, we now support vectors for the levels in case you want to produce several. In fact, we also support the side parameter for prediction intervals, so you can extract specific quantiles without a hassle:

forecast(testModel,h=18,interval="semiparametric", level=c(0.9,0.95,0.99), side="upper")
#>          Point forecast Upper bound (90%) Upper bound (95%) Upper bound (99%)
#> Sep 1992      10876.730         12196.889          12612.51          13428.15
#> Oct 1992       7802.241         12368.966          13924.59          17096.65
#> Nov 1992       7413.786         11806.352          13289.66          16303.05
#> Dec 1992      10154.774         15301.952          17067.69          20676.09
#> Jan 1993      10478.467         15627.027          17385.83          20973.01
#> Feb 1993       7229.466         11265.968          12594.62          15262.97
#> Mar 1993       7388.296         11334.873          12625.43          15209.31
#> Apr 1993      13906.034         19245.429          21037.45          24660.79
#> May 1993       6602.249          9884.937          10926.78          12984.68
#> Jun 1993      11362.923         15370.378          16667.13          19247.72
#> Jul 1993       7973.517         10823.159          11716.26          13469.01
#> Aug 1993       8158.772         10407.742          11103.85          12461.87
#> Sep 1993      11049.915         12441.586          12881.92          13747.96
#> Oct 1993       7925.625         12690.616          14300.94          17573.84
#> Nov 1993       7530.313         12135.296          13680.45          16811.38
#> Dec 1993      10314.034         15678.788          17508.41          21238.72
#> Jan 1994      10641.652         16013.222          17838.43          21553.28
#> Feb 1994       7340.869         11595.189          12989.27          15783.99

A brand new thing in the function is the possibility of using several frequencies (double / triple / quadruple / … seasonal models). In order to show how this works, we will generate an artificial time series, inspired by half-hourly electricity demand, using the sim.gum() function:

ordersGUM <- c(1,1,1)
lagsGUM <- c(1,48,336)
initialGUM1 <- -25381.7
initialGUM2 <- c(23955.09, 24248.75, 24848.54, 25012.63, 24634.14, 24548.22, 24544.63, 24572.77,
                 24498.33, 24250.94, 24545.44, 25005.92, 26164.65, 27038.55, 28262.16, 28619.83,
                 28892.19, 28575.07, 28837.87, 28695.12, 28623.02, 28679.42, 28682.16, 28683.40,
                 28647.97, 28374.42, 28261.56, 28199.69, 28341.69, 28314.12, 28252.46, 28491.20,
                 28647.98, 28761.28, 28560.11, 28059.95, 27719.22, 27530.23, 27315.47, 27028.83,
                 26933.75, 26961.91, 27372.44, 27362.18, 27271.31, 26365.97, 25570.88, 25058.01)
initialGUM3 <- c(23920.16, 23026.43, 22812.23, 23169.52, 23332.56, 23129.27, 22941.20, 22692.40,
                 22607.53, 22427.79, 22227.64, 22580.72, 23871.99, 25758.34, 28092.21, 30220.46,
                 31786.51, 32699.80, 33225.72, 33788.82, 33892.25, 34112.97, 34231.06, 34449.53,
                 34423.61, 34333.93, 34085.28, 33948.46, 33791.81, 33736.17, 33536.61, 33633.48,
                 33798.09, 33918.13, 33871.41, 33403.75, 32706.46, 31929.96, 31400.48, 30798.24,
                 29958.04, 30020.36, 29822.62, 30414.88, 30100.74, 29833.49, 28302.29, 26906.72,
                 26378.64, 25382.11, 25108.30, 25407.07, 25469.06, 25291.89, 25054.11, 24802.21,
                 24681.89, 24366.97, 24134.74, 24304.08, 25253.99, 26950.23, 29080.48, 31076.33,
                 32453.20, 33232.81, 33661.61, 33991.21, 34017.02, 34164.47, 34398.01, 34655.21,
                 34746.83, 34596.60, 34396.54, 34236.31, 34153.32, 34102.62, 33970.92, 34016.13,
                 34237.27, 34430.08, 34379.39, 33944.06, 33154.67, 32418.62, 31781.90, 31208.69,
                 30662.59, 30230.67, 30062.80, 30421.11, 30710.54, 30239.27, 28949.56, 27506.96,
                 26891.75, 25946.24, 25599.88, 25921.47, 26023.51, 25826.29, 25548.72, 25405.78,
                 25210.45, 25046.38, 24759.76, 24957.54, 25815.10, 27568.98, 29765.24, 31728.25,
                 32987.51, 33633.74, 34021.09, 34407.19, 34464.65, 34540.67, 34644.56, 34756.59,
                 34743.81, 34630.05, 34506.39, 34319.61, 34110.96, 33961.19, 33876.04, 33969.95,
                 34220.96, 34444.66, 34474.57, 34018.83, 33307.40, 32718.90, 32115.27, 31663.53,
                 30903.82, 31013.83, 31025.04, 31106.81, 30681.74, 30245.70, 29055.49, 27582.68,
                 26974.67, 25993.83, 25701.93, 25940.87, 26098.63, 25771.85, 25468.41, 25315.74,
                 25131.87, 24913.15, 24641.53, 24807.15, 25760.85, 27386.39, 29570.03, 31634.00,
                 32911.26, 33603.94, 34020.90, 34297.65, 34308.37, 34504.71, 34586.78, 34725.81,
                 34765.47, 34619.92, 34478.54, 34285.00, 34071.90, 33986.48, 33756.85, 33799.37,
                 33987.95, 34047.32, 33924.48, 33580.82, 32905.87, 32293.86, 31670.02, 31092.57,
                 30639.73, 30245.42, 30281.61, 30484.33, 30349.51, 29889.23, 28570.31, 27185.55,
                 26521.85, 25543.84, 25187.82, 25371.59, 25410.07, 25077.67, 24741.93, 24554.62,
                 24427.19, 24127.21, 23887.55, 24028.40, 24981.34, 26652.32, 28808.00, 30847.09,
                 32304.13, 33059.02, 33562.51, 33878.96, 33976.68, 34172.61, 34274.50, 34328.71,
                 34370.12, 34095.69, 33797.46, 33522.96, 33169.94, 32883.32, 32586.24, 32380.84,
                 32425.30, 32532.69, 32444.24, 32132.49, 31582.39, 30926.58, 30347.73, 29518.04,
                 29070.95, 28586.20, 28416.94, 28598.76, 28529.75, 28424.68, 27588.76, 26604.13,
                 26101.63, 25003.82, 24576.66, 24634.66, 24586.21, 24224.92, 23858.42, 23577.32,
                 23272.28, 22772.00, 22215.13, 21987.29, 21948.95, 22310.79, 22853.79, 24226.06,
                 25772.55, 27266.27, 28045.65, 28606.14, 28793.51, 28755.83, 28613.74, 28376.47,
                 27900.76, 27682.75, 27089.10, 26481.80, 26062.94, 25717.46, 25500.27, 25171.05,
                 25223.12, 25634.63, 26306.31, 26822.46, 26787.57, 26571.18, 26405.21, 26148.41,
                 25704.47, 25473.10, 25265.97, 26006.94, 26408.68, 26592.04, 26224.64, 25407.27,
                 25090.35, 23930.21, 23534.13, 23585.75, 23556.93, 23230.25, 22880.24, 22525.52,
                 22236.71, 21715.08, 21051.17, 20689.40, 20099.18, 19939.71, 19722.69, 20421.58,
                 21542.03, 22962.69, 23848.69, 24958.84, 25938.72, 26316.56, 26742.61, 26990.79,
                 27116.94, 27168.78, 26464.41, 25703.23, 25103.56, 24891.27, 24715.27, 24436.51,
                 24327.31, 24473.02, 24893.89, 25304.13, 25591.77, 25653.00, 25897.55, 25859.32,
                 25918.32, 25984.63, 26232.01, 26810.86, 27209.70, 26863.50, 25734.54, 24456.96)
y <- sim.gum(orders=ordersGUM, lags=lagsGUM, nsim=1, frequency=336, obs=3360,
             measurement=rep(1,3), transition=diag(3), persistence=c(0.045,0.162,0.375),
             initial=cbind(initialGUM1,initialGUM2,initialGUM3))$data

We can then apply ADAM to this data:

testModel <- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
                  silent=FALSE, h=336, holdout=TRUE)
testModel
#> Time elapsed: 0.6 seconds
#> Model estimated using adam() function: ETS(MMdM)[48, 336]
#> Distribution assumed in the model: Inverse Gaussian
#> Loss function type: likelihood; Loss function value: 21730.11
#> Persistence vector g:
#>  alpha   beta gamma1 gamma2 
#> 0.9283 0.0022 0.0712 0.0659 
#> Damping parameter: 0.7047
#> Sample size: 3024
#> Number of estimated parameters: 6
#> Number of degrees of freedom: 3018
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 43472.22 43472.25 43508.31 43508.42 
#> 
#> Forecast errors:
#> ME: -497.787; MAE: 805.176; RMSE: 1018.342
#> sCE: -554.379%; sMAE: 2.669%; sMSE: 0.114%
#> MASE: 1.073; RMSSE: 0.993; rMAE: 0.129; rRMSE: 0.133

Note that the more lags you have, the more initial seasonal components the function will need to estimate, which is a difficult task. This is why we used initial="backcasting" in the example above: it speeds up the estimation by reducing the number of parameters to estimate. Still, the optimiser might not get close to the optimal value, so we can help it. First, we can give it more time for the calculation, increasing the number of iterations via maxeval (the default value is 40 iterations for each estimated parameter, i.e. \(40 \times 5 = 200\) in our case):

testModel <- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
                  silent=FALSE, h=336, holdout=TRUE, maxeval=10000)
testModel
#> Time elapsed: 1.77 seconds
#> Model estimated using adam() function: ETS(MMdM)[48, 336]
#> Distribution assumed in the model: Inverse Gaussian
#> Loss function type: likelihood; Loss function value: 19593.19
#> Persistence vector g:
#>  alpha   beta gamma1 gamma2 
#> 0.0257 0.0004 0.1681 0.2394 
#> Damping parameter: 0
#> Sample size: 3024
#> Number of estimated parameters: 6
#> Number of degrees of freedom: 3018
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 39198.37 39198.40 39234.46 39234.57 
#> 
#> Forecast errors:
#> ME: 242.661; MAE: 263.562; RMSE: 315.183
#> sCE: 270.249%; sMAE: 0.874%; sMSE: 0.011%
#> MASE: 0.351; RMSSE: 0.307; rMAE: 0.042; rRMSE: 0.041

This will take more time, but will typically lead to more refined parameters. You can control other parameters of the optimiser as well, such as algorithm, xtol_rel, print_level and others, which are explained in the documentation for the nloptr() function from the nloptr package (run nloptr.print.options() for details); a sketch of passing them is shown after the next example. Second, we can give the optimiser a different set of initial parameters; have a look at what the function saves:

testModel$B

and use this as a starting point for the reestimation (e.g. with a different algorithm):

testModel <- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
                  silent=FALSE, h=336, holdout=TRUE, B=testModel$B)
testModel
#> Time elapsed: 0.59 seconds
#> Model estimated using adam() function: ETS(MMdM)[48, 336]
#> Distribution assumed in the model: Inverse Gaussian
#> Loss function type: likelihood; Loss function value: 19593.19
#> Persistence vector g:
#>  alpha   beta gamma1 gamma2 
#> 0.0257 0.0049 0.1681 0.2393 
#> Damping parameter: 0.0021
#> Sample size: 3024
#> Number of estimated parameters: 6
#> Number of degrees of freedom: 3018
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 39198.37 39198.40 39234.46 39234.57 
#> 
#> Forecast errors:
#> ME: 242.703; MAE: 263.593; RMSE: 315.212
#> sCE: 270.296%; sMAE: 0.874%; sMSE: 0.011%
#> MASE: 0.351; RMSSE: 0.307; rMAE: 0.042; rRMSE: 0.041
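
As mentioned above, the optimiser options can be passed to adam() directly; here is a hedged sketch (the argument names follow nloptr, and the algorithm and tolerance below are arbitrary choices for illustration):

testModel <- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
                  silent=TRUE, h=336, holdout=TRUE, B=testModel$B,
                  algorithm="NLOPT_LN_NELDERMEAD", xtol_rel=1e-8, maxeval=1000)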

If you are ready to wait, you can change the initialisation to initial="optimal", which in our case will take much more time because of the number of estimated parameters: 389 for the chosen model. The estimation process in this case might take 20-30 times longer than in the example above.
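
For reference, a sketch of such a call (not evaluated here because of the estimation time it would require):

testModel <- adam(y, "MMdM", lags=c(1,48,336), initial="optimal",
                  silent=TRUE, h=336, holdout=TRUE)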

In addition, you can specify some parts of the initial state vector or some parts of the persistence vector. Here is an example:

testModel <- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
                  silent=TRUE, h=336, holdout=TRUE, persistence=list(beta=0.1))
testModel
#> Time elapsed: 0.33 seconds
#> Model estimated using adam() function: ETS(MMdM)[48, 336]
#> Distribution assumed in the model: Inverse Gaussian
#> Loss function type: likelihood; Loss function value: 21923.94
#> Persistence vector g:
#>  alpha   beta gamma1 gamma2 
#> 0.9484 0.1000 0.0516 0.0516 
#> Damping parameter: 0.9439
#> Sample size: 3024
#> Number of estimated parameters: 5
#> Number of degrees of freedom: 3019
#> Number of provided parameters: 1
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 43857.88 43857.90 43887.95 43888.03 
#> 
#> Forecast errors:
#> ME: -1234.994; MAE: 1345.022; RMSE: 1587.704
#> sCE: -1375.397%; sMAE: 4.458%; sMSE: 0.277%
#> MASE: 1.793; RMSSE: 1.548; rMAE: 0.215; rRMSE: 0.207

The function also handles intermittent data (data with zeroes) and data with missing values. This is partially covered in the vignette on the oes() function. Here is a simple example:

testModel <- adam(rpois(120,0.5), "MNN", silent=FALSE, h=12, holdout=TRUE,
                  occurrence="odds-ratio")
testModel
#> Time elapsed: 0.04 seconds
#> Model estimated using adam() function: iETS(MNN)
#> Occurrence model type: Odds ratio
#> Distribution assumed in the model: Mixture of Bernoulli and Inverse Gaussian
#> Loss function type: likelihood; Loss function value: -30.2106
#> Persistence vector g:
#> alpha 
#> 1e-04 
#> 
#> Sample size: 108
#> Number of estimated parameters: 5
#> Number of degrees of freedom: 103
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#>  90.8547  91.0855 104.2654  95.4414 
#> 
#> Forecast errors:
#> Bias: -8.428%; sMSE: 23.514%; rRMSE: 0.792; sPIS: 458.632%; sCE: 33.287%

Finally, adam() is faster than the es() function, because its code is more efficient and it uses a different optimisation algorithm with more finely tuned parameters by default. Let’s compare:

adamModel <- adam(M3[[2568]], "CCC")
esModel <- es(M3[[2568]], "CCC")
"adam:"
#> [1] "adam:"
adamModel
#> Time elapsed: 2.33 seconds
#> Model estimated: ETS(CCC)
#> Loss function type: likelihood
#> 
#> Number of models combined: 30
#> Sample size: 116
#> Average number of estimated parameters: 27.616
#> Average number of degrees of freedom: 88.384
#> 
#> Forecast errors:
#> ME: 680.185; MAE: 828.538; RMSE: 1058.764
#> sCE: 168.188%; sMAE: 11.382%; sMSE: 2.115%
#> MASE: 0.337; RMSSE: 0.334; rMAE: 0.366; rRMSE: 0.349
"es():"
#> [1] "es():"
esModel
#> Time elapsed: 3.77 seconds
#> Model estimated: ETS(CCC)
#> Initial values were optimised.
#> 
#> Loss function type: likelihood
#> Error standard deviation: 422.9088
#> Sample size: 116
#> Information criteria:
#> (combined values)
#>      AIC     AICc      BIC     BICc 
#>  98.7499  99.0990 101.3425 102.1491 
#> 
#> Forecast errors:
#> MPE: 4.1%; sCE: 120.9%; Bias: 60.3%; MAPE: 6.9%
#> MASE: 0.299; sMAE: 10.1%; sMSE: 1.6%; rMAE: 0.324; rRMSE: 0.301

ADAM ARIMA

As mentioned above, ADAM does not only contain ETS; it also contains the ARIMA model, which is regulated via the orders parameter. If you want a pure ARIMA, you need to switch off ETS, which is done via model="NNN":

testModel <- adam(M3[[1234]], "NNN", silent=FALSE, orders=c(0,2,2))
testModel
#> Time elapsed: 0.04 seconds
#> Model estimated using adam() function: ARIMA(0,2,2)
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 255.2931
#> ARMA parameters of the model:
#> MA:
#> theta1[1] theta2[1] 
#>   -1.0912    0.3217 
#> 
#> Sample size: 45
#> Number of estimated parameters: 5
#> Number of degrees of freedom: 40
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 520.5863 522.1247 529.6196 532.5478 
#> 
#> Forecast errors:
#> ME: -348.365; MAE: 348.365; RMSE: 396.598
#> sCE: -34.23%; sMAE: 4.279%; sMSE: 0.237%
#> MASE: 4.82; RMSSE: 4.429; rMAE: 3.959; rRMSE: 3.578

Given that both models are implemented in the same framework, they can be compared using information criteria.
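
For instance, one could re-estimate the ETS(AAN) from above with the default likelihood loss and compare the two models via AICc (a sketch; the specific values depend on the estimation):

etsModel <- adam(M3[[1234]], "AAN")
AICc(etsModel)
AICc(testModel)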

The functionality of ADAM ARIMA is similar to that of the msarima() function in the smooth package, although there are several differences.

First, changing the distribution parameter allows switching between additive / multiplicative models. For example, distribution="dlnorm" will create an ARIMA equivalent to the one applied to the logarithms of the data:

testModel <- adam(M3[[2568]], "NNN", silent=FALSE, lags=c(1,12),
                  orders=list(ar=c(1,1),i=c(1,1),ma=c(2,2)), distribution="dlnorm")
testModel
#> Time elapsed: 0.74 seconds
#> Model estimated using adam() function: SARIMA(1,1,2)[1](1,1,2)[12]
#> Distribution assumed in the model: Log Normal
#> Loss function type: likelihood; Loss function value: 871.1066
#> ARMA parameters of the model:
#> AR:
#>  phi1[1] phi1[12] 
#>  -0.5397   0.0428 
#> MA:
#>  theta1[1]  theta2[1] theta1[12] theta2[12] 
#>    -0.3543    -0.4905    -0.4019    -0.2538 
#> 
#> Sample size: 116
#> Number of estimated parameters: 33
#> Number of degrees of freedom: 83
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 1808.213 1835.579 1899.082 1964.125 
#> 
#> Forecast errors:
#> ME: 235.386; MAE: 559.775; RMSE: 677.465
#> sCE: 58.203%; sMAE: 7.69%; sMSE: 0.866%
#> MASE: 0.228; RMSSE: 0.214; rMAE: 0.247; rRMSE: 0.223

Second, if you want a model with an intercept / drift, you can do that using the constant parameter:

testModel <- adam(M3[[2568]], "NNN", silent=FALSE, lags=c(1,12), constant=TRUE,
                  orders=list(ar=c(1,1),i=c(1,1),ma=c(2,2)), distribution="dnorm")
testModel
#> Time elapsed: 0.67 seconds
#> Model estimated using adam() function: SARIMA(1,1,2)[1](1,1,2)[12] with drift
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 897.5604
#> ARMA parameters of the model:
#> AR:
#>  phi1[1] phi1[12] 
#>  -0.5192   0.0430 
#> MA:
#>  theta1[1]  theta2[1] theta1[12] theta2[12] 
#>    -0.6815    -0.0614    -0.3024     0.1389 
#> 
#> Sample size: 116
#> Number of estimated parameters: 34
#> Number of degrees of freedom: 82
#> Information criteria:
#>      AIC     AICc      BIC     BICc 
#> 1863.121 1892.503 1956.743 2026.580 
#> 
#> Forecast errors:
#> ME: 265.995; MAE: 620.494; RMSE: 715.724
#> sCE: 65.772%; sMAE: 8.524%; sMSE: 0.967%
#> MASE: 0.253; RMSSE: 0.226; rMAE: 0.274; rRMSE: 0.236