We currently analyze and forecast rodent data at Portal using four models:

## ESSS

ESSS (Exponential Smoothing State Space) is a flexible exponential smoothing state space model (Hyndman et al. 2008) fit to the data at the composite (full site and just control plots) spatial level and the composite (community) ecological level. The model is selected and fitted using the ets and forecast functions in the forecast package (Hyndman 2017) with the allow.multiplicative.trend argument set to TRUE and the forecast_ess function in our portalPredictions package (WeecologyLab 2018). Models fit using ets implement what is known as the “innovations” approach to state space modeling, which assumes a single source of noise that is equivalent for the process and observation errors (Hyndman et al. 2008).

In general, ESSS models are defined according to three model structure parameters: error type, trend type, and seasonality type (Hyndman et al. 2008). Each of the parameters can be an N (none), A (additive), or M (multiplicative) state (Hyndman et al. 2008). However, because of the difference in period between seasonality and sampling of the Portal rodents combined with the hard-coded single period of the ets function, we could not include the seasonal components to the ESSS model. ESSS is fit flexibly, such that the model parameters can vary from fit to fit.

## AutoARIMA

AutoArima (Automatic Auto-Regressive Integrated Moving Average) is a flexible Auto-Regressive Integrated Moving Average (ARIMA) model fit to the data at the composite (full site and just control plots) spatial level and the composite (community) ecological level. The model is selected and fitted using the auto.arima and forecast functions in the forecast package (Hyndman and Athanasopoulos 2013; Hyndman 2017) and the forecast_autoarima function in our portalPredictions package (WeecologyLab 2018).

Generally, ARIMA models are defined according to three model structure parameters: the number of autoregressive terms (p), the degree of differencing (d), and the order of the moving average (q), and are represented as ARIMA(p, d, q) (Box and Jenkins 1970). While the auto.arima function allows for seasonal models, the seasonality is hard-coded to be on the same period as the sampling, which is not the case for the Portal rodent surveys. As a result, no seasonal models were evaluated. AutoArima is fit flexibly, such that the model parameters can vary from fit to fit.

## nbGARCH

nbGARCH (Negative Binomial Auto-Regressive Conditional Heteroskedasticity) is a generalized autoregressive conditional heteroskedasticity (GARCH) model with overdispersion (i.e., a negative binomial response) fit to the data at the composite (full site and just control plots) spatial level and both the composite (community) and the articulated (species) ecological levels. The model for each species and the community total is selected and fitted using the tsglm function in the tscount package (Liboschik et al. 2017) and the forecast_nbgarch function in our portalPredictions package (WeecologyLab 2018)

GARCH models are generalized ARMA models and are defined according to their link function, response distribution, and two model structure parameters: the number of autoregressive terms (p) and the order of the moving average (q), and are represented as GARCH(p, q) (Liboschik et al. 2017). The nbGARCH model is fit using the log link and a negative binomial response (modeled as an over-dispersed Poisson), as well as with p = 1 (first-order autoregression) and q = 12 (approximately yearly moving average).

The tsglm function in the tscount package (Liboschik et al. 2017) uses a (conditional) quasi-likelihood based approach to inference and models the overdispersion as an additional parameter in a two-step approach. This two-stage approach has only been minimally evaluated, although preliminary simulation-based studies are promising (Liboschik, Fokianos, and Fried 2017).

## pevGARCH

pevGARCH (Poisson Environmental Variable Auto-Regressive Conditional Heteroskedasticity) is a generalized autoregressive conditional heteroskedasticity (GARCH) model fit to the data at the composite (full site and just control plots) spatial level and both the composite (community) and the articulated (species) ecological levels. The response variable is Poisson, and a variety of environmental variables are considered as covariates. The model for each species is selected and fitted using the tsglm function in the tscount package (Liboschik et al. 2017) and the forecast_pevgarch function in our portalPredictions package (WeecologyLab 2018).

GARCH models are generalized ARMA models and are defined according to their link function, response distribution, and two model structure parameters: the number of autoregressive terms (p) and the order of the moving average (q), and are represented as GARCH(p, q) (Liboschik et al. 2017). The pevGARCH model is fit using the log link and a Poisson response, as well as with p = 1 (first-order autoregression) and q = 12 (yearly moving average). The environmental variables potentially included in the model are min, mean, and max temperatures, precipitation, and NDVI.

The tsglm function in the tscount package (Liboschik et al. 2017) uses a (conditional) quasi-likelihood based approach to inference. This approach has only been minimally evaluated for models with covariates, although preliminary simulation-based studies are promising (Liboschik, Fokianos, and Fried 2017).

Each species is fit using the following (nonexhaustive) sets of the environmental covariates:

• max temp, mean temp, precipitation, NDVI
• max temp, min temp, precipitation, NDVI
• max temp, mean temp, min temp, precipitation
• precipitation, NDVI
• min temp, NDVI
• min temp
• max temp
• mean temp
• precipitation
• NDVI
• -none-

The final model is an intercept-only model. The single best model of the 11 is selected based on AIC.

## Ensemble

In addition to the base models, we include an ensemble, constructed using the base model AIC weights. The ensemble mean is calculated as the AIC-weighted mean of all model means. The ensemble variance is estimated as the sum of the AIC-weighted mean of all model variances and the variance of the estimated AIC-weighted mean, calculated using the unbiased estimate of sample variances.

# References

Box, G., and G. Jenkins. 1970. Time Series Analysis: Forecasting and Control. Holden-Day.

Hyndman, R. J. 2017. “forecast: Forecasting Functions for Time Series and Linear Models.” http://pkg.robjhyndman.com/forecast.

Hyndman, R. J., and G. Athanasopoulos. 2013. Forecasting: Principles and Practice. OTexts.

Hyndman, R. J., A. b. Koehler, J. K. Ord, and R. D. Snyder. 2008. Forecasting with Exponential Smoothing: The State Space Approach. Springer-Verlag.

Liboschik, T., K. Fokianos, and R. Fried. 2017. “tscount: An R Package for Analysis of Count Time Series Following Generalized Linear Models.” Journal of Statistical Software 82: 1–51. https://www.jstatsoft.org/article/view/v082i05.

Liboschik, T., R. Fried, K. Fokianos, and P. Probst. 2017. “tscount: Analysis of Count Time Series.” https://CRAN.R-project.org/package=tscount.

WeecologyLab. 2018. “Portal Forecasting.” https://github.com/weecology/portalPredictions/.