A Note on the Invertibility of Nonlinear ARMA models

Kung-Sik Chan Department of Statistics & Actuarial Science University of Iowa, Iowa City, IA 52242, U.S.A.
Email: [email protected]
Howell Tong Department of Statistics London School of Economics, London WC2A 2AE, U.K. Email: [email protected]
April, 2009
Abstract We review the concepts of local and global invertibility for a nonlinear auto-regressive moving-average (NLARMA) model. Under very general conditions, a local invertibility analysis of a NLARMA model admits the generic dichotomy that the innovation reconstruction errors either diminish geometrically fast or grow geometrically fast. We derive a simple sufficient condition for a NLARMA model to be locally invertible. The invertibility of the polynomial MA models is revisited. Moreover, we show that the Threshold MA models may be globally invertible even though some component MA models are non-invertible. One novelty of our approach is its cross-fertilization with dynamical systems.
Keywords: Attractor, Dynamical system, Nonlinear time series, Polynomial MA model, Subadditive ergodic theory, Threshold MA model.

1 Introduction
Despite the growing literature on nonlinear time series analysis (Priestley, 1988, Tong, 1990, Franses and van Dijk, 2000, Chan and Tong, 2001, Fan and Yao, 2003, Small, 2005, and Gao, 2007), the general framework makes use of nonlinear autoregressive models. In contrast, nonlinear moving-average (NLMA) models are relatively under-explored. Their slow development, both empirical and theoretical, is due in part to the difficulty of establishing the invertibility of an NLMA model.
Here, we focus on nonlinear autoregressive moving-average (NLARMA) models, and discuss two concepts of invertibility for these models. We illustrate these concepts with the polynomial MA models and the threshold MA models.

2 Nonlinear Auto-Regressive Moving-average Models

The linear moving-average (MA) model of order $q$ is characterized by the feature that it has memory of $q$ lags. Recall that an MA($q$) process $\{Y_t\}$ is defined by the equation

$$Y_t = \mu + \varepsilon_t - \sum_{i=1}^{q} \theta_i \varepsilon_{t-i},$$

where $\mu$ is the mean of $Y_t$ and the innovations $\{\varepsilon_t\}$ are white noise, i.e. uncorrelated random variables of zero mean and finite (identical) variance $\sigma^2$. For simplicity, we shall assume $\mu = 0$. It is well known that the autocorrelation function (ACF) of an MA($q$) process has a cut-off after lag $q$, i.e. $\mathrm{corr}(Y_t, Y_{t-\ell}) = 0$ for $\ell > q$. In other words, the process is uncorrelated with its $(q+1)$-th or higher lags. If the innovations are, furthermore, jointly independent, then the process is independent of its $(q+1)$-th or higher lags, in which case the process is said to have finite memory. The natural question of developing nonlinear models with finite memory has received some attention in the literature.

Here, we review some nonlinear time-series models of finite memory. Henceforth, we shall assume that $\{\varepsilon_t\}$ is an independent and identically distributed sequence of random variables with zero mean and finite variance. To begin with, we note that any model of the following form is of finite memory:

$$Y_t = \varepsilon_t + h(\varepsilon_{t-1}, \ldots, \varepsilon_{t-q}; \theta), \tag{1}$$


where $h(\cdot; \theta)$ is a known function for known parameter $\theta$. The linear MA($q$) model is obtained by setting $h$ to be a linear function. Similarly, a nonlinear finite-memory model, also known as a nonlinear moving-average (NLMA) model, can be obtained by setting $h$ to be a parametric nonlinear function. Clearly, any nonlinear moving-average model is stationary. However, as in the case of linear moving-average models, the issue of invertibility is pivotal. Invertibility refers to the feasibility of reconstructing the innovations from the observations, assuming that the true model is known. Given the parameter $\theta$, Eqn. (1) can be inverted to define the residuals

$$\hat{\varepsilon}_t = Y_t - h(\hat{\varepsilon}_{t-1}, \ldots, \hat{\varepsilon}_{t-q}; \theta), \tag{2}$$


where the initial values are generally set as $\hat{\varepsilon}_{1-k} = 0$, the mean of the innovations, for $k = 1, \ldots, q$. The polynomial moving-average model (Robinson, 1977) is obtained by letting $h$ be a polynomial. For example,

$$Y_t = \varepsilon_t + \beta \varepsilon_{t-1}^2$$

is a simple quadratic MA(1) model. However, it has been noted that polynomial MA models are generally non-invertible (e.g. Granger and Andersen, 1978b), which makes them unsuitable for prediction and also makes model diagnostics hard to carry out. We shall elaborate on the concept of invertibility in the following sections.
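As a minimal sketch of the residual recursion (2), the following Python function rebuilds the innovations from the observations for a known $h$. The function name `reconstruct_innovations` and the convention that `h` receives the $q$ most recent residuals (most recent first) are assumptions made for illustration; they do not come from the paper.

```python
def reconstruct_innovations(y, h, q):
    """Residuals from recursion (2), with the q pre-sample residuals set to 0.

    y : observed values Y_1, ..., Y_n
    h : callable taking [e_{t-1}, ..., e_{t-q}] (most recent first)
    q : moving-average order
    """
    eps_hat = [0.0] * q  # initial values: the mean of the innovations
    for y_t in y:
        lags = eps_hat[-1:-q - 1:-1]  # last q residuals, most recent first
        eps_hat.append(y_t - h(lags))
    return eps_hat[q:]


# Linear MA(1) with theta = 0.5, i.e. h(e_{t-1}) = -0.5 * e_{t-1}.
theta = 0.5
eps = [1.0, 2.0, -1.0, 0.5]  # true innovations; the pre-sample innovation is 0
y = [eps[0]] + [eps[t] - theta * eps[t - 1] for t in range(1, len(eps))]
recovered = reconstruct_innovations(y, lambda lags: -theta * lags[0], q=1)
```

In this example the pre-sample innovation equals the assumed initial value 0, so the recovery is exact. With a wrong initial value, the error for this linear model would shrink by a factor of 0.5 at each step.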
Several interesting mixed nonlinear ARMA (in short NLARMA) models that may be invertible have been proposed in the literature. An NLARMA(p, q) model is defined by a stochastic difference equation of the following form:

$$Y_t = \varepsilon_t + h(Y_{t-1}, \ldots, Y_{t-p}, \varepsilon_{t-1}, \ldots, \varepsilon_{t-q}; \theta). \tag{3}$$


A sub-class of the NLARMA models belongs to the family of bilinear models (Granger and Andersen, 1978a, Subba Rao, 1981, Guegan and Pham Dinh, 1987, Priestley, 1988, Tong, 1990); they are linear in both past lags of the process and past lags of the innovation, e.g.

$$Y_t = \varepsilon_t + \theta Y_{t-1} \varepsilon_{t-1}$$


is a simple bilinear model where θ is a parameter. Some results on the invertibility of sub-classes of bilinear models have been derived and surveyed in the aforementioned works.
Recently, Ling and Tong (2005) re-visited the Threshold MA (TMA) model, which specifies that the process switches from one linear MA model to another linear MA model whenever some lag of the process exceeds one of the threshold values. A simple TMA model of order one and with one threshold takes the following form:

$$Y_t = \begin{cases} \varepsilon_t - \theta_{1,1}\varepsilon_{t-1} & \text{if } Y_{t-d} \le r, \\ \varepsilon_t - \theta_{2,1}\varepsilon_{t-1} & \text{otherwise,} \end{cases}$$

where the $\theta$'s and $r$ are parameters; $d$ is a positive integer parameter known as the delay parameter. A more general TMA model will be considered later. It is straightforward to generalize the TMA model to the Threshold ARMA (TARMA) model by replacing the linear MA sub-models with linear ARMA sub-models. Ling and Tong (2005) gave some sufficient conditions for the invertibility of a TMA model of order one and with multiple thresholds. For example, they considered the following model:


$$Y_t = \Big\{\phi_0 + \sum_{j=1}^{k} \psi_j I(r_{j-1} < Y_{t-1} \le r_j)\Big\}\varepsilon_{t-1} + \varepsilon_t,$$



where I(A) is the indicator function of the event A, and −∞ = r0 < r1 < · · · < rk = ∞ are the k thresholds. Let Fy(·) denote the cumulative distribution function of Y . They established the following theorem, which gives an almost necessary and sufficient condition:

THEOREM 1. $\{Y_t\}$ is invertible if

$$\prod_{j=1}^{k} |\phi_0 + \psi_j|^{F_y(r_j) - F_y(r_{j-1})} < 1,$$

and is not invertible if

$$\prod_{j=1}^{k} |\phi_0 + \psi_j|^{F_y(r_j) - F_y(r_{j-1})} > 1.$$

Note that the case with $\prod_{j=1}^{k} |\phi_0 + \psi_j|^{F_y(r_j) - F_y(r_{j-1})} = 1$ is undecided but they conjectured non-invertibility. Note also that the MA coefficients of intermediate linear MA sub-models also feature in the invertibility condition. However, for TMA models of higher order, they were only able to give some rather restrictive sufficient conditions.
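The product in Theorem 1 is easy to evaluate once the regime probabilities $F_y(r_j) - F_y(r_{j-1})$ are available. The sketch below assumes these probabilities are supplied directly; in practice $F_y$ depends on the model itself and would have to be computed or estimated. The function name and the numbers in the example are hypothetical.

```python
import math


def tma1_invertibility_index(phi0, psi, regime_probs):
    """Product over j of |phi0 + psi_j| ** p_j, with p_j = F_y(r_j) - F_y(r_{j-1})."""
    if len(psi) != len(regime_probs):
        raise ValueError("psi and regime_probs must have the same length")
    log_index = 0.0
    for psi_j, p_j in zip(psi, regime_probs):
        coef = abs(phi0 + psi_j)
        if p_j == 0.0:
            continue  # a regime that is never visited contributes a factor of 1
        if coef == 0.0:
            return 0.0  # a visited regime with zero coefficient makes the product 0
        log_index += p_j * math.log(coef)
    return math.exp(log_index)


# Hypothetical two-regime example: coefficients 2.0 and 0.25,
# with each regime visited half of the time.
index = tma1_invertibility_index(phi0=0.0, psi=[2.0, 0.25], regime_probs=[0.5, 0.5])
```

Here the index equals the square root of 0.5, which is below 1, so the condition of Theorem 1 holds even though the regime with coefficient 2.0 would be non-invertible as a stand-alone linear MA(1) model.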


3 Global and Local Invertibility

In this section, we elaborate on the concept of invertibility for NLARMA models. For conciseness, we focus on the case of an NLMA model defined by (1) for which the innovations may be estimated by the residuals defined by (2), but note that all results in this section and the next can be extended to the case of NLARMA models. On the other hand, the innovations satisfy a similar difference equation:

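$$\varepsilon_t = Y_t - h(\varepsilon_{t-1}, \ldots, \varepsilon_{t-q}; \theta),$$

so that the reconstruction errors $W_t = \hat{\varepsilon}_t - \varepsilon_t$ satisfy the equation

$$W_t = h(\varepsilon_{t-1}, \ldots, \varepsilon_{t-q}; \theta) - h(W_{t-1} + \varepsilon_{t-1}, \ldots, W_{t-q} + \varepsilon_{t-q}; \theta), \tag{6}$$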


which is generally a random-coefficient stochastic difference equation for $\{W_t\}$. Invertibility requires that the reconstruction errors $\{W_t\}$ approach zero in some sense, e.g., in probability. Conditions for invertibility are then simply conditions for the solutions of the difference equation (6) to approach 0 as $t \to \infty$, in probability. For linear MA models, the necessary and sufficient condition for invertibility is well known. Let

$$Y_t = \varepsilon_t - \theta_1\varepsilon_{t-1} - \cdots - \theta_q\varepsilon_{t-q}.$$

Then (6) becomes

$$W_t - \theta_1 W_{t-1} - \cdots - \theta_q W_{t-q} = 0,$$

in which case the condition of invertibility is that all roots of the characteristic equation

$$1 - \theta_1 x - \cdots - \theta_q x^q = 0$$

lie outside the unit circle; see, e.g., Box, Jenkins and Reinsel (1994) and Cryer and Chan (2008).
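The root condition for a linear MA($q$) model can be checked numerically. The following is a minimal sketch using NumPy's polynomial root finder; the function name is an assumption, and parameter values with a root on or very near the unit circle would need more careful numerical treatment than this sketch gives.

```python
import numpy as np


def linear_ma_is_invertible(theta):
    """True if all roots of 1 - theta_1 x - ... - theta_q x^q lie outside the unit circle."""
    # np.roots expects coefficients ordered from the highest power down to the constant.
    coeffs = [-t for t in reversed(theta)] + [1.0]
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))


print(linear_ma_is_invertible([0.5]))        # single root x = 2
print(linear_ma_is_invertible([2.0]))        # single root x = 0.5
print(linear_ma_is_invertible([0.5, 0.2]))   # MA(2) example
```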
However, for the NLMA models, general conditions for invertibility seem difficult to obtain, as it is generally difficult to derive necessary and sufficient conditions for zero to be a global attractor for all solutions of (6). Before proceeding further, we note that we can vectorize (6) into a first-order vector equation for the case that $q > 1$. Let $\mathbf{W}_t = (W_t, W_{t-1}, \ldots, W_{t-q+1})^T$ and $\boldsymbol{\varepsilon}_t = (\varepsilon_t, \varepsilon_{t-1}, \ldots, \varepsilon_{t-q+1})^T$. Then

$$\mathbf{W}_t = F(\mathbf{W}_{t-1}; \boldsymbol{\varepsilon}_{t-1}, \theta) = \big(h(\varepsilon_{t-1}, \ldots, \varepsilon_{t-q}; \theta) - h(W_{t-1} + \varepsilon_{t-1}, \ldots, W_{t-q} + \varepsilon_{t-q}; \theta),\; W_{t-1}, \ldots, W_{t-q+1}\big)^T. \tag{7}$$

(For an NLARMA model defined by (3), $F$ is a function of $(\mathbf{W}_{t-1}; Y_{t-1}, \ldots, Y_{t-p}, \boldsymbol{\varepsilon}_{t-1}, \theta)$.) Clearly $0 = F(0; \boldsymbol{\varepsilon}, \theta)$ for all $\boldsymbol{\varepsilon}$, so that the origin is an equilibrium point (for the dynamical model defined by (7)). Then, invertibility is equivalent to the origin $0 \in \mathbb{R}^q$ being an asymptotically global attractor, in probability. It is often hard to study the global nature of the origin. Hence, a weaker form of invertibility has been studied in the literature (e.g. Granger and Andersen, 1978b), which requires the origin to be locally and asymptotically stable. In other words, local invertibility concerns whether the innovations can be asymptotically recovered if the initial conditions are approximately correct. Local invertibility can be assessed by linearizing $F$ around the origin. Let $\dot{F} = \partial F / \partial \mathbf{W}$, evaluated at $\mathbf{W} = 0$ and $\boldsymbol{\varepsilon}_{t-1}$, be concisely denoted as $F_t$. Then, the local asymptotic stability of the origin can be inferred from the stability of the origin for the random-coefficient linear stochastic difference equation (cf. the Grobman and Hartman Theorem on p. 237 of Chan and Tong, 2001):

$$\mathbf{W}_t = F_t \mathbf{W}_{t-1}. \tag{8}$$


We now illustrate these concepts with the simple quadratic MA(1) model:

$$Y_t = \varepsilon_t - \beta\varepsilon_{t-1}^2,$$

where $\beta \neq 0$. It is straightforward to show that the reconstruction errors $W_t$ satisfy the stochastic equation:

$$W_t = W_{t-1}(\beta W_{t-1} + 2\beta\varepsilon_{t-1}). \tag{9}$$


It can be shown that $W_t$ diverges to infinity with positive probability if $|W_0| > w_0$ with positive probability, where $w_0 = 2/|\beta|$. Subject to the latter condition, the claim of transience of $\{W_t\}$ can be justified as follows. Let $\gamma = E|\varepsilon_t|$. Markov's inequality implies that the event $|W_t| \ge 2^t w_0$ for all $t$ occurs with probability not smaller than

$$\mathrm{Prob}(|W_0| > w_0) \times \prod_{t=1}^{\infty}\left(1 - \frac{2\gamma|\beta|}{|\beta| 2^t w_0 - 2}\right),$$

which is positive. The preceding argument can be adapted to show that polynomial MA models of degree greater than 1 are non-invertible under similar conditions, along similar lines as employed in Chan and Tong (1994, Theorem 2) for the transience of polynomial AR models. It should be noted that Granger and Andersen (1978b) claimed that ‘a set of models which is not invertible for any non-zero values of its parameters consists of non-linear moving averages’. However, we are unaware of any rigorous proof of the claim to date. Later, we shall state, in Theorem 2, precise conditions under which an NLMA model is locally invertible/non-invertible, with rigorous proof. For now, the preceding analysis reveals that the non-invertibility of the polynomial MA model is associated with an unbounded support for the initial reconstruction error. Below, we conduct a local analysis that shows that the polynomial MA model may be invertible for the case of innovations with sufficiently small support around the origin.
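The dependence of the exact error recursion (9) on the size of the initial error can be seen by iterating it directly. The sketch below uses a short, hypothetical sequence of small innovations and two hypothetical starting errors, one well inside and one beyond $w_0 = 2/|\beta|$; it illustrates the recursion only and is not a proof of either behavior.

```python
def quadratic_ma1_errors(w0, beta, eps):
    """Iterate W_t = W_{t-1} * (beta * W_{t-1} + 2 * beta * eps_{t-1}) from W_0 = w0."""
    path = [w0]
    for e in eps:
        w = path[-1]
        path.append(w * (beta * w + 2.0 * beta * e))
    return path


beta = 1.0                      # so that w0 = 2 / |beta| = 2
eps = [0.1, -0.1, 0.1, -0.1]    # hypothetical small innovations
small = quadratic_ma1_errors(0.1, beta, eps)  # initial error well inside w0
large = quadratic_ma1_errors(5.0, beta, eps)  # initial error beyond w0

print([abs(w) for w in small])  # magnitudes shrink along this path
print([abs(w) for w in large])  # magnitudes blow up along this path
```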
Next, we illustrate the concept of local invertibility with the preceding simple quadratic MA(1) model. Linearizing (9) around $W_t = 0$ yields the following linear model:

$$W_t = 2\beta\varepsilon_{t-1}W_{t-1},$$

where the coefficient is random and equals $2\beta\varepsilon_{t-1}$. The solution of the equation is immediate:

$$W_t = W_0 (2\beta)^t \prod_{s=1}^{t}\varepsilon_{s-1}.$$

In particular,

$$|W_t| = |W_0| \exp\Big(t \sum_{s=1}^{t}\{\ln|2\beta| + \ln|\varepsilon_{s-1}|\}/t\Big).$$

Hence if $\ln|2\beta| + E\ln|\varepsilon| < 0$, then the law of large numbers implies that the origin is asymptotically stable so that the model is locally invertible. If $\ln|2\beta| + E\ln|\varepsilon| > 0$, then the origin is locally unstable so that the model is not locally invertible. The case $\ln|2\beta| + E\ln|\varepsilon| = 0$ is delicate and requires further analysis that will not be pursued here. This example shows that the conclusions concerning invertibility from a local analysis and a global analysis can differ. Furthermore, the global non-invertibility is predicated on the condition that the initial reconstruction errors can be arbitrarily large, with positive probability. On the other hand, the local analysis suggests that if the errors are of sufficiently small bounded support, then the model can be invertible if the initial conditions respect the bounded support condition for the innovations.
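To make the local criterion concrete, suppose, purely as an illustrative assumption, that the innovations are uniform on $(-a, a)$. Then $E\ln|\varepsilon| = \ln a - 1$, so the criterion reduces to comparing $2|\beta|a$ with $e$. The function names below are hypothetical.

```python
import math


def local_exponent_uniform(beta, a):
    """ln|2*beta| + E ln|eps| for eps ~ Uniform(-a, a), using E ln|eps| = ln(a) - 1."""
    if beta == 0 or a <= 0:
        raise ValueError("beta must be non-zero and a must be positive")
    return math.log(abs(2.0 * beta)) + math.log(a) - 1.0


def classify_local_invertibility(beta, a):
    """Apply the sign rule for the quadratic MA(1) model to the exponent above."""
    xi = local_exponent_uniform(beta, a)
    if xi < 0:
        return "locally invertible"
    if xi > 0:
        return "not locally invertible"
    return "undecided"


print(classify_local_invertibility(beta=1.0, a=1.0))  # exponent ln(2) - 1 < 0
print(classify_local_invertibility(beta=2.0, a=1.0))  # exponent ln(4) - 1 > 0
```

Under this assumption, shrinking the support $a$ of the innovations always moves the exponent toward the locally invertible side, in line with the remark on small bounded support.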

4 Dichotomy of Local Invertibility Analysis
Recall that a linear MA($q$) model is invertible if and only if (or iff for short) all the roots of the characteristic equation lie outside the unit circle. This result follows from a stability analysis of the difference equation for the reconstruction errors, which satisfy the equation (with $F$ being a companion matrix whose first row equals $(\theta_1, \theta_2, \ldots, \theta_q)$):

$$\mathbf{W}_t = F\mathbf{W}_{t-1} = F^t\mathbf{W}_0,$$

and the fact that the asymptotic behavior of $F^t$ depends solely on the largest eigenvalue of $F$ in magnitude, which is smaller than 1 iff the root condition alluded to above holds. Indeed, let $\lambda_1$ be the largest eigenvalue of $F$ in magnitude. It is well known that $\lambda_1$ is less than 1 in magnitude iff all roots of the characteristic equation are outside the unit circle, in which case, for almost all initial $\mathbf{W}_0$ with respect to the Lebesgue measure, $|\mathbf{W}_t| \sim |\lambda_1|^t$, meaning that the ratio of the two terms is bounded, and hence the reconstruction errors vanish geometrically fast. (Here $|\mathbf{W}_t|$ denotes the Euclidean norm of the vector $\mathbf{W}_t$.) On the other hand, $\lambda_1$ is larger than 1 in magnitude iff some root of the characteristic equation is inside the unit circle, in which case the reconstruction errors grow exponentially in magnitude, and hence the model is non-invertible. If $|\lambda_1| = 1$, then the model is still non-invertible since the reconstruction errors preserve their magnitude. However, the last case happens with zero Lebesgue measure. Thus, the generic situation is the dichotomy that the reconstruction errors of a linear MA model either vanish geometrically fast or they grow exponentially fast.
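The companion-matrix form of the linear error recursion can be sketched as follows. The helper names and the MA(2) coefficients are hypothetical; the example computes the largest eigenvalue in magnitude and propagates an arbitrary initial error vector for 50 steps.

```python
import numpy as np


def companion(theta):
    """Companion matrix with first row (theta_1, ..., theta_q) and a shifted identity below."""
    q = len(theta)
    F = np.zeros((q, q))
    F[0, :] = theta
    F[1:, :-1] = np.eye(q - 1)
    return F


def largest_eigenvalue_magnitude(F):
    """Magnitude of the eigenvalue of F that is largest in magnitude."""
    return float(np.max(np.abs(np.linalg.eigvals(F))))


theta = [0.5, 0.2]                      # hypothetical MA(2) coefficients
F = companion(theta)
lam1 = largest_eigenvalue_magnitude(F)  # below 1 for these coefficients
w0 = np.array([1.0, -1.0])              # arbitrary initial reconstruction errors
w50 = np.linalg.matrix_power(F, 50) @ w0
print(lam1, np.linalg.norm(w50))
```

The eigenvalues of $F$ are the reciprocals of the roots of the characteristic equation, which is why a largest eigenvalue below 1 in magnitude corresponds to all roots lying outside the unit circle.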

It turns out that this dichotomy holds for a local invertibility analysis for any nonlinear MA model. To see this, note that (8) entails that

$$\mathbf{W}_t = \Big(\prod_{s=1}^{t} F_s\Big)\mathbf{W}_0.$$

Recall that $F_s = \dot{F}(0, \boldsymbol{\varepsilon}_s; \theta)$ is a function of $\boldsymbol{\varepsilon}_s = (\varepsilon_s, \ldots, \varepsilon_{s-q+1})^T$. Under very general conditions, a product of random matrices of the above form asymptotically behaves like the power of some constant matrix. Specifically, Theorem C of Cohen (1988) states that if $E(\max(0, \log\|F_1\|)) < \infty$, then

$$\lim_{t\to\infty} t^{-1}\log\Big\|\prod_{s=1}^{t} F_s\Big\| = \xi$$

with probability 1, where $\|\cdot\|$ denotes a matrix norm for which $\|AB\| \le \|A\|\,\|B\|$ for any matrices $A$ and $B$, and $-\infty \le \xi < \infty$ is a constant; furthermore,

$$\xi = \lim_{t\to\infty} t^{-1}E\Big(\log\Big\|\prod_{s=1}^{t} F_s\Big\|\Big).$$

The preceding result of Cohen follows from the general subadditive ergodic theory of Kingman (1973). The determination of $\xi$ is, however, a generally hard problem, except that for the scalar case, i.e. $q = 1$, $\xi = E(\log\|F_1\|)$, by the independence of the $F_s$'s. Otherwise, only in rare cases does $\xi$ admit a closed-form expression. Finally, we note that the preceding result on the asymptotic behavior of the product of the random matrices holds if $F_s$ is a function of a stationary ergodic process; such an extension is useful for the invertibility analysis of an NLARMA model.
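Since $\xi$ rarely has a closed form, one practical option is to approximate it by simulation. The sketch below is not taken from the paper: it tracks the growth rate of a single vector under a long matrix product, renormalizing at each step to avoid overflow or underflow. For a generic starting vector this rate is commonly used as a numerical proxy for $\xi$. The function name, the `sample_matrix` callback and the example matrix are assumptions.

```python
import numpy as np


def growth_rate(sample_matrix, dim, n_steps, seed=0):
    """Average log growth per step of |F_t ... F_1 v| for a fixed unit vector v."""
    rng = np.random.default_rng(seed)
    v = np.ones(dim) / np.sqrt(dim)  # generic unit starting vector
    log_growth = 0.0
    for _ in range(n_steps):
        v = sample_matrix(rng) @ v
        norm = float(np.linalg.norm(v))
        if norm == 0.0:
            return float("-inf")  # the product annihilated the vector
        log_growth += np.log(norm)
        v = v / norm  # renormalize so that only the direction is carried forward
    return log_growth / n_steps


# Sanity check with a constant matrix: the rate should approach log(0.5),
# the log of its largest eigenvalue in magnitude.
D = np.diag([0.5, 0.25])
xi_const = growth_rate(lambda rng: D, dim=2, n_steps=500)
print(xi_const, np.log(0.5))
```

For $q = 1$ the same routine reduces to a sample average of $\log|F_s|$, which matches the scalar formula $\xi = E(\log\|F_1\|)$.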

In particular, if we take $\|\cdot\|$ to be the spectral norm (the maximum eigenvalue in magnitude), the preceding result implies that there exists a constant $\xi$ such that the local reconstruction errors $|\mathbf{W}_t| \sim (\exp\xi)^t$, as $t \to \infty$, for almost all initial reconstruction errors (w.r.t. the Lebesgue measure). Hence, the model is locally invertible iff $\xi < 0$. Therefore, the necessary and sufficient conditions for local invertibility of a nonlinear MA model hinge on deriving conditions for $\xi$ to be less than 0. As mentioned earlier, the determination of $\xi$ is generally a hard problem. Nevertheless, simple sufficient conditions for $\xi < 0$ can be obtained by noting that for any fixed positive integer $m$,

$$\lim_{t\to\infty} t^{-1}E\Big(\log\Big\|\prod_{s=1}^{t} F_s\Big\|\Big) \le m^{-1}E(\log\|F_1F_2\cdots F_m\|).$$

To see this, let $t = mk + r$ where $k$ and $r$ are integers and $0 \le r < m$. Recall the matrix norm has the property that $\|AB\| \le \|A\|\,\|B\|$ for any matrices $A$ and $B$. It follows from this property and stationarity that

$$t^{-1}E\Big(\log\Big\|\prod_{s=1}^{t} F_s\Big\|\Big) \le t^{-1}kE(\log\|F_1F_2\cdots F_m\|) + t^{-1}rE(\log\|F_1\|),$$

from which the claimed result can be obtained by passing to the limit. In particular, we obtain the following result.
THEOREM 2. The nonlinear MA($q$) model defined by (1) is locally invertible if $E(\log\|F_1\|) < 0$ where $\|\cdot\|$ is a matrix norm, e.g. the spectral norm. For the case $q = 1$, the model is locally non-invertible if $E(\log\|F_1\|) > 0$.
The non-invertibility result for $q = 1$ stated above follows trivially from the fact that $\xi = E(\log\|F_1\|)$ in the scalar case. So far, we have assumed that the underlying process is an NLMA process, but the preceding theorem can be extended readily to the case of a stationary NLARMA model defined by (3), for which $F_s$ is a function of $Y_{s-1}, \ldots, Y_{s-p}$, $\boldsymbol{\varepsilon}_{s-1}$ and $\theta$. Furthermore, if $h$ in (3) is conditionally linear in the innovations given the $Y$'s, then the local invertibility analysis is equivalent to global invertibility analysis.
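The $m$-step bound $m^{-1}E(\log\|F_1F_2\cdots F_m\|)$ can be evaluated exactly in a toy setting. The sketch below assumes, for illustration only, that each $F_s$ is drawn independently from a finite set of matrices with known probabilities, so the expectation is a finite sum over all sequences of length $m$. The two companion matrices are hypothetical, and the norm used is the largest singular value as returned by `np.linalg.norm(..., 2)`.

```python
import itertools
import math

import numpy as np


def block_bound(matrices, probs, m):
    """Exact m^{-1} E log ||F_1 ... F_m|| when the F_s are i.i.d. draws from `matrices`."""
    total = 0.0
    for idx in itertools.product(range(len(matrices)), repeat=m):
        prod = np.eye(matrices[0].shape[0])
        weight = 1.0
        for i in idx:
            prod = prod @ matrices[i]
            weight *= probs[i]
        total += weight * math.log(np.linalg.norm(prod, 2))
    return total / m


A = np.array([[0.5, 0.2], [1.0, 0.0]])   # hypothetical companion matrix, first row (0.5, 0.2)
B = np.array([[-0.4, 0.1], [1.0, 0.0]])  # hypothetical companion matrix, first row (-0.4, 0.1)
bounds = {m: block_bound([A, B], [0.5, 0.5], m) for m in (1, 2, 4)}
print(bounds)
```

For $q > 1$, the first column of a companion matrix contains an entry equal to 1, so its largest singular value is at least 1 and the $m = 1$ bound cannot be negative. In this example the bounds for $m = 2$ and $m = 4$ are negative, which shows why taking $m > 1$ can matter when this norm is used.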

5 Threshold MA Model Revisited

Ling and Tong (2005) studied the Threshold MA (TMA) model, which is a piecewise linear MA model. For simplicity, consider the simple case of the two regimes:



$$Y_t = \varepsilon_t - I(Y_{t-d} \le r)\sum_{j=1}^{q}\theta_{1,j}\varepsilon_{t-j} - I(Y_{t-d} > r)\sum_{j=1}^{q}\theta_{2,j}\varepsilon_{t-j},$$

where the $\theta$'s are parameters, $r$ is the unknown threshold parameter, and $d$ is a positive integer known as the delay parameter. Intuitively, a TMA process switches between two MA processes: the MA process indexed by the parameter vector $(\theta_{1,1}, \ldots, \theta_{1,q})^T$ is in operation if the process at lag $d$ does not exceed the threshold $r$; otherwise the MA process indexed by the parameter vector $(\theta_{2,1}, \ldots, \theta_{2,q})^T$ is in operation. For the TMA model, the reconstruction errors satisfy (8) with $F_t$ being a companion matrix with its first row equal to $(\theta_{1,1}, \ldots, \theta_{1,q})I(Y_{t-d} \le r) + (\theta_{2,1}, \ldots, \theta_{2,q})I(Y_{t-d} > r)$. The remark below Theorem 2 then implies that the TMA model is invertible if the spectral norms of the two sub-MA processes are less than 1. That is, a TMA model is invertible if all the roots of the two characteristic equations

$$1 - \sum_{j=1}^{q}\theta_{i,j}x^j = 0, \quad i = 1, 2$$