
Can a trend stationary series be modeled with ARIMA?


I have a question / confusion about the stationarity required for modeling with ARIMA(X). I think about this more in terms of inference (the effect of an intervention), but I would like to know whether the answer differs between forecasting and inference.

Question:

All of the introductory resources I've read state that the series must be stationary, which makes sense to me and is where the "I" in ARIMA comes in (differencing).

What confuses me is the use of trends and drifts in ARIMA(X) and their potential impact on the stationarity requirement.

Does using a constant / drift term and / or a trend variable as an exogenous variable (i.e. adding 't' as a regressor) remove the requirement that the series be stationary? Does the answer differ depending on whether the series has a unit root (e.g. by an ADF test) or a deterministic trend but no unit root?

OR

Does a series always have to be made stationary before using ARIMA(X), i.e. through differencing and / or detrending?

Answer:


Judging from the comments, it seems we haven't addressed the question of how to choose between a deterministic and a stochastic trend, that is, how to proceed in practice rather than the consequences or features of each case.

You can do the following. First, apply the ADF test (a short R sketch of this sequence is given after the list below).

  • If the null of a unit root is rejected, we are done. The trend (if any) can be represented by a deterministic linear trend.
  • If the ADF null is not rejected, we apply the KPSS test (whose null hypothesis is the opposite: stationarity, or stationarity around a linear trend).

    o If the KPSS null is rejected, we conclude that there is a unit root and work with the first differences of the data. On the differenced series we can test the significance of other regressors or choose an ARMA model.

    o If the KPSS null is not rejected either, we have to conclude that the data are not very informative, because neither null hypothesis could be rejected. In this case it may be safer to work with the first differences of the series.
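
A minimal sketch of this ADF-then-KPSS sequence, assuming the series is stored in a vector or ts object called y (the name, the tseries package choice and the 5% level are illustrative):

    library(tseries)   # provides adf.test() and kpss.test()

    adf <- adf.test(y)                      # H0: the series has a unit root
    if (adf$p.value < 0.05) {
      # Unit-root null rejected: any trend can be treated as a deterministic linear trend
      message("No unit root detected; consider a deterministic trend term.")
    } else {
      kpss <- kpss.test(y, null = "Trend")  # H0: stationarity around a linear trend
      if (kpss$p.value < 0.05) {
        dy <- diff(y)                       # unit root: work with first differences
      } else {
        dy <- diff(y)                       # neither null rejected: differencing is the safer choice
      }
    }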

Remember that, as mentioned in a previous answer, these tests can be affected by outliers (e.g. a single outlier at some point in time due to an error in recording the data, or a level shift that the series undergoes from a certain point onward). It is therefore advisable to check for these issues as well and to repeat the previous analysis after including regressors for potential outliers.
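
For example, a suspected level shift at some hypothetical time point tau can be handled with a step dummy passed as a regressor; the series name y, the break point tau and the ARMA order below are purely illustrative:

    n   <- length(y)
    ls  <- as.numeric(seq_len(n) >= tau)            # level-shift dummy: 0 before tau, 1 afterwards
    fit <- arima(y, order = c(1, 0, 0), xreg = ls)  # illustrative ARMA order; repeat the unit root analysis with this regressor included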







Remember that there are different types of non-stationarity and different ways of dealing with them. Four common ones are:

1) Deterministic trends, or trend stationarity. If your series is of this kind, you can detrend it or include a time trend in the regression / model. You may want to read up on the Frisch-Waugh-Lovell theorem in this context. (A short sketch contrasting this case with case 4 follows the list.)

2) Level shifts and structural breaks. If this is the case, consider adding a dummy variable for each break or, if your sample is long enough, modeling each regime separately.

3) Changing variance. Either model the sub-samples separately or model the changing variance with a model from the ARCH or GARCH class.

4) Your series contains a unit root. In general you should then check whether the variables in your model are cointegrated. However, since you are dealing with univariate forecasting, you should difference the series once or twice, depending on the order of integration.
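
A small simulated sketch contrasting case 1 (trend stationarity: detrend or add a time trend) with case 4 (unit root: difference); all parameter values are illustrative:

    set.seed(1)
    n <- 200
    t <- seq_len(n)

    # Case 1: deterministic trend plus stationary AR(1) noise -> detrend or include t as a regressor
    y_det  <- 0.5 * t + arima.sim(list(ar = 0.5), n)
    y_detr <- residuals(lm(y_det ~ t))     # detrended series (Frisch-Waugh-Lovell logic)

    # Case 4: random walk (unit root) -> take first differences
    y_sto <- cumsum(rnorm(n))
    dy    <- diff(y_sto)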

To model a time series with the ARIMA class of models, the following steps should be appropriate (a condensed R sketch follows the list):

1) Look at the ACF and PACF together with a time series plot to get a first impression of whether the series is stationary or non-stationary.

2) Test the series for a unit root. This can be done with a variety of tests, the most common being the ADF test, the Phillips-Perron (PP) test, the KPSS test (which has stationarity as its null), and the DF-GLS test, which is the most efficient of these. NOTE: if there is a structural break in your series, these tests are biased toward not rejecting the null of a unit root. If you want to check the robustness of the results and suspect one or more structural breaks, you should use unit root tests that allow for endogenous structural breaks. Two common ones are the Zivot-Andrews test, which allows for one endogenous structural break, and the Clemente-Montañés-Reyes test, which allows for two structural breaks and comes in two different model variants.

3) If there is a unit root in the series, difference it. Then look at the ACF, PACF and time series plots again and, to be on the safe side, check for a second unit root. The ACF and PACF will help you decide how many AR and MA terms to include.

4) If the series does not have a unit root, but the time series plot and the ACF show that it has a deterministic trend, consider adding a time trend when fitting the model. Some argue that it is perfectly valid to difference a series that only contains a deterministic trend, although some information may be lost in the process. Even so, it is a good idea to difference it anyway to get an idea of how many AR and/or MA terms to include. Including a time trend is a valid alternative, however.

5) Fit the different models and do the usual diagnostic checks. You may want to use an information criterion or the MSE to help you choose the best model on the sample you fit it to.

6) Do a hold-out-sample forecasting exercise for the most suitable models and compute loss functions such as MSE, MAPE and MAD to see which of them performs best at forecasting, because that's what we want to do!

7) Make your out-of-sample predictions like a boss and enjoy your results!
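
A condensed sketch of steps 1-6 for a series y, using the forecast and tseries packages; the 80/20 split, the candidate orders and the package choice are illustrative:

    library(forecast)   # Arima(), auto.arima(), ndiffs(), accuracy()
    library(tseries)    # adf.test(), kpss.test()

    tsdisplay(y)                          # step 1: time series plot with ACF and PACF
    adf.test(y); kpss.test(y)             # step 2: unit root / stationarity tests

    d <- ndiffs(y)                        # steps 3-4: how many differences appear needed
    if (d > 0) tsdisplay(diff(y, differences = d))

    # step 5: fit candidate models on a training sample and run diagnostics
    train <- window(y, end = time(y)[floor(0.8 * length(y))])
    test  <- window(y, start = time(y)[floor(0.8 * length(y)) + 1])
    fit1  <- Arima(train, order = c(1, d, 0), include.drift = (d == 1))
    fit2  <- auto.arima(train)
    checkresiduals(fit2)

    # step 6: hold-out-sample comparison with MSE/MAPE-type loss functions
    accuracy(forecast(fit1, h = length(test)), test)
    accuracy(forecast(fit2, h = length(test)), test)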







Determining whether the trend (or some other component such as seasonality) is deterministic or stochastic is part of the puzzle in time series analysis. I will add a few points to what has been said.

1) The distinction between deterministic and stochastic trends is important because, if there is a unit root in the data (e.g. a random walk), the test statistics used for inference do not follow their standard distributions. In this post you will find some details and references.

We can simulate a random walk (a stochastic trend, where first differences should be taken), test the significance of a deterministic trend, and record the percentage of cases in which the null of no deterministic trend is rejected. In R we can do:
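
A minimal sketch of such a simulation (series length, seed and number of replications are illustrative choices, so the exact percentage will vary):

    set.seed(123)
    n      <- 100        # length of each simulated series
    nsim   <- 10000      # number of simulated random walks
    reject <- logical(nsim)

    for (i in seq_len(nsim)) {
      y   <- cumsum(rnorm(n))                      # pure random walk, no deterministic trend
      fit <- summary(lm(y ~ seq_len(n)))           # regress on a linear time trend
      reject[i] <- fit$coefficients[2, 4] < 0.05   # p-value of the trend coefficient
    }

    mean(reject)   # share of spurious rejections of "no deterministic trend"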

At a 5% significance level, we would expect this null to be rejected only about 5% of the time. In this experiment, however, it was rejected in roughly 89% of the 10,000 simulated random walks.

We can use unit root tests to check whether there is a unit root. However, we must be aware that a deterministic linear trend can, in turn, cause the null of a unit root not to be rejected. To handle this, the KPSS test takes the opposite null of stationarity around a linear trend.

2) Another topic is the interpretation of the deterministic components of a process in levels versus in first differences. The effect of an intercept is not the same in a linear trend model as it is in a random walk with drift. See this post for an illustration.

Consider a random walk with drift:

$$y_t = \mu + y_{t-1} + \epsilon_t, \qquad \epsilon_t \sim NID(0, \sigma^2).$$

Substituting recursively for $y_{t-1}, y_{t-2}, \dots$,

$$y_t = \mu + y_{t-1} + \epsilon_t = 2\mu + y_{t-2} + \epsilon_{t-1} + \epsilon_t = 3\mu + y_{t-3} + \epsilon_{t-2} + \epsilon_{t-1} + \epsilon_t = \cdots$$

we arrive at:

$$y_t = y_0 + \mu t + \sum_{i=1}^{t} \epsilon_i,$$

where the initial value $y_0$ acts as an intercept and the drift $\mu$ plays the role of the slope of a deterministic linear trend, while the accumulated errors $\sum_{i=1}^{t} \epsilon_i$ form the stochastic trend.

If the plot of a series shows a fairly clear linear trend, we cannot be sure whether it is due to the presence of a deterministic linear trend or to a drift in a random walk process. Complementary graphics and test statistics should be used.

Some caution is needed, as analysis based on unit root and other test statistics is not foolproof. Some of these tests can be affected by outlying observations or level shifts, and they require choosing a lag order, which is not always easy.

As a workaround for this puzzle, a common practice is to difference the data until the series looks stationary (e.g. until the sample autocorrelation function dies out quickly) and then choose an ARMA model.




Very interesting question; I would also like to hear what others have to say. I'm a trained engineer, not a statistician, so someone may want to check my logic. Since we engineers like to simulate and experiment, I was motivated to simulate and test your question.

As shown empirically below, using a trend variable as an exogenous regressor in ARIMAX made differencing unnecessary for a trend-stationary series. Here is the logic I used to check this:

  1. Simulated an AR(1) process.
  2. Added a deterministic trend.
  3. Modeled the resulting series with ARIMAX, with the trend as an exogenous variable and no differencing.
  4. Checked the residuals for white noise; they are purely random.

Below are the R code and plots:
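
A minimal sketch of the four steps above (the AR coefficient, trend slope and series length are illustrative choices, so the plots below may correspond to slightly different settings):

    library(forecast)
    set.seed(7)
    n     <- 300
    ar1   <- arima.sim(list(ar = 0.6), n)    # step 1: simulated AR(1) process
    trend <- seq_len(n)
    y     <- 2 + 0.05 * trend + ar1          # step 2: add a deterministic trend

    fit <- Arima(y, order = c(1, 0, 0), xreg = trend)   # step 3: ARIMAX with trend as exogenous variable, no differencing

    tsdisplay(residuals(fit))                                # step 4: residual ACF/PACF should look flat
    Box.test(residuals(fit), lag = 10, type = "Ljung-Box")   # non-significant p-value indicates white noise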

[Plot: simulated AR(1) series]

[Plot: AR(1) series with deterministic trend added]

[Plot: PACF of the ARIMAX residuals with the trend as exogenous regressor; the residuals are random, with no remaining pattern]

As can be seen above, modeling the deterministic trend as an exogenous variable in the ARIMAX model makes differencing unnecessary. At least in the deterministic case it worked. I wonder how this would behave with stochastic trends, which are difficult to predict or model.

To answer your second question: YES, the series must be made stationary for all ARIMA models, including ARIMAX. At least that's what the textbooks say.

I would also point you to this article. It gives a very clear explanation of deterministic vs. stochastic trends and how to remove them to make a series stationary, as well as a very nice literature review on the subject. The authors use it in a neural network context, but it is useful for general time series problems. Their final recommendation is to remove a linear trend when the trend is clearly identified as deterministic, and otherwise to use differencing to make the time series stationary. The jury is still out, but most of the researchers cited in that article recommend differencing as opposed to linear detrending.

Edit:

Below, a random walk with stochastic drift is modeled both with a trend as an exogenous variable and with a differenced ARIMA. Both approaches seem to give the same answer and are essentially equivalent.
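
A minimal sketch of that comparison (the drift value, series length and model orders are illustrative):

    library(forecast)
    set.seed(42)
    n <- 200
    y <- 10 + cumsum(0.2 + rnorm(n))                # random walk with drift 0.2

    fit_xreg <- Arima(y, order = c(0, 0, 0), xreg = seq_len(n))     # trend as an exogenous regressor, no differencing
    fit_diff <- Arima(y, order = c(0, 1, 0), include.drift = TRUE)  # differenced ARIMA with a drift term

    coef(fit_xreg)   # the xreg coefficient is roughly the drift (about 0.2 here)
    coef(fit_diff)   # the drift estimate is roughly the same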

Hope that helps!