Correlation and Autocorrelation in Time Series - Lab
Introduction
In this lab, you'll practice your knowledge of correlation, autocorrelation, and partial autocorrelation by working on three different datasets.
Objectives
In this lab you will:
- Plot and discuss the autocorrelation function (ACF) for a time series
- Plot and discuss the partial autocorrelation function (PACF) for a time series
The Exchange Rate Data
We'll be looking at the exchange rates dataset again.
- First, run the following cell to import all the libraries and the functions required for this lab
- Then import the data in
'exch_rates.csv'
- Change the data type of the
'Frequency'
column - Set the
'Frequency'
column as the index of the DataFrame
# Import all packages and functions
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
%matplotlib inline
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from matplotlib.pylab import rcParams
# Import data
xr = None
# Change the data type of the 'Frequency' column
# Set the 'Frequency' column as the index
Plot all three exchange rates in one graph:
# Plot here
You can see that the EUR/USD and AUD/USD exchange rates are somewhere between 0.5 and 2, whereas the Danish Krone is somewhere between 4.5 and 9. Now let's look at the correlations between these time series.
# Correlation
What is your conclusion here? You might want to use outside resources to understand what's going on.
Next, look at the plots of the differenced (1-lag) series. Use subplots to plot them rather than creating just one plot.
# 1-lag differenced series
xr_diff = None
# Plot
Calculate the correlation of this differenced time series.
# Correlation
Explain what's going on
Next, let's look at the "lag-1 autocorrelation" for the EUR/USD exchange rate.
- Create a "lag-1 autocorrelation" series
- Combine both the original and the shifted ("lag-1 autocorrelation") series into a DataFrame
- Plot these time series, and look at the correlation coefficient
# Isolate the EUR/USD exchange rate
eur = xr[['Euro']]
# "Shift" the time series by one period
eur_shift_1 = None
# Combine the original and shifted time series
lag_1 = None
# Plot
# Correlation
Repeat this for a "lag-50 autocorrelation".
# "Shift" the time series by 50 periods
eur_shift_50 = None
# Combine the original and shifted time series
lag_50 = None
# Plot
# Correlation
What's your conclusion here?
Knowing this, let's plot the ACF now.
# Plot ACF
The series is heavily autocorrelated at first, and then there is a decay. This is a typical result for a series that is a random walk, generally you'll see heavy autocorrelations first, slowly tailing off until there is no autocorrelation anymore.
Next, let's look at the partial autocorrelation function plot.
# Plot PACF
This is interesting! Remember that Partial Autocorrelation Function gives the partial correlation of a time series with its own lagged values, controlling for the values of the time series at all shorter lags. When controlling for 1 period, the PACF is only very high for one-period lags, and basically 0 for shorter lags. This is again a typical result for random walk series!
The Airpassenger Data
Let's work with the air passenger dataset you have seen before. Plot the ACF and PACF for both the differenced and regular series.
Note: When plotting the PACF, make sure you specify
method='ywm'
in order to avoid any warnings.
# Import and process the air passenger data
air = pd.read_csv('passengers.csv')
air['Month'] = pd.to_datetime(air['Month'])
air.set_index('Month', inplace=True)
air.head()
# Plot ACF (regular)
# Plot PACF (regular)
# Generate a differenced series
air_diff = None
# Plot ACF (differenced)
# Plot PACF (differenced)
Your conclusion here
The NYSE data
Are you getting the hang of interpreting ACF and PACF plots? For one final time, plot the ACF and PACF for both the NYSE time series.
Note: When plotting the PACF, make sure you specify
method='ywm'
in order to avoid any warnings.
# Import and process the NYSE data
nyse = pd.read_csv('NYSE_monthly.csv')
nyse['Month'] = pd.to_datetime(nyse['Month'])
nyse.set_index('Month', inplace=True)
nyse.head()
# Plot ACF
# Plot PACF
Your conclusion here
Summary
Great, you've now been introduced to ACF and PACF. Let's move into more serious modeling with autoregressive and moving average models!