Activity 2

Correlation, Covariance, and Partial Correlation in Time Series

In time series analysis, understanding relationships between different variables is crucial. Three fundamental statistical concepts that help capture these relationships are covariance, correlation, and partial correlation.

Covariance

Covariance measures how two variables move together. For two time series \(X_t\) and \(Y_t\), the sample covariance over \(n\) periods is given by:

\[ \text{Cov}(X, Y) = \frac{1}{n-1} \sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y}) \]

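where \(\bar{X}\) and \(\bar{Y}\) are the sample means. A positive covariance indicates that when \(X_t\) is above its mean, \(Y_t\) tends to be above its mean as well (and below when \(X_t\) is below). A negative covariance indicates that the two series tend to deviate from their means in opposite directions. The magnitude depends on the units of both series, so it is hard to interpret on its own.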
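
As a minimal sketch, the covariance formula can be checked in R against the built-in `cov()`. The vectors `x` and `y` are made-up illustrative values, not taken from any dataset.

```r
# Hypothetical toy series (illustrative values only)
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)
n <- length(x)

# Sample covariance written out from the definition (divisor n - 1)
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Built-in equivalent
cov_builtin <- cov(x, y)

# TRUE when the two agree up to floating-point tolerance
all.equal(cov_manual, cov_builtin)
```

For these values the deviation products sum to 16, so both approaches give 16 / 4 = 4.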

Correlation

Correlation standardizes covariance, providing a dimensionless measure between -1 and 1:

\[ \rho_{XY} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \]

where \(\sigma_X\) and \(\sigma_Y\) are the sample standard deviations of \(X_t\) and \(Y_t\). A correlation of 1 indicates a perfect positive linear relationship, -1 a perfect negative linear relationship, and 0 no linear relationship (a nonlinear relationship may still exist).
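
The following sketch, again on made-up vectors `x` and `y`, shows that dividing the covariance by the two standard deviations reproduces `cor()`, and that rescaling a series changes the covariance but not the correlation.

```r
# Hypothetical toy series (illustrative values only)
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)

# Correlation as standardized covariance
cor_manual <- cov(x, y) / (sd(x) * sd(y))
cor_builtin <- cor(x, y)
all.equal(cor_manual, cor_builtin)

# Changing the units of x (multiplying by 100) scales the covariance by 100
cov_scaled <- cov(100 * x, y)
# The correlation is unaffected by the change of units
cor_scaled <- cor(100 * x, y)
```

Here `sd(x) * sd(y)` equals 5 and the covariance is 4, so the correlation is 0.8 both before and after rescaling.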

Partial Correlation

Partial correlation measures the relationship between two variables while controlling for the effect of one or more additional variables. For three variables \(X\), \(Y\), and \(Z\), the partial correlation between \(X\) and \(Y\) controlling for \(Z\) is defined as:

\[ \rho_{XY \cdot Z} = \frac{\rho_{XY} - \rho_{XZ}\rho_{YZ}}{\sqrt{(1-\rho_{XZ}^2)(1-\rho_{YZ}^2)}} \]

This formula removes the linear influence of \(Z\) on both \(X\) and \(Y\), providing a clearer picture of the relationship that remains between them once \(Z\) is accounted for.
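
One way to see what "controlling for \(Z\)" means is to regress \(X\) and \(Y\) on \(Z\) separately and correlate the residuals; for a single control variable this gives the same number as the formula. The sketch below uses simulated data (the variables `x`, `y`, `z` and the coefficients are assumptions chosen for illustration) in which `x` and `y` are related only through `z`.

```r
set.seed(1)
n <- 200

# Simulated data: x and y both depend on z, but not on each other
z <- rnorm(n)
x <- 2 * z + rnorm(n)
y <- -1 * z + rnorm(n)

# Partial correlation from the formula
r_xy <- cor(x, y)
r_xz <- cor(x, z)
r_yz <- cor(y, z)
pcor_formula <- (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

# Partial correlation as the correlation of regression residuals
res_x <- resid(lm(x ~ z))
res_y <- resid(lm(y ~ z))
pcor_resid <- cor(res_x, res_y)

# The two computations agree up to floating-point tolerance
all.equal(pcor_formula, pcor_resid)
```

In this setup the raw correlation `r_xy` is strongly negative, while the partial correlation should be close to zero, since nothing links `x` and `y` once `z` is removed.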

Applied Example Using fpp3 Data

Consider the global_economy dataset from the fpp3 package, which contains economic indicators for various countries over time. We can explore relationships such as:

  • Covariance and correlation between Growth and CPI for different countries.
  • Partial correlation to understand the relationship between Imports and Exports while controlling for GDP.

Sample rows from global_economy:

Country                                      Code  Year  GDP           Growth     CPI         Imports   Exports   Population
IDA & IBRD total                             IBT   2006  1.226263e+13  7.896002   NA          28.69816  32.65433  5506476109
East Asia & Pacific (excluding high income)  EAP   1993  8.852067e+11  10.891181  NA          23.63895  21.16557  1674021645
Mali                                         MLI   1974  5.387473e+08  -1.528826  NA          40.87015  12.72373  6368348
Caribbean small states                       CSS   1987  1.522505e+10  1.719369   NA          NA        NA        5908886
Croatia                                      HRV   1986  NA            NA         0.0001808   NA        NA        4722000
St. Martin (French part)                     MAF   2006  NA            NA         NA          NA        NA        28414
Greece                                       GRC   1996  1.458616e+11  2.862129   61.6802915  23.20928  14.28696  10608800
South Africa                                 ZAF   1972  2.135814e+10  1.654762   2.6464111   22.17079  24.63150  24148137
Somalia                                      SOM   2011  NA            NA         NA          NA        NA        12404725
Lower middle income                          LMC   1964  NA            6.188035   NA          12.20202  10.06941  1024253768

# Load fpp3, which provides global_economy and attaches dplyr
library(fpp3)

# Filter data for Australia and select relevant variables
aus_data <- global_economy %>%
  # Keep only the rows where the country is Australia
  filter(Country == "Australia") %>%
  # Keep the variables of interest
  dplyr::select(Year, GDP, Growth, Population, CPI, Imports, Exports)

# Calculate the covariance between Growth and CPI for Australia,
# using only the years in which both values are observed
cov_growth_cpi <- cov(aus_data$Growth, aus_data$CPI, use = "complete.obs")
cov_growth_cpi
[1] -17.36472

# Calculate the Pearson correlation between Growth and CPI for Australia,
# using only the years in which both values are observed
cor_growth_cpi <- cor(aus_data$Growth, aus_data$CPI, use = "complete.obs")
cor_growth_cpi
[1] -0.2779003

# Compute the correlation matrix for GDP, Growth, Population, CPI, Imports, and Exports
cor_matrix <- aus_data %>%
  # Select the columns of interest. Year also appears in the output,
  # most likely because aus_data is a tsibble and select() keeps its index column.
  dplyr::select(GDP, Growth, Population, CPI, Imports, Exports) %>%
  # Drop rows with any missing value, then compute all pairwise Pearson correlations
  cor(use = "complete.obs")
cor_matrix
                  GDP     Growth Population        CPI    Imports    Exports
GDP         1.0000000 -0.2494090  0.9073355  0.9003771  0.7831384  0.8128970
Growth     -0.2494090  1.0000000 -0.2950084 -0.2779003 -0.1753012 -0.2251961
Population  0.9073355 -0.2950084  1.0000000  0.9899235  0.9054921  0.9114850
CPI         0.9003771 -0.2779003  0.9899235  1.0000000  0.9313180  0.9275880
Imports     0.7831384 -0.1753012  0.9054921  0.9313180  1.0000000  0.9238266
Exports     0.8128970 -0.2251961  0.9114850  0.9275880  0.9238266  1.0000000
Year        0.8801466 -0.2863494  0.9966214  0.9906993  0.9224200  0.9195759
                 Year
GDP         0.8801466
Growth     -0.2863494
Population  0.9966214
CPI         0.9906993
Imports     0.9224200
Exports     0.9195759
Year        1.0000000
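
The `use = "complete.obs"` argument drops every row that has an `NA` in any selected column before anything is computed, so all entries of the matrix are based on the same set of years. As a small sketch, the made-up data frame `toy` below (hypothetical values) contrasts this with `"pairwise.complete.obs"`, which uses every row available for each pair of columns.

```r
# Hypothetical data frame with missing values in one column only
toy <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 1, 4, 3, 6, 5),
  c = c(NA, NA, 1, 3, 2, 5)
)

# Listwise deletion: rows 1 and 2 are dropped for every pair
cor_complete <- cor(toy, use = "complete.obs")

# Pairwise deletion: rows are dropped only for pairs involving column c
cor_pairwise <- cor(toy, use = "pairwise.complete.obs")

cor_complete["a", "b"]  # based on rows 3 to 6 only
cor_pairwise["a", "b"]  # based on all six rows
```

The two settings can therefore give different values for the same pair of variables when another column in the selection has gaps.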

Next, we use partial correlation to examine the relationship between Imports and Exports while controlling for GDP.

# Partial correlation between Imports and Exports, controlling for GDP, computed manually
# Extract the relevant correlation coefficients from the matrix
rho_Imports_Exports <- cor_matrix["Imports", "Exports"]  # raw correlation between Imports and Exports
rho_Imports_GDP <- cor_matrix["Imports", "GDP"]          # correlation between Imports and GDP
rho_Exports_GDP <- cor_matrix["Exports", "GDP"]          # correlation between Exports and GDP
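
# Compute the partial correlation between Imports and Exports, controlling for GDP.
# The numerator subtracts the product of the two correlations with GDP.
pcorr_Imp_Exp_GDP <- (rho_Imports_Exports - rho_Imports_GDP * rho_Exports_GDP) /
  sqrt((1 - rho_Imports_GDP^2) * (1 - rho_Exports_GDP^2))
pcorr_Imp_Exp_GDP  # Display partial correlation result

Using the rounded correlations printed in the matrix (0.9238266, 0.7831384, and 0.8128970), this evaluates to approximately 0.793. The association between Imports and Exports is weaker than the raw correlation of about 0.92 once GDP is controlled for, but it remains clearly positive.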
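
The manual steps can also be wrapped in a small helper. In this sketch the function name `partial_cor_from_matrix` and the three-variable matrix `cm` are hypothetical; `cm` simply re-enters the rounded GDP, Imports, and Exports correlations printed in the matrix earlier so that the block runs on its own. The second computation, based on the inverse of the correlation matrix, is included only as a cross-check.

```r
# Hypothetical helper: partial correlation of x and y given z,
# read from a correlation matrix with named rows and columns
partial_cor_from_matrix <- function(cm, x, y, z) {
  r_xy <- cm[x, y]
  r_xz <- cm[x, z]
  r_yz <- cm[y, z]
  (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
}

# Rounded correlations copied from the printed matrix (assumed inputs)
vars <- c("GDP", "Imports", "Exports")
cm <- matrix(c(1,         0.7831384, 0.8128970,
               0.7831384, 1,         0.9238266,
               0.8128970, 0.9238266, 1),
             nrow = 3, dimnames = list(vars, vars))

pc <- partial_cor_from_matrix(cm, "Imports", "Exports", "GDP")

# Cross-check via the inverse correlation (precision) matrix
prec <- solve(cm)
dimnames(prec) <- dimnames(cm)
pc_check <- -prec["Imports", "Exports"] /
  sqrt(prec["Imports", "Imports"] * prec["Exports", "Exports"])

all.equal(pc, pc_check)
```

With the real data, passing `cor_matrix` instead of `cm` avoids the rounding in the copied values.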

Lab Activity: Recreate the above analysis using the class activity template and answer the following question:

  1. Apply the partial correlation formula to compute the relationship between GDP and Exports, controlling for Population.