Activity 2

Correlation, Covariance, and Partial Correlation in Time Series

In time series analysis, understanding relationships between different variables is crucial. Three fundamental statistical concepts that help capture these relationships are covariance, correlation, and partial correlation.

Covariance

Covariance measures how two variables move together. For two time series \(X_t\) and \(Y_t\), the sample covariance over \(n\) periods is given by:

\[ \text{Cov}(X, Y) = \frac{1}{n-1} \sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y}) \]

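where \(\bar{X}\) and \(\bar{Y}\) are the sample means. A positive covariance indicates that when \(X_t\) is above its mean, \(Y_t\) tends to be above its mean as well; a negative covariance indicates that the two series tend to sit on opposite sides of their means. Because covariance carries the units of both variables, its magnitude is hard to interpret on its own.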
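As a quick check of the formula, the sketch below computes the sample covariance by hand and compares it with R's built-in cov(). The vectors x and y are made-up toy values for illustration only; they are not taken from any dataset used in this activity.

```r
# Toy series (illustrative values only, not from global_economy)
x <- c(2.1, 2.5, 3.0, 3.4, 4.2)
y <- c(1.0, 1.4, 1.3, 2.0, 2.6)

n <- length(x)

# Sample covariance written out term by term from the formula
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Built-in equivalent (also uses the n - 1 denominator)
cov_builtin <- cov(x, y)

# TRUE when the two values agree up to floating-point tolerance
all.equal(cov_manual, cov_builtin)
```

Both series rise together here, so the covariance is positive.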

Correlation

Correlation standardizes covariance, providing a dimensionless measure between -1 and 1:

\[ \rho_{XY} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \]

where \(\sigma_X\) and \(\sigma_Y\) are the sample standard deviations of \(X_t\) and \(Y_t\). A correlation of 1 means a perfect positive linear relationship, -1 means a perfect negative linear relationship, and 0 means no linear relationship (a nonlinear relationship may still exist).
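The sketch below illustrates the standardization using the same kind of made-up toy vectors (x and y are assumptions for illustration, not dataset values). It also shows that rescaling a variable changes the covariance but leaves the correlation unchanged.

```r
# Toy series (illustrative values only)
x <- c(2.1, 2.5, 3.0, 3.4, 4.2)
y <- c(1.0, 1.4, 1.3, 2.0, 2.6)

# Correlation as covariance divided by the product of standard deviations
rho_manual <- cov(x, y) / (sd(x) * sd(y))
rho_builtin <- cor(x, y)
all.equal(rho_manual, rho_builtin)

# Changing the units of x (multiplying by 100) scales the covariance...
cov_scaled <- cov(100 * x, y)
# ...but the correlation is unaffected
rho_scaled <- cor(100 * x, y)
all.equal(rho_scaled, rho_builtin)
```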

Partial Correlation

Partial correlation measures the relationship between two variables while controlling for the effect of one or more additional variables. For three variables \(X\), \(Y\), and \(Z\), the partial correlation between \(X\) and \(Y\) controlling for \(Z\) is defined as:

\[ \rho_{XY \cdot Z} = \frac{\rho_{XY} - \rho_{XZ}\rho_{YZ}}{\sqrt{(1-\rho_{XZ}^2)(1-\rho_{YZ}^2)}} \]

This formula removes the linear influence of \(Z\) from both \(X\) and \(Y\), so the result reflects the association between \(X\) and \(Y\) that is not explained by \(Z\).
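One way to see what "controlling for \(Z\)" means: the formula gives the same number as the ordinary correlation between the residuals of \(X\) regressed on \(Z\) and the residuals of \(Y\) regressed on \(Z\). The sketch below checks this on simulated data; the series, coefficients, and seed are arbitrary assumptions chosen only for illustration.

```r
set.seed(123)  # arbitrary seed so the simulation is reproducible

# Simulated data: x and y both depend on a common driver z
n <- 200
z <- rnorm(n)
x <- 0.8 * z + rnorm(n)
y <- 0.6 * z + rnorm(n)

# Pairwise correlations needed by the formula
r_xy <- cor(x, y)
r_xz <- cor(x, z)
r_yz <- cor(y, z)

# Partial correlation from the formula
pcor_formula <- (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

# Partial correlation as the correlation of regression residuals
res_x <- resid(lm(x ~ z))
res_y <- resid(lm(y ~ z))
pcor_resid <- cor(res_x, res_y)

# TRUE when both routes agree up to floating-point tolerance
all.equal(pcor_formula, pcor_resid)
```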

Applied Example Using fpp3 Data

Consider the global_economy dataset from the fpp3 package, which contains economic indicators for various countries over time. We can explore relationships such as:

  • Covariance and correlation between Growth and CPI for different countries.
  • Partial correlation to understand the relationship between Imports and Exports while controlling for GDP.

A sample of rows from global_economy:

Country               Code  Year          GDP     Growth        CPI   Imports   Exports  Population
Afghanistan           AFG   1988           NA         NA         NA        NA        NA    11540888
Palau                 PLW   1979           NA         NA         NA        NA        NA       12124
Greece                GRC   1981  52346507380  -1.553721   6.517618  25.75087  21.38946     9729350
Albania               ALB   1960           NA         NA         NA        NA        NA     1608800
Hong Kong SAR, China  HKG   1964   2206466461   8.627709         NA  80.58546  72.92227     3504600
New Caledonia         NCL   1969    263108835  15.716313         NA        NA        NA      104000
Uganda                UGA   1972   1491596639         NA         NA  16.77934  19.38967     9988380
Niger                 NER   2003   2731416346   5.300000  81.907782  24.78620  15.19787    12656870
Iceland               ISL   1975   1386032921         NA         NA  41.62935  33.28791      217979
Lithuania             LTU   1981           NA         NA         NA        NA        NA     3432947

# Load fpp3, which attaches global_economy and the dplyr verbs used below
library(fpp3)

# Filter data for Australia and select relevant variables
aus_data <- global_economy %>%
  # Keep only the rows where the country is Australia
  filter(Country == "Australia") %>%
  # Keep the variables of interest
  dplyr::select(Year, GDP, Growth, Population, CPI, Imports, Exports)
# Calculate the covariance between Growth and CPI for Australia,
# using only the years in which both values are observed
cov_growth_cpi <- cov(aus_data$Growth, aus_data$CPI, use = "complete.obs")
cov_growth_cpi
[1] -17.36472
# Calculate the Pearson correlation between Growth and CPI for Australia,
# using only the years in which both values are observed
cor_growth_cpi <- cor(aus_data$Growth, aus_data$CPI, use = "complete.obs")
cor_growth_cpi
[1] -0.2779003
# Compute the correlation matrix for GDP, Growth, Population, CPI, Imports, and Exports
# (Year also appears in the output because a tsibble keeps its index column when selecting)
cor_matrix <- aus_data %>%
  # Select the six economic variables from the Australian data
  dplyr::select(GDP, Growth, Population, CPI, Imports, Exports) %>%
  # Correlation matrix using only rows with no missing values
  cor(use = "complete.obs")
cor_matrix
                  GDP     Growth Population        CPI    Imports    Exports
GDP         1.0000000 -0.2494090  0.9073355  0.9003771  0.7831384  0.8128970
Growth     -0.2494090  1.0000000 -0.2950084 -0.2779003 -0.1753012 -0.2251961
Population  0.9073355 -0.2950084  1.0000000  0.9899235  0.9054921  0.9114850
CPI         0.9003771 -0.2779003  0.9899235  1.0000000  0.9313180  0.9275880
Imports     0.7831384 -0.1753012  0.9054921  0.9313180  1.0000000  0.9238266
Exports     0.8128970 -0.2251961  0.9114850  0.9275880  0.9238266  1.0000000
Year        0.8801466 -0.2863494  0.9966214  0.9906993  0.9224200  0.9195759
                 Year
GDP         0.8801466
Growth     -0.2863494
Population  0.9966214
CPI         0.9906993
Imports     0.9224200
Exports     0.9195759
Year        1.0000000
Next, we compute the partial correlation between Imports and Exports while controlling for GDP.
# Partial correlation between Imports and Exports controlling for GDP, computed manually
# Extract the relevant correlation coefficients from the matrix
rho_Imports_Exports <- cor_matrix["Imports", "Exports"]  # direct correlation between Imports and Exports
rho_Imports_GDP <- cor_matrix["Imports", "GDP"]          # correlation between Imports and GDP
rho_Exports_GDP <- cor_matrix["Exports", "GDP"]          # correlation between Exports and GDP
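# Compute partial correlation between Imports and Exports controlling for GDP
# Numerator: rho_XY - rho_XZ * rho_YZ, with X = Imports, Y = Exports, Z = GDP
pcorr_Imp_Exp_GDP <- (rho_Imports_Exports - rho_Imports_GDP * rho_Exports_GDP) /
  sqrt((1 - rho_Imports_GDP^2) * (1 - rho_Exports_GDP^2))
pcorr_Imp_Exp_GDP  # Display partial correlation result
With the coefficients in the correlation matrix above, this evaluates to approximately 0.793, lower than the raw Imports-Exports correlation of about 0.924.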
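
For repeated use, the formula can be wrapped in a small function. The sketch below is one possible way to do this; partial_cor is a hypothetical helper name introduced here for illustration and is not part of fpp3 or base R. The three inputs are the Imports, Exports, and GDP coefficients copied from the correlation matrix printed above.

```r
# Hypothetical helper (not from fpp3 or base R): first-order partial
# correlation of X and Y controlling for Z, from pairwise correlations
partial_cor <- function(r_xy, r_xz, r_yz) {
  (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
}

# Coefficients copied from the printed correlation matrix
# X = Imports, Y = Exports, Z = GDP
pc <- partial_cor(r_xy = 0.9238266, r_xz = 0.7831384, r_yz = 0.8128970)
round(pc, 3)
```

Passing the coefficients as named arguments makes it harder to swap the roles of the three correlations, which is an easy mistake when writing the formula out inline.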

Lab Activity: Recreate the above analysis using the class activity template and answer the following question:

  1. Apply the partial correlation formula to compute the relationship between GDP and Exports, controlling for Population.