Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population |
---|---|---|---|---|---|---|---|---|
Afghanistan | AFG | 1988 | NA | NA | NA | NA | NA | 11540888 |
Palau | PLW | 1979 | NA | NA | NA | NA | NA | 12124 |
Greece | GRC | 1981 | 52346507380 | -1.553721 | 6.517618 | 25.75087 | 21.38946 | 9729350 |
Albania | ALB | 1960 | NA | NA | NA | NA | NA | 1608800 |
Hong Kong SAR, China | HKG | 1964 | 2206466461 | 8.627709 | NA | 80.58546 | 72.92227 | 3504600 |
New Caledonia | NCL | 1969 | 263108835 | 15.716313 | NA | NA | NA | 104000 |
Uganda | UGA | 1972 | 1491596639 | NA | NA | 16.77934 | 19.38967 | 9988380 |
Niger | NER | 2003 | 2731416346 | 5.300000 | 81.907782 | 24.78620 | 15.19787 | 12656870 |
Iceland | ISL | 1975 | 1386032921 | NA | NA | 41.62935 | 33.28791 | 217979 |
Lithuania | LTU | 1981 | NA | NA | NA | NA | NA | 3432947 |
Activity 2
Correlation, Covariance, and Partial Correlation in Time Series
In time series analysis, understanding relationships between different variables is crucial. Three fundamental statistical concepts that help capture these relationships are covariance, correlation, and partial correlation.
Covariance
Covariance measures how two variables move together. For two time series \(X_t\) and \(Y_t\), the sample covariance over \(n\) periods is given by:
\[ \text{Cov}(X, Y) = \frac{1}{n-1} \sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y}) \]
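where \(\bar{X}\) and \(\bar{Y}\) are the sample means. A positive covariance indicates that \(Y_t\) tends to be above its mean when \(X_t\) is above its mean, and below its mean when \(X_t\) is below; a negative covariance indicates that the two series tend to deviate from their means in opposite directions. The magnitude depends on the units of both series, so covariances are hard to compare across pairs of variables.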
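As a minimal sketch of the definition above, the sample covariance can be computed by hand and compared with R's built-in cov(). The vectors x and y are made-up toy values, not taken from global_economy.

```r
# Toy data (hypothetical values, for illustration only)
x <- c(2.1, 3.4, 1.9, 4.8, 3.7)
y <- c(1.0, 2.2, 0.7, 3.1, 2.9)

n <- length(x)

# Sample covariance from the definition:
# sum of cross-deviations from the means, divided by n - 1
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Built-in equivalent
cov_builtin <- cov(x, y)

# TRUE if the two agree up to floating-point tolerance
all.equal(cov_manual, cov_builtin)
```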
Correlation
Correlation standardizes covariance, providing a dimensionless measure between -1 and 1:
\[ \rho_{XY} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \]
where \(\sigma_X\) and \(\sigma_Y\) are the sample standard deviations of \(X_t\) and \(Y_t\). A correlation of 1 indicates a perfect positive linear relationship, -1 a perfect negative linear relationship, and 0 no linear relationship (a nonlinear relationship may still be present).
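The standardization can be illustrated with a short sketch, again on made-up toy vectors. It also shows why correlation is easier to compare than covariance: rescaling one variable changes the covariance but leaves the correlation unchanged.

```r
# Toy data (hypothetical values, for illustration only)
x <- c(2.1, 3.4, 1.9, 4.8, 3.7)
y <- c(1.0, 2.2, 0.7, 3.1, 2.9)

# Correlation as covariance divided by the product of standard deviations
rho_manual <- cov(x, y) / (sd(x) * sd(y))
rho_builtin <- cor(x, y)
all.equal(rho_manual, rho_builtin)

# Changing the units of x (multiplying by 100) scales the covariance ...
cov_scaled <- cov(100 * x, y)
all.equal(cov_scaled, 100 * cov(x, y))

# ... but leaves the correlation as it was
rho_scaled <- cor(100 * x, y)
all.equal(rho_scaled, rho_builtin)
```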
Partial Correlation
Partial correlation measures the relationship between two variables while controlling for the effect of one or more additional variables. For three variables \(X\), \(Y\), and \(Z\), the partial correlation between \(X\) and \(Y\) controlling for \(Z\) is defined as:
\[ \rho_{XY \cdot Z} = \frac{\rho_{XY} - \rho_{XZ}\rho_{YZ}}{\sqrt{(1-\rho_{XZ}^2)(1-\rho_{YZ}^2)}} \]
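This formula removes the linear influence of \(Z\) on both \(X\) and \(Y\), so the result reflects the association between them that \(Z\) does not account for. Equivalently, it is the correlation between the residuals from regressing \(X\) on \(Z\) and the residuals from regressing \(Y\) on \(Z\).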
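The following sketch applies the three-variable formula to simulated data and compares it with the correlation of regression residuals, which gives the same number. The function name partial_cor, the seed, and the simulated series are assumptions made for illustration; z is constructed to drive both x and y, so the raw correlation is strong while the partial correlation is close to zero.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated data (hypothetical): z drives both x and y
n <- 200
z <- rnorm(n)
x <- 2 * z + rnorm(n)
y <- -1 * z + rnorm(n)

# Hypothetical helper: partial correlation of x and y given z,
# built from the three pairwise correlations
partial_cor <- function(x, y, z) {
  r_xy <- cor(x, y)
  r_xz <- cor(x, z)
  r_yz <- cor(y, z)
  (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
}

# Equivalent view: correlate what is left of x and y
# after regressing each of them on z
res_x <- resid(lm(x ~ z))
res_y <- resid(lm(y ~ z))

c(raw = cor(x, y),
  formula = partial_cor(x, y, z),
  residuals = cor(res_x, res_y))
```

The raw correlation here is induced entirely by z, which is why controlling for z shrinks it toward zero.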
Applied Example Using fpp3
Data
Consider the global_economy dataset from the fpp3 package, which contains economic indicators for various countries over time. We can explore relationships such as:
- Covariance and correlation between Growth and CPI for different countries.
- Partial correlation to understand the relationship between Imports and Exports while controlling for GDP.
# Load fpp3, which makes global_economy, the pipe, and the dplyr verbs available
library(fpp3)

# Filter data for Australia and select relevant variables
aus_data <- global_economy %>%
  filter(Country == "Australia") %>%
  dplyr::select(Year, GDP, Growth, Population, CPI, Imports, Exports)
- Filter the global_economy dataset to include only rows where the country is Australia.
- Select the variables of interest: Year, GDP, Growth, Population, CPI, Imports, and Exports.
# Calculate covariance between Growth and CPI for Australia
cov_growth_cpi <- cov(aus_data$Growth, aus_data$CPI, use = "complete.obs")
cov_growth_cpi
- Compute the covariance between Growth and CPI for the Australian subset, using only the years in which both values are observed.
[1] -17.36472
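The use = "complete.obs" argument matters because several series in global_economy contain NA values. A small sketch of what it does, using made-up toy vectors rather than the Australian data:

```r
# Toy vectors with missing values (hypothetical, for illustration only)
growth <- c(2.5, NA, 3.1, 1.8, 4.0, 2.2)
cpi    <- c(10.2, 11.0, NA, 12.5, 13.1, 14.0)

# With the default handling, any NA makes the result NA
cov_default <- cov(growth, cpi)
is.na(cov_default)

# use = "complete.obs" keeps only positions where both values are present
keep <- complete.cases(growth, cpi)
cov_complete <- cov(growth, cpi, use = "complete.obs")
cov_by_hand  <- cov(growth[keep], cpi[keep])

sum(keep)                             # number of observations actually used
all.equal(cov_complete, cov_by_hand)
```

When cor() or cov() is given a whole data frame with use = "complete.obs", a row is dropped if any of its columns is missing, so every entry of the resulting matrix is based on the same set of rows.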
# Calculate correlation between Growth and CPI for Australia
cor_growth_cpi <- cor(aus_data$Growth, aus_data$CPI, use = "complete.obs")
cor_growth_cpi
- Compute the Pearson correlation coefficient between Growth and CPI.
[1] -0.2779003
# Compute correlation matrix for GDP, Growth, Population, CPI, Imports, and Exports
cor_matrix <- aus_data %>%
  dplyr::select(GDP, Growth, Population, CPI, Imports, Exports) %>%
  cor(use = "complete.obs")
cor_matrix
- Select columns for GDP, Growth, Population, CPI, Imports, and Exports from the Australian data. Because aus_data is a tsibble, its index column Year is kept as well, which is why Year appears in the output.
- Calculate the correlation matrix for these variables using complete observations.
GDP Growth Population CPI Imports Exports
GDP 1.0000000 -0.2494090 0.9073355 0.9003771 0.7831384 0.8128970
Growth -0.2494090 1.0000000 -0.2950084 -0.2779003 -0.1753012 -0.2251961
Population 0.9073355 -0.2950084 1.0000000 0.9899235 0.9054921 0.9114850
CPI 0.9003771 -0.2779003 0.9899235 1.0000000 0.9313180 0.9275880
Imports 0.7831384 -0.1753012 0.9054921 0.9313180 1.0000000 0.9238266
Exports 0.8128970 -0.2251961 0.9114850 0.9275880 0.9238266 1.0000000
Year 0.8801466 -0.2863494 0.9966214 0.9906993 0.9224200 0.9195759
Year
GDP 0.8801466
Growth -0.2863494
Population 0.9966214
CPI 0.9906993
Imports 0.9224200
Exports 0.9195759
Year 1.0000000
- Partial correlation to understand the relationship between Imports and Exports while controlling for GDP.
# Partial correlation between Imports and Exports controlling for GDP manually
# Extract relevant correlation coefficients from the matrix
rho_Imports_Exports <- cor_matrix["Imports", "Exports"]
rho_Imports_GDP <- cor_matrix["Imports", "GDP"]
rho_Exports_GDP <- cor_matrix["Exports", "GDP"]
- Extracts the direct correlation between Imports and Exports.
- Extracts the correlation between Imports and GDP.
- Extracts the correlation between Exports and GDP.
# Compute partial correlation between Imports and Exports controlling for GDP
pcorr_Imp_Exp_GDP <- (rho_Imports_Exports - rho_Imports_GDP * rho_Exports_GDP) /
  sqrt((1 - rho_Imports_GDP^2) * (1 - rho_Exports_GDP^2))
# Display partial correlation result
pcorr_Imp_Exp_GDP
With the coefficients in the correlation matrix above, this evaluates to approximately 0.793.
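As a cross-check on the manual calculation, the sketch below re-enters the three pairwise correlations from the printed matrix and applies the partial correlation formula through a small helper. The helper name partial_from_matrix is hypothetical. The second route, through the inverse of the correlation matrix, is a standard identity and should agree with the formula when exactly one variable is being controlled for.

```r
# Pairwise correlations copied from the correlation matrix printed above
vars <- c("GDP", "Imports", "Exports")
R <- matrix(
  c(1.0000000, 0.7831384, 0.8128970,
    0.7831384, 1.0000000, 0.9238266,
    0.8128970, 0.9238266, 1.0000000),
  nrow = 3, byrow = TRUE, dimnames = list(vars, vars)
)

# Hypothetical helper: partial correlation of x and y given z,
# read from a named correlation matrix
partial_from_matrix <- function(R, x, y, z) {
  (R[x, y] - R[x, z] * R[y, z]) /
    sqrt((1 - R[x, z]^2) * (1 - R[y, z]^2))
}

pc_formula <- partial_from_matrix(R, "Imports", "Exports", "GDP")

# Cross-check via the inverse correlation matrix P:
# partial correlation given all other variables is -P[x, y] / sqrt(P[x, x] * P[y, y])
P <- solve(R)
dimnames(P) <- dimnames(R)  # make sure the names carry over
pc_inverse <- -P["Imports", "Exports"] /
  sqrt(P["Imports", "Imports"] * P["Exports", "Exports"])

all.equal(pc_formula, pc_inverse)
```

Passing the variable names as arguments makes it harder to mix up which pairwise correlation goes where in the numerator.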
Lab Activity: Recreate the above analysis using the class activity template and answer the following question:
- Apply the partial correlation formula to compute the relationship between GDP and Exports, controlling for Population.