Extend the NYSE example by including lagged volume values.
Visualize relationships between the current volume, its lagged values, and Fourier terms.
Step-by-Step Tutorial
1. Data Preparation & Lag Creation
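# Packages used in this tutorial
library(tidyverse)
library(tidyquant)
library(tsibble)
library(fable)
library(feasts)

# Get NYSE Composite index data (volume) from tidyquant and convert to a tsibble
nyse <- tq_get("^NYA", from = "2021-01-01", to = Sys.Date(), get = "stock.prices") %>%
  as_tsibble(index = date) %>%
  fill_gaps() %>%  # add non-trading days as explicit NA rows
  mutate(
    volume      = zoo::na.approx(volume),    # linear interpolation over the gaps
    adjusted    = zoo::na.approx(adjusted),
    logVolume   = log(volume),
    volume_lag1 = dplyr::lag(volume, 1),     # lagged predictors used in step 2
    volume_lag2 = dplyr::lag(volume, 2)
  ) %>%
  select(date, volume, logVolume, adjusted, volume_lag1, volume_lag2)

# Plot volume series
nyse %>%
  autoplot(volume) +
  labs(title = "NYSE Composite Index Volume", y = "Volume", x = "Date")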
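The model in step 2 uses predictors named volume_lag1 and volume_lag2. As a minimal sketch of what lagging does, the toy example below shifts a short made-up series by one and two positions. The helper `lag_by` is hypothetical and written in base R for illustration only; it is meant to mirror the shifting that `dplyr::lag(x, k)` performs, and the volume values are invented.

```r
# Hypothetical helper (not part of the tutorial): shift a vector back by k
# positions and pad the start with NA, as dplyr::lag(x, k) does.
lag_by <- function(x, k = 1) {
  stopifnot(k >= 0, k < length(x))
  c(rep(NA, k), head(x, length(x) - k))
}

volume <- c(100, 120, 90, 150, 130)  # made-up values for illustration

toy <- data.frame(
  volume      = volume,
  volume_lag1 = lag_by(volume, 1),  # yesterday's volume
  volume_lag2 = lag_by(volume, 2)   # volume two days ago
)
print(toy)
```

The first k rows of each lagged column are NA because no earlier observation exists, so a regression on these columns effectively starts at the first complete row.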
2. Regression with Lagged Variables & Fourier Terms
# Fit a model including Fourier terms and lagged volume predictors
nyse_lag_model <- nyse %>%
  model(
    LagModel = TSLM(volume ~ fourier(K = 2) + volume_lag1 + volume_lag2)
  )

# Print the coefficient table, then a one-row summary of fit statistics
report(nyse_lag_model)
glance(nyse_lag_model) %>%
  knitr::kable()
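To make the Fourier part of the formula concrete, the sketch below builds K = 2 sine/cosine pairs by hand in base R. The helper `fourier_terms` and the column names C1, S1, C2, S2 are hypothetical, and the period m = 7 (a weekly cycle in daily data) is an assumption made for illustration; the columns fable generates internally may be named and ordered differently.

```r
# Hypothetical helper: build K sine/cosine pairs for a seasonal period m.
# Harmonic k contributes cos(2*pi*k*t/m) and sin(2*pi*k*t/m).
fourier_terms <- function(t, K, m) {
  stopifnot(K >= 1, K <= m / 2)
  cols <- list()
  for (k in seq_len(K)) {
    cols[[paste0("C", k)]] <- cos(2 * pi * k * t / m)
    cols[[paste0("S", k)]] <- sin(2 * pi * k * t / m)
  }
  as.data.frame(cols)
}

t  <- 1:14                             # two assumed weekly cycles
ft <- fourier_terms(t, K = 2, m = 7)   # m = 7 is an assumption
print(round(ft, 3))
```

Each additional harmonic adds two regressors, so K = 2 contributes four columns to the design matrix, and every column repeats after m observations.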
# Residual diagnostics
nyse_lag_model %>%
  gg_tsresiduals() +
  labs(title = "Residual Diagnostics: NYSE Lag Model")
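A numerical check can complement the visual residual diagnostics. The sketch below applies a Ljung-Box test from base R to a simulated autocorrelated series that stands in for model residuals. The simulated data, the AR coefficient, and the choice of 10 lags are all assumptions for illustration, not results from the NYSE model.

```r
set.seed(1)

# Simulated stand-in for residuals that still contain autocorrelation
resid_sim <- as.numeric(arima.sim(model = list(ar = 0.6), n = 300))

# Ljung-Box test: the null hypothesis is no autocorrelation up to lag 10
lb <- Box.test(resid_sim, lag = 10, type = "Ljung-Box")
print(lb)
```

A small p-value suggests that the residuals are not white noise, which would point toward adding further lags or other predictors.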
3. Multivariate EDA: ggpairs Plot
library(GGally)

# Prepare a multivariate dataset for ggpairs visualization.
# Convert to a tibble before select(): a tsibble always keeps its index (date).
nyse_multi <- nyse %>%
  mutate(trend = row_number()) %>%
  as_tibble() %>%
  select(trend, volume, volume_lag1, volume_lag2)

ggpairs(
  nyse_multi,
  lower = list(continuous = wrap("smooth", alpha = 0.3)),
  title = "Multivariate Relationships with Trend Component"
)
Lab Activity
Task: Experiment with additional lags (e.g., lag 3 or lag 4) and higher-order Fourier harmonics (a larger K) in your model. Use ggpairs to visualize how these lagged variables relate to the current volume, and discuss potential multicollinearity issues.
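One way to start the multicollinearity discussion is to quantify how strongly the lagged predictors overlap. The sketch below uses a simulated persistent series in place of the NYSE volume (an assumption, so the numbers are illustrative only), builds lags 1 to 4 with base R, and computes a variance inflation factor (VIF) for each lag by regressing it on the others.

```r
set.seed(42)

# Simulated persistent series standing in for volume (assumption)
y <- as.numeric(arima.sim(model = list(ar = 0.9), n = 500))

# embed() puts y_t in column 1 and y_{t-1}, ..., y_{t-4} in columns 2 to 5
lagged <- as.data.frame(embed(y, 5))
names(lagged) <- c("volume", paste0("volume_lag", 1:4))

predictors <- lagged[, -1]
print(round(cor(predictors), 2))  # pairwise correlations among the lags

# VIF for each predictor: 1 / (1 - R^2) from regressing it on the others
vif <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  fit <- lm(reformulate(others, response = v), data = predictors)
  1 / (1 - summary(fit)$r.squared)
})
print(round(vif, 2))
```

Adjacent lags of a persistent series tend to be highly correlated, so their coefficients can become unstable even when the overall fit looks good. A common rule of thumb treats VIF values above roughly 5 to 10 as a warning sign.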