Activity 39

# Time Series Essentials, install if needed!
library(feasts)       # Feature extraction & decomposition
library(fable)        # Forecasting models (ARIMA, ETS, etc.)
library(fpp3)         # Tidy time series datasets
library(astsa)        # Applied statistical TS methods from textbook
library(tseries)      # Unit root tests & TS diagnostics
library(tsibbledata)  # Curated TS datasets
library(quantmod)     # Financial data retrieval
library(tidyquant)    # Financial analysis in tidyverse
library(purrr)        # Functional programming for TS pipelines
library(readr)        # Efficient data import
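library(lubridate)    # Date-time parsing & arithmetic
library(zoo)          # Ordered observations & rolling functions
library(hms)          # Time-of-day values
library(stringr)      # String manipulation
library(janitor)      # Cleaning column names & messy data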

Part 1: Parsing Inconsistent Character Dates

You are given a character column of timestamps written in inconsistent formats. Your task is to use parse_date_time() to convert it into a proper POSIXct column. Consider formats such as "ymd HMS", "mdy HMS", and "B d, Y HM".

Key Question: Which patterns did lubridate handle automatically? Which formats needed manual specification?
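As a starting point, here is a minimal sketch of the parsing step. The vector `raw` is hypothetical (it is not the activity's data), and the month-name format assumes an English locale.

```r
library(lubridate)

# Hypothetical sample: one string per format named in the task
raw <- c(
  "2023-01-15 14:30:00",   # ymd HMS
  "01/16/2023 09:05:10",   # mdy HMS
  "March 3, 2023 18:45"    # B d, Y HM (full month name)
)

# orders lists candidate layouts; separators such as "-", "/" and ","
# do not need to be spelled out in the order strings
parsed <- parse_date_time(
  raw,
  orders = c("ymd HMS", "mdy HMS", "B d Y HM"),
  tz = "UTC"
)

class(parsed)        # check that the result is POSIXct
raw[is.na(parsed)]   # any strings that failed to parse would show up here
```

Inspecting `raw[is.na(parsed)]` is one way to approach the key question: whatever is left unparsed is a candidate for manual specification.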

Part 2: Constructing a Date from Components

Suppose you are given columns: year = 2022, month = 11, day = 20.

Task: Combine them into a single date column using make_date() or ymd().

Why is this useful? Many real-world datasets (especially Kaggle CSVs) separate date parts due to scraping.
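The sketch below shows both approaches side by side. Only the first row (2022-11-20) comes from the text above; the other rows and the tibble name `parts` are made up for illustration.

```r
library(tibble)
library(dplyr)
library(lubridate)

# Hypothetical tibble with separate date components
parts <- tibble(
  year  = c(2022, 2023, 2024),
  month = c(11, 1, 2),
  day   = c(20, 5, 29)
)

parts <- parts |>
  mutate(
    # make_date() builds a Date directly from numeric components
    date_a = make_date(year, month, day),
    # ymd() parses a string, so the components are pasted together first
    date_b = ymd(paste(year, month, day, sep = "-"))
  )

parts
```

make_date() avoids the round trip through a string, which is usually the simpler choice when the components are already numeric.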

Part 3: Real data

Download the Air Quality dataset (Air Quality Data Set) from the UCI repository to your working directory for this section. First, we clean the data by parsing the dates and times and combining them into a proper datetime column, then convert the text-based numbers to numeric values. Next, we aggregate the data into daily averages.
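The three steps can be sketched on a small inline stand-in for the file. The layout used here is an assumption about the downloaded CSV (semicolon-separated, decimal commas, Date as dd/mm/yyyy, Time as hh.mm.ss, -200 marking missing values); the four rows, the two columns kept, and the names `air_raw`, `air`, and `air_daily` are illustrative only. With the real file, replace `sample_csv` with the path to your download.

```r
library(readr)
library(dplyr)
library(stringr)
library(lubridate)

# Inline stand-in for the downloaded file (assumed layout, made-up subset)
sample_csv <- I("Date;Time;CO(GT);T
10/03/2004;18.00.00;2,6;13,6
10/03/2004;19.00.00;2,0;13,3
11/03/2004;00.00.00;1,2;11,3
11/03/2004;01.00.00;-200;10,7
")

# Read everything as text so the cleaning steps are explicit
air_raw <- read_delim(sample_csv, delim = ";",
                      col_types = cols(.default = col_character()))

comma <- locale(decimal_mark = ",")

air <- air_raw |>
  rename(co = "CO(GT)", temp = "T") |>
  mutate(
    # Step 1: turn "18.00.00" into "18:00:00", then parse date + time together
    datetime = dmy_hms(paste(Date, str_replace_all(Time, fixed("."), ":")),
                       tz = "UTC"),
    # Step 2: text with decimal commas -> numeric; -200 -> NA
    across(c(co, temp), ~ na_if(parse_number(.x, locale = comma), -200))
  )

# Step 3: daily averages, ignoring missing readings
air_daily <- air |>
  group_by(date = as_date(datetime)) |>
  summarise(across(c(co, temp), ~ mean(.x, na.rm = TRUE)), .groups = "drop")

air_daily
```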

Problem Set 1: Parsing Heterogeneous Timestamps

Data Task:

  1. Code: Use parse_date_time() with orders to handle all formats. Which required explicit format codes?
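When a layout contains literal text that the order-based guessing may not cope with, one option is to pass a strptime-style format and set exact = TRUE, which switches off the guessing. The strings in `odd` below are hypothetical and only illustrate the mechanism.

```r
library(lubridate)

# Hypothetical timestamps with literal text ("at", "h") inside them
odd <- c("12.02.2024 at 07h30", "13.02.2024 at 19h05")

# With exact = TRUE, orders is read as an explicit format string,
# so every literal character has to appear in the format
parsed_odd <- parse_date_time(
  odd,
  orders = "%d.%m.%Y at %Hh%M",
  exact = TRUE,
  tz = "UTC"
)

parsed_odd
```

For the problem set, a reasonable workflow is to try plain orders first and fall back to explicit format codes only for the strings that come back as NA.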

Problem Set 2: Component → Temporal Index

Data Task:

  1. Code: Use make_date() to create full dates. Handle text months with match(mth, month.name).
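A short sketch of the month lookup, using a hypothetical tibble `events`. Only the column name `mth` is taken from the task; `yr`, `dy`, and the values are assumptions.

```r
library(tibble)
library(dplyr)
library(lubridate)

# Hypothetical data with the month stored as text
events <- tibble(
  yr  = c(2021, 2022, 2023),
  mth = c("March", "November", "July"),
  dy  = c(14, 20, 4)
)

events <- events |>
  mutate(
    # month.name is the built-in vector "January", ..., "December";
    # match() returns the position of each month, i.e. its number
    month_num = match(mth, month.name),
    date      = make_date(yr, month_num, dy)
  )

events
```

match() is case-sensitive and returns NA for anything not in month.name, so abbreviated months would need month.abb instead.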

Problem Set 3: Gap Imputation

Data Task:

  1. Code: Use fill_gaps(.full = TRUE), then fill(value, .direction = "down"). Why use .full?
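The sketch below uses a hypothetical two-sensor daily series (`readings`, `sensor`, and the values are made up) to show what .full changes. It assumes the tsibble and tidyr packages, both attached by fpp3.

```r
library(tibble)
library(dplyr)
library(tidyr)
library(tsibble)

# Hypothetical data: sensor B starts later and ends earlier than sensor A
readings <- tibble(
  sensor = c("A", "A", "A", "B", "B"),
  date   = as.Date(c("2024-01-01", "2024-01-02", "2024-01-05",
                     "2024-01-02", "2024-01-04")),
  value  = c(10, 11, 14, 20, 22)
) |>
  as_tsibble(key = sensor, index = date)

filled <- readings |>
  # .full = TRUE pads every key to the span of the whole data set,
  # not just to each key's own first and last observation
  fill_gaps(.full = TRUE) |>
  # group by key so values are never carried from one sensor into another
  group_by_key() |>
  fill(value, .direction = "down") |>
  ungroup()

filled
```

With .full = TRUE each sensor gets a row for every day from 2024-01-01 to 2024-01-05. Sensor B's first day stays NA because a downward fill has no earlier value to carry forward, which is worth noticing when answering the question about .full.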

Tsibble Transition