
Windows Pipeline (Engineering Deep Dive)

Role

Builds fixed-size time windows from Parquet time series (typically TPL/GENKEY output). Each window becomes one Parquet file (or one row in a downstream aggregate). Already-processed inputs are skipped.
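The skip behavior can be illustrated with a minimal sketch. It assumes, purely for illustration, that the windows for a source file `case.parquet` are written as `case_w<index>.parquet` in the output directory; the function name `pending_sources` and that naming scheme are hypothetical and not taken from the actual extractor.

```python
from pathlib import Path
from typing import List


def pending_sources(input_dir: Path, output_dir: Path) -> List[Path]:
    """Return source Parquet files that have no window files yet.

    Assumption (hypothetical naming): windows for `case.parquet`
    are stored as `case_w*.parquet` inside `output_dir`.
    """
    pending = []
    for src in sorted(input_dir.glob("*.parquet")):
        # any() stops at the first matching window file
        has_windows = any(output_dir.glob(f"{src.stem}_w*.parquet"))
        if not has_windows:
            pending.append(src)
    return pending
```

Checking for outputs on disk rather than keeping a separate state file means a re-run after an interruption picks up only the inputs that produced no windows.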


Engineering Notes

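  • Window and step: window_size (rows per window) and step_size (offset between consecutive windows) together set temporal resolution and overlap. A step_size smaller than window_size produces overlapping windows.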
  • Per-case idempotency: The extractor checks existing outputs and skips source files that already have corresponding windows, so re-runs process only new inputs.
  • Downstream: Windows feed the Features pipeline and thus all training and evaluation.
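How window_size and step_size interact can be sketched as follows. The helper `window_bounds` is hypothetical. It assumes step_size is the row offset between consecutive window starts and that a trailing partial window is dropped; the actual extractor may handle either differently.

```python
from typing import List, Tuple


def window_bounds(n_rows: int, window_size: int, step_size: int) -> List[Tuple[int, int]]:
    """Return (start, stop) row ranges for full windows only.

    Assumptions: step_size is the offset between window starts,
    and a trailing partial window is discarded.
    """
    if window_size <= 0 or step_size <= 0:
        raise ValueError("window_size and step_size must be positive")
    last_start = n_rows - window_size  # last start that still fits a full window
    return [(s, s + window_size) for s in range(0, last_start + 1, step_size)]


# step_size < window_size: consecutive windows overlap by 2 rows
print(window_bounds(10, 4, 2))  # [(0, 4), (2, 6), (4, 8), (6, 10)]
# step_size == window_size: no overlap, rows 8-9 are dropped
print(window_bounds(10, 4, 4))  # [(0, 4), (4, 8)]
```

Each (start, stop) pair could then be used to slice the source table, for example with `df.iloc[start:stop]` on a pandas DataFrame, before writing the window out.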

Full Reference

For configuration keys, script name, and behavior, see Windows in the main Pipelines section.