Windows Pipeline (Engineering Deep Dive)
Role
Builds fixed-size time windows from Parquet time series (typically TPL/GENKEY output). Each window becomes one Parquet file (or one row in a downstream aggregate). Already-processed inputs are skipped.
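The skipping of already-processed inputs could be sketched roughly as follows. This is a minimal illustration, not the extractor's actual code: the function name `pending_sources`, the directory layout, and the `<stem>_w*.parquet` output naming are all assumptions made for the example.

```python
from pathlib import Path
from typing import List


def pending_sources(input_dir: Path, output_dir: Path) -> List[Path]:
    """Return source Parquet files that have no window outputs yet.

    Assumption (hypothetical naming): windows built from `case.parquet`
    are written as `case_w0000.parquet`, `case_w0001.parquet`, ...
    """
    pending = []
    for src in sorted(input_dir.glob("*.parquet")):
        # A source counts as processed if at least one window file exists for it.
        has_windows = any(output_dir.glob(f"{src.stem}_w*.parquet"))
        if not has_windows:
            pending.append(src)
    return pending
```

A check like this only detects that some output exists; it would not notice a run that was interrupted halfway through a source file.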
Engineering Notes
- Window and step: window_size (rows per window) and step_size control temporal resolution and overlap.
- By-case idempotency: The extractor checks existing outputs and skips source files that already have corresponding windows.
- Downstream: Windows feed the Features pipeline and thus all training and evaluation.
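To make the interaction of window_size and step_size concrete, here is a small sketch of how row ranges for windows might be derived. The helper `window_bounds` is hypothetical, and dropping a trailing partial window is an assumption; the real pipeline may handle the tail differently.

```python
from typing import Iterator, Tuple


def window_bounds(
    n_rows: int, window_size: int, step_size: int
) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges for full windows only.

    Assumption: a trailing partial window is dropped.
    """
    if window_size <= 0 or step_size <= 0:
        raise ValueError("window_size and step_size must be positive")
    for start in range(0, n_rows - window_size + 1, step_size):
        yield start, start + window_size


# step_size < window_size gives overlapping windows.
bounds = list(window_bounds(n_rows=10, window_size=4, step_size=2))
print(bounds)  # [(0, 4), (2, 6), (4, 8), (6, 10)]
```

With step_size equal to window_size the windows tile the series without overlap; a smaller step_size raises temporal resolution at the cost of more, partially redundant, windows.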
Full Reference
For configuration keys, script name, and behavior, see Windows in the main Pipelines section.