Temporal Context for LightGBM

Recommendation: Add Temporal Features (Do Not Change Model Architecture)

LightGBM on tabular, window-based features is a valid and efficient choice. There is no need to switch to an LSTM or a Transformer, provided each window is enriched with temporal context through extra features.

Strategy: Sliding-Window Context Features

Idea: Instead of feeding the model a raw sequence, we give it summary features that describe “what happened in this window and how it relates to the previous ones.”

Before (isolated window): Each row = features of one window only → the model does not know what happened before.
After (window + context): Each row = features of the current window plus features from the previous window(s) and deltas/trends between windows.

So the model still sees a single row per window (tabular), but that row carries temporal information: level in the previous window, change from previous to current, and simple trend (e.g. “increasing”, “stable”, “decreasing”).

What to Add

Previous-window features
For each key quantity (e.g. PT mean, GT mean, entropy), add the same quantity computed on the previous window: e.g. PT_mean_prev, GT_mean_prev.
Delta features
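PT_mean_delta = PT_mean_current - PT_mean_prev (absolute change).
PT_mean_delta_pct = (PT_mean_current - PT_mean_prev) / PT_mean_prev (relative change; undefined when PT_mean_prev is 0, so guard the division or leave the value missing).
PT_std_ratio = PT_std_current / PT_std_prev (captures an increase in variability, typical of leaks; the same zero-denominator caveat applies).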
Trend features (optional)
If you keep the last 2–3 windows:
Simple trend: “increasing”, “stable”, “decreasing” (e.g. from linear slope or comparison of means).
stability_duration: number of consecutive windows where the signal stayed “stable” (e.g. within a band).
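
The previous-window and delta features above can be sketched with pandas as follows. This is a minimal illustration, not the pipeline's actual code: it assumes a table with one row per window, hypothetical columns `case_id` and `window_idx` for grouping and ordering, and current-window statistics stored as `PT_mean` and `PT_std`.

```python
import numpy as np
import pandas as pd


def add_prev_and_delta_features(df, case_col="case_id", order_col="window_idx"):
    """Add previous-window, delta, relative-change and ratio columns per case."""
    out = df.sort_values([case_col, order_col]).reset_index(drop=True)

    # Previous-window values. The first window of each case has no
    # predecessor, so its *_prev columns are NaN.
    out["PT_mean_prev"] = out.groupby(case_col)["PT_mean"].shift(1)
    out["PT_std_prev"] = out.groupby(case_col)["PT_std"].shift(1)

    # Absolute change from the previous window to the current one.
    out["PT_mean_delta"] = out["PT_mean"] - out["PT_mean_prev"]

    # Zero denominators are replaced by NaN so the result is NaN, not inf.
    out["PT_mean_delta_pct"] = out["PT_mean_delta"] / out["PT_mean_prev"].replace(0, np.nan)
    out["PT_std_ratio"] = out["PT_std"] / out["PT_std_prev"].replace(0, np.nan)
    return out


# Toy data (made up for illustration): two cases, ordered by window index.
windows = pd.DataFrame({
    "case_id": ["a", "a", "a", "b", "b"],
    "window_idx": [0, 1, 2, 0, 1],
    "PT_mean": [10.0, 10.0, 8.0, 5.0, 5.5],
    "PT_std": [0.5, 0.5, 1.0, 0.2, 0.2],
})
enriched = add_prev_and_delta_features(windows)
print(enriched)
```

Shifting within each case keeps the last window of one case from leaking into the first window of the next. The resulting NaN values can usually be left in place, since LightGBM treats missing values natively by default.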

This way the model can learn rules like: “if pressure dropped and variability increased compared to the previous window, treat as leak.”
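
The optional trend features listed above could be computed along these lines. The sketch assumes the trend comes from the slope of a least-squares line through the last few window means, and that "stable" means the mean moved by at most a fixed band relative to the previous window. The function names and the thresholds `tol` and `band` are hypothetical and would need tuning to the signal's units.

```python
import numpy as np


def trend_label(values, tol=0.05):
    """Label the last few window means by the slope of a least-squares line."""
    y = np.asarray(values, dtype=float)
    if len(y) < 2 or np.isnan(y).any():
        return "unknown"  # not enough history to fit a line
    slope = np.polyfit(np.arange(len(y)), y, deg=1)[0]
    if slope > tol:
        return "increasing"
    if slope < -tol:
        return "decreasing"
    return "stable"


def stability_duration(means, band=0.1):
    """For each window, count the consecutive preceding steps whose change stayed within `band`."""
    durations, run, prev = [], 0, None
    for m in means:
        if prev is not None and abs(m - prev) <= band:
            run += 1
        else:
            run = 0  # first window, or the signal left the band
        durations.append(run)
        prev = m
    return durations


# Toy sequence of window means for a single case (made up for illustration).
pt_means = [10.0, 10.02, 10.01, 9.0, 8.0]
early_trend = trend_label(pt_means[:3])
late_trend = trend_label(pt_means[-3:])
durations = stability_duration(pt_means)
print(early_trend, late_trend, durations)
```

Both functions should be applied per case, never across case boundaries. Before training, the string label needs a numeric or categorical encoding, for example -1/0/1 or a pandas categorical column.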

Implementation Outline

Extractor: When building batches, for each window load the current window and the previous window(s) (same source/case, ordered by time or index).
Transformer: For each batch, compute the current-window features as is done today, then add:
Features from the previous window(s) (e.g. the same stats, marked with a _prev suffix as in PT_mean_prev).
Deltas and ratios (current vs prev).
Optional trend/stability from the last 2–3 windows.
Output: One row per window, but with more columns (current + prev + deltas + trend). LightGBM then trains on this enriched table.

The training loop and the model type stay the same; only the feature set changes.
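
As a rough illustration of the Extractor and Transformer steps, the sketch below pairs each window with its predecessor from the same case and emits one enriched row per window. It is written under assumptions, not taken from the platform: windows are represented as dicts with hypothetical keys `case_id`, `window_idx` and `pt` (raw pressure samples), and the helpers `iter_with_previous`, `window_stats` and `build_row` are invented names.

```python
from collections import defaultdict
import math

import numpy as np


def iter_with_previous(windows):
    """Extractor step: yield (current, previous) pairs, ordered within each case."""
    by_case = defaultdict(list)
    for w in windows:
        by_case[w["case_id"]].append(w)
    for case_windows in by_case.values():
        ordered = sorted(case_windows, key=lambda w: w["window_idx"])
        previous = None  # the first window of a case has no predecessor
        for current in ordered:
            yield current, previous
            previous = current


def window_stats(samples, suffix=""):
    """Mean and population standard deviation of one window's samples."""
    arr = np.asarray(samples, dtype=float)
    return {
        f"PT_mean{suffix}": float(arr.mean()),
        f"PT_std{suffix}": float(arr.std()),
    }


def build_row(current, previous):
    """Transformer step: current stats, previous stats and a delta in one row."""
    row = {"case_id": current["case_id"], "window_idx": current["window_idx"]}
    row.update(window_stats(current["pt"]))
    if previous is None:
        row.update({"PT_mean_prev": math.nan, "PT_std_prev": math.nan})
    else:
        row.update(window_stats(previous["pt"], suffix="_prev"))
    row["PT_mean_delta"] = row["PT_mean"] - row["PT_mean_prev"]
    return row


# Toy input (made up), deliberately out of order to show the sorting.
raw_windows = [
    {"case_id": "a", "window_idx": 1, "pt": [8.0, 10.0]},
    {"case_id": "a", "window_idx": 0, "pt": [10.0, 10.0]},
]
rows = [build_row(cur, prev) for cur, prev in iter_with_previous(raw_windows)]
print(rows)
```

The output is still one flat row per window, so the rows can be collected into the same table the existing training code already consumes.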

Why This Fits the Platform

Consistent with operational context: Temporal context (previous level, deltas, trend) helps separate “slow operational drift” from “sudden leak onset.”
Efficient: No sequence model; same fast training and deployment as today.
Explainable: Features have clear meaning (previous mean, change, ratio, trend).
Configurable: Number of previous windows and which stats to carry can be controlled in the pipeline config and feature code.
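
One way the configurability could look in practice is a small config object that drives how many previous windows and which stats are carried. The class `TemporalContextConfig`, its field names, and the `_prev`, `_prev2` column naming are assumptions for illustration; the real parameter names live in the pipeline configuration.

```python
from dataclasses import dataclass
from typing import Tuple

import pandas as pd


@dataclass(frozen=True)
class TemporalContextConfig:
    """Hypothetical settings for the temporal-context features."""
    n_prev_windows: int = 1
    stats: Tuple[str, ...] = ("PT_mean", "PT_std")
    case_col: str = "case_id"
    order_col: str = "window_idx"


def add_context(df, cfg):
    """Add lagged copies of the configured stats, shifted within each case."""
    out = df.sort_values([cfg.case_col, cfg.order_col]).reset_index(drop=True)
    for stat in cfg.stats:
        for lag in range(1, cfg.n_prev_windows + 1):
            suffix = "_prev" if lag == 1 else f"_prev{lag}"
            out[f"{stat}{suffix}"] = out.groupby(cfg.case_col)[stat].shift(lag)
    return out


# Toy table (made up): carry only PT_mean, from the two previous windows.
table = pd.DataFrame({
    "case_id": ["a", "a", "a"],
    "window_idx": [0, 1, 2],
    "PT_mean": [1.0, 2.0, 3.0],
    "PT_std": [0.1, 0.1, 0.3],
})
cfg = TemporalContextConfig(n_prev_windows=2, stats=("PT_mean",))
context = add_context(table, cfg)
print(context.columns.tolist())
```

Keeping the lag count and stat list in one place makes it cheap to compare, say, one versus three previous windows without touching the training code.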

This document captures the temporal-context philosophy; the exact parameter names and formulas are defined in the feature-extraction code and the pipeline configuration.