Temporal Context for LightGBM

Recommendation: Add Temporal Features (Do Not Change Model Architecture)

Using LightGBM (tabular, window-based) is a valid and efficient choice. You do not need to switch to LSTM or Transformers if you enrich each window with temporal context via extra features.


Strategy: Sliding-Window Context Features

Idea: Instead of feeding the model a raw sequence, we give it summary features that describe “what happened in this window and how it relates to the previous ones.”

  • Before (isolated window): Each row = features of one window only → the model does not know what happened before.
  • After (window + context): Each row = features of the current window plus features from the previous window(s) and deltas/trends between windows.

So the model still sees a single row per window (tabular), but that row carries temporal information: the level in the previous window, the change from previous to current, and a simple trend (e.g. “increasing”, “stable”, “decreasing”).


What to Add

  1. Previous-window features
    For each key quantity (e.g. PT mean, GT mean, entropy), add the same quantity computed on the previous window: e.g. PT_mean_prev, GT_mean_prev.

  2. Delta features
    • PT_mean_delta = PT_mean_current - PT_mean_prev
    • PT_mean_delta_pct = (current - prev) / prev
    • PT_std_ratio = current_std / prev_std (captures increase in variability, typical in leaks).

  3. Trend features (optional)
    If you have the last 3 windows:
    • Simple trend: “increasing”, “stable”, “decreasing” (e.g. from linear slope or comparison of means).
    • stability_duration: number of consecutive windows where the signal stayed “stable” (e.g. within a band).

This way the model can learn rules like: “if pressure dropped and variability increased compared to the previous window, treat as leak.”
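
The previous-window and delta features can be sketched with pandas as follows. This is a minimal illustration, not the pipeline's actual code: the function name add_context_features, the columns case_id and window_idx, and the sample values are assumptions made for the example. Only the PT_mean_prev, PT_mean_delta, PT_mean_delta_pct and PT_std_ratio names come from the list above.

```python
import numpy as np
import pandas as pd


def add_context_features(windows: pd.DataFrame) -> pd.DataFrame:
    """Add previous-window, delta, and ratio columns; still one row per window.

    Assumes hypothetical columns: case_id, window_idx, PT_mean, PT_std.
    """
    # Order windows inside each case so "previous" really means the prior window.
    out = windows.sort_values(["case_id", "window_idx"]).reset_index(drop=True)

    # shift(1) within each case: the first window of a case gets NaN, so
    # context never leaks from one case into the next.
    out["PT_mean_prev"] = out.groupby("case_id")["PT_mean"].shift(1)
    out["PT_std_prev"] = out.groupby("case_id")["PT_std"].shift(1)

    out["PT_mean_delta"] = out["PT_mean"] - out["PT_mean_prev"]

    # Replace a zero denominator with NaN so the ratios never become inf.
    out["PT_mean_delta_pct"] = out["PT_mean_delta"] / out["PT_mean_prev"].replace(0, np.nan)
    out["PT_std_ratio"] = out["PT_std"] / out["PT_std_prev"].replace(0, np.nan)
    return out


# Made-up sample: case A shows a pressure drop with doubled variability.
windows = pd.DataFrame({
    "case_id":    ["A", "A", "A", "B", "B"],
    "window_idx": [0, 1, 2, 0, 1],
    "PT_mean":    [10.0, 10.0, 8.0, 5.0, 5.5],
    "PT_std":     [0.5, 0.5, 1.0, 0.2, 0.2],
})
enriched = add_context_features(windows)
print(enriched)
```

The first window of each case has no predecessor, so its context columns are NaN. LightGBM treats NaN as a missing value by default, so these rows can usually be kept rather than dropped, although whether that suits this pipeline is a judgment call.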


Implementation Outline

  • Extractor: When building batches, for each window load the current window and the previous window(s) (same source/case, ordered by time or index).
  • Transformer: For each batch, compute current-window features as today, then add:
    • Features from the previous window(s) (e.g. the same stats with a _prev suffix, as in PT_mean_prev).
    • Deltas and ratios (current vs prev).
    • Optional trend/stability from the last 2–3 windows.
  • Output: One row per window, but with more columns (current + prev + deltas + trend). LightGBM then trains on this enriched table.

The training loop and the model type stay the same; only the feature set changes.
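
The optional trend/stability step of the Transformer could look like the sketch below. It is one possible reading, not the platform's formula: the helper names slope, trend_label and add_trend_features, the band threshold, and the default of 3 windows are all assumptions for illustration. Here "stable" means the least-squares slope over the recent window means stays within ±band, and stability_duration counts consecutive stable windows.

```python
def slope(values):
    """Least-squares slope of values against positions 0, 1, ..., n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den  # callers pass n >= 2, so den > 0


def trend_label(values, band):
    """Map the slope to a label; band is a hypothetical tolerance."""
    s = slope(values)
    if s > band:
        return "increasing"
    if s < -band:
        return "decreasing"
    return "stable"


def add_trend_features(means, n_windows=3, band=0.1):
    """means: per-window means for ONE case, already in time order."""
    rows, streak = [], 0
    for i, current in enumerate(means):
        # Use up to n_windows most recent windows, including the current one.
        history = means[max(0, i - n_windows + 1): i + 1]
        # A single window has no trend; leave the label empty.
        label = trend_label(history, band) if len(history) >= 2 else None
        # Count consecutive stable windows; any other label resets the count.
        streak = streak + 1 if label == "stable" else 0
        rows.append({"PT_mean": current, "trend": label, "stability_duration": streak})
    return rows


rows = add_trend_features([10.0, 10.02, 9.98, 9.0, 8.0])
for row in rows:
    print(row)
```

With these made-up values the signal is stable for two windows, then turns decreasing, and stability_duration resets to 0. Before training, the string label would need to be encoded as an integer or a categorical column.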


Why This Fits the Platform

  • Consistent with operational context: Temporal context (previous level, deltas, trend) helps separate “slow operational drift” from “sudden leak onset.”
  • Efficient: No sequence model; same fast training and deployment as today.
  • Explainable: Features have clear meaning (previous mean, change, ratio, trend).
  • Configurable: Number of previous windows and which stats to carry can be controlled in the pipeline config and feature code.

This document captures the temporal-context philosophy; the exact parameter names and formulas live in the feature-extraction code and the pipeline configuration.