Engineering

Philosophy, Design, and Operational Rigor

How we reason about leak detection, validation, and production pipelines


Purpose of This Section

The Engineering section explains the operational and engineering philosophy behind the MLOps platform: how we distinguish leaks from operational changes without extra sensors, how we use temporal and operational context in models, how we validate and avoid overfitting, and how we design ETL and scheduling for production.

It is written for engineers and scientists who want to understand the “why” and the design trade-offs, not only the “how” of running a script.


How This Section Is Organized

| Area | What you will find |
|---|---|
| Context & philosophy | Operational context: Distinguishing leaks from pump/valve changes using only PT/GT. Temporal context: Adding sliding-window and trend features so LightGBM sees “what happened before”. |
| Detection & features | Leak detection features: Why oscillation, correlation, and topology matter; which features separate leak vs operational transients. Early detection: Detecting leaks as early as possible in the transient. World-class analysis: End-to-end analysis and metrics. |
| Architecture | ETL architecture: SOLID principles, Extract–Transform–Load design, and pipeline types. Architecture diagram: Visual overview of components. |
| Validation & robustness | Model validation: Configurable validation scripts and thresholds. Train/validation split by case: Why we split by case_id and how we avoid leakage. Validation method and examples: Method 3 and worked examples. Overfitting analysis and Reducing overfitting: How we detect and mitigate overfitting. |
| Scheduling & operations | ETL scheduler: Modes, idempotency, and graceful shutdown. Scheduler configs: YAML options and examples. Prefect integration: Deploying flows and runs to production. |
| Pipelines (deep dive) | TPL/GENKEY pipeline, Windows pipeline, Features pipeline, Transformer pipeline, Training pipeline: Detailed behavior and configuration. |
| Reference | Feature extraction, Hyperparameter optimization, OBSERVER pipelines, ETL config, Test offline pipeline, Changelog: Reference and history. |
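
The temporal-context idea listed under Context & philosophy can be illustrated with a minimal sketch. It assumes each window is a plain dict with the hypothetical keys `case_id`, `window_idx`, and one numeric feature `pt_mean`; these names and the `add_temporal_context` helper are illustrative only and are not taken from the platform's actual schema or code. The sketch adds the value from earlier windows of the same case, plus the delta to the current window, so a tabular model can see "what happened before".

```python
import math
from collections import defaultdict


def add_temporal_context(rows, feature_names, n_prev=1):
    """Return copies of `rows` enriched with lagged values and deltas.

    Lags are computed within each case only, so the first window of a case
    never borrows values from a different case.
    """
    by_case = defaultdict(list)
    for row in rows:
        by_case[row["case_id"]].append(row)

    enriched = []
    for case_rows in by_case.values():
        # Order windows in time inside the case (hypothetical window_idx key).
        ordered = sorted(case_rows, key=lambda r: r["window_idx"])
        for i, row in enumerate(ordered):
            new_row = dict(row)
            for name in feature_names:
                for lag in range(1, n_prev + 1):
                    if i - lag >= 0:
                        prev = ordered[i - lag][name]
                        new_row[f"{name}_prev{lag}"] = prev
                        new_row[f"{name}_delta{lag}"] = row[name] - prev
                    else:
                        # NaN marks "no earlier window"; LightGBM treats NaN
                        # as a missing value by default.
                        new_row[f"{name}_prev{lag}"] = math.nan
                        new_row[f"{name}_delta{lag}"] = math.nan
            enriched.append(new_row)
    return enriched


# Toy data: three windows of case A, one window of case B.
rows = [
    {"case_id": "A", "window_idx": 0, "pt_mean": 50.0},
    {"case_id": "A", "window_idx": 1, "pt_mean": 48.5},
    {"case_id": "A", "window_idx": 2, "pt_mean": 46.0},
    {"case_id": "B", "window_idx": 0, "pt_mean": 60.0},
]
enriched = add_temporal_context(rows, ["pt_mean"], n_prev=1)
```

Each enriched row keeps its original fields and gains `pt_mean_prev1` and `pt_mean_delta1`. Raising `n_prev` adds further lags at the cost of more missing values at the start of every case.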

Core Ideas in One Page

  1. No extra sensors: We infer operational state and transient type from pressure (PT) and flow (GT) using oscillation, correlation, and baseline-deviation features.
  2. Temporal context in tabular form: We give LightGBM features from the current window plus previous windows and their deltas/trends, instead of switching to sequence models.
  3. Split by case, not by window: Train/validation splits are by case_id, so windows from the same physical case never appear in both sets; this avoids train/validation data leakage and gives a realistic performance estimate.
  4. Idempotency everywhere: Pipelines and scheduler runs are safe to re-run; we use config hashes and “already processed” checks so only new work runs.
  5. Production-ready scheduling: We use APScheduler for simple cron/interval runs and Prefect for observability, retries, and deployment.
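
Idea 3 can be made concrete with a small sketch. It assumes window-level rows carry a `case_id` field; the `split_by_case` helper, its parameters, and the toy data are hypothetical and only illustrate the principle, not the platform's validation scripts. The point is that the random choice is made over case IDs, and every window then follows its case.

```python
import random


def split_by_case(rows, val_fraction=0.25, seed=0):
    """Split window-level rows into train/validation sets by case_id.

    All windows of a case land on the same side of the split, so the
    validation score is not inflated by near-duplicate windows.
    """
    case_ids = sorted({row["case_id"] for row in rows})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(case_ids)

    # Hold out at least one case; this sketch assumes several cases exist.
    n_val = max(1, round(len(case_ids) * val_fraction))
    val_cases = set(case_ids[:n_val])

    train = [r for r in rows if r["case_id"] not in val_cases]
    val = [r for r in rows if r["case_id"] in val_cases]
    return train, val


# Toy data: four cases with two windows each.
rows = [
    {"case_id": case, "window_idx": idx}
    for case in ("A", "B", "C", "D")
    for idx in (0, 1)
]
train, val = split_by_case(rows, val_fraction=0.25, seed=0)

train_cases = {r["case_id"] for r in train}
val_cases = {r["case_id"] for r in val}
```

With four cases and `val_fraction=0.25`, one whole case (both of its windows) is held out and `train_cases` and `val_cases` share no IDs. A window-level random split would instead scatter windows of the same case across both sets.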

Use the menu on the left to go to each topic in detail.