Technology Stack
This page describes the core technologies used by the MLOps Platform.
Languages & Runtimes

| Technology | Version | Role |
| --- | --- | --- |
| Python | 3.10+ | Primary language for all pipelines, ETL, and ML code. |
| Async I/O | asyncio | Used in ETL pipelines, scheduler, and Prefect flows for non-blocking execution. |

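As a rough illustration of what non-blocking execution with asyncio looks like, the sketch below runs two simulated I/O-bound loads concurrently. The function names `load_source` and `run_etl` and the source names are hypothetical and are not taken from the platform's code.

```python
import asyncio


async def load_source(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound step (file read, API call)."""
    # While this coroutine waits, the event loop is free to run other tasks.
    await asyncio.sleep(delay)
    return f"{name}: loaded"


async def run_etl(sources: dict[str, float]) -> list[str]:
    """Load all sources concurrently; results follow the order of `sources`."""
    tasks = [load_source(name, delay) for name, delay in sources.items()]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(run_etl({"sensors": 0.02, "events": 0.01})))
```

Because `asyncio.gather` preserves argument order, the result list lines up with the input sources even when a later source finishes first.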
Data & Storage

| Technology | Role |
| --- | --- |
| Parquet | Primary columnar format for intermediate and training data (via PyArrow). |
| CSV | Export format for reports and compatibility; configurable encoding and separator. |
| Local filesystem | Default storage for source_folder, output_folder, and artifacts. |
| Amazon S3 | Optional storage via s3:// URIs or storage.type: s3 with bucket and prefix; requires s3fs. |

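One way to picture the choice between local paths and s3:// URIs is a small resolver that inspects the URI scheme. This is a sketch using only the standard library; `StorageLocation` and `resolve_location` are hypothetical names, and the platform's actual resolution logic may differ.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class StorageLocation:
    kind: str    # "s3" or "local"
    bucket: str  # empty for local paths
    path: str    # key prefix for S3, filesystem path otherwise


def resolve_location(uri: str) -> StorageLocation:
    """Classify a configured location as S3 or local (illustrative only)."""
    parsed = urlparse(uri)
    if parsed.scheme == "s3":
        # s3://bucket/prefix -> netloc is the bucket, path is the prefix
        return StorageLocation("s3", parsed.netloc, parsed.path.lstrip("/"))
    # Anything without an s3 scheme is treated as a local filesystem path.
    return StorageLocation("local", "", uri)
```

For example, `resolve_location("s3://my-bucket/data/raw")` yields an S3 location with bucket `my-bucket` and prefix `data/raw`, while a plain path such as `data/output` is kept as a local path.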
ML & Numerical

| Technology | Role |
| --- | --- |
| LightGBM | Training for binary (detection), multiclass (size/location), and regression (leak flow). |
| scikit-learn | Splits, metrics, preprocessing (e.g. scaling), and utilities. |
| NumPy | Array operations and numerical foundations. |
| Pandas | DataFrames for ETL, features, and training data. |
| PyArrow | Parquet read/write and efficient in-memory representation. |
| Optuna | Hyperparameter optimization with configurable objectives and pruning. |
| PyWavelets | Wavelet transforms for the features pipeline. |
| SciPy | Scientific utilities used in feature extraction and signal processing. |

Orchestration & Scheduling

| Technology | Role |
| --- | --- |
| APScheduler | In-process scheduler for interval, cron, and daily runs (etl_scheduler.py). |
| Prefect 3 | Flow orchestration, deployments, and observability for production. |
| Docker | Optional: run Prefect server, worker, and PostgreSQL via docker-compose.prefect.yml. |

Configuration

| Technology | Role |
| --- | --- |
| YAML | Main format for pipelines_config.yml, ETL scheduler config, and Prefect. |
| JSON | Schemas, metadata, and some config overrides. |
| PyYAML | Loading and parsing of YAML configuration files. |

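To illustrate how a JSON override might be layered on top of a base configuration, here is a minimal deep-merge sketch. The `apply_overrides` helper and the `etl.separator` key are hypothetical; a plain dict stands in for the result of loading the YAML file, and the platform's real override mechanism may work differently.

```python
import json
from copy import deepcopy


def apply_overrides(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` merged in, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them wholesale.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Stand-in for a config loaded from YAML (keys are illustrative).
base_config = {"storage": {"type": "local", "prefix": "runs"}, "etl": {"separator": ","}}
override_json = '{"storage": {"type": "s3", "bucket": "example-bucket"}}'

effective = apply_overrides(base_config, json.loads(override_json))
```

Here `effective["storage"]` ends up with `type` set to `s3`, the new `bucket`, and the untouched `prefix`, while `base_config` itself is left unchanged.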
Development & Quality

| Technology | Role |
| --- | --- |
| pytest | Unit and integration tests. |
| Black | Code formatting. |
| isort | Import sorting. |
| flake8 / mypy | Linting and optional type checking. |
| pre-commit | Git hooks for format and lint. |

Optional Dependencies
- s3fs: S3 support; install with `pip install .[s3]`.
- Jupyter / JupyterLab: optional, for notebooks; install with `pip install .[jupyter]`.
- MkDocs + Material: documentation build; install with `pip install .[docs]`.
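
Code that depends on one of these extras can check for it up front and point the user at the matching install command. The following is only a sketch of that idea; `require_extra` is a hypothetical helper, not part of the platform.

```python
import importlib.util


def require_extra(module: str, extra: str) -> None:
    """Raise a helpful ImportError if an optional top-level module is missing."""
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module) is None:
        raise ImportError(
            f"{module} is not installed; install it with: pip install .[{extra}]"
        )
```

Calling `require_extra("s3fs", "s3")` before touching S3-backed storage would turn a late, obscure failure into an immediate message that names the extra to install.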