
Hyperparameter Optimization (Engineering View)

Role

Optuna searches LightGBM hyperparameters (and, optionally, the feature set) for the detection, multiclass, and regression pipelines. The objective is a validation metric, or a cross-validated metric with folds split by case, so configurations are selected for performance on held-out cases rather than for fit to the training data.


Why It Matters for Engineering

  • Generalization: Tuning against a validation set split by case keeps rows from the same case out of both training and validation, which reduces the risk of selecting an overfit configuration (see Overfitting analysis and Reducing overfitting).
  • Consistency: Tuning uses the same data loading, schema, and split logic as training, so the optimized parameters can be used directly in the training scripts.
  • Optional feature selection: When “pre–feature selection” is enabled, Optuna runs after a feature-selection run, so the search operates on the reduced feature schema.
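To make the by-case split concrete, the following sketch compares a grouped split with a plain row-level split on toy data. The `case_ids` array and the `shared_cases` helper are hypothetical names used only for this illustration; scikit-learn's `GroupKFold` stands in for whatever split logic the pipelines actually use.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, KFold

# Toy data (assumption): 6 hypothetical cases with 4 rows each.
case_ids = np.repeat(np.arange(6), 4)
X = np.zeros((len(case_ids), 1))


def shared_cases(splitter, **split_kwargs):
    """For each fold, return the cases present in both train and validation."""
    overlaps = []
    for train_idx, valid_idx in splitter.split(X, **split_kwargs):
        overlaps.append(set(case_ids[train_idx]) & set(case_ids[valid_idx]))
    return overlaps


# Split by case: every case lands entirely in train or entirely in validation.
by_case = shared_cases(GroupKFold(n_splits=3), groups=case_ids)

# Split by row: rows of the same case can end up on both sides of a fold.
by_row = shared_cases(KFold(n_splits=3, shuffle=True, random_state=0))
```

With the grouped split every entry of `by_case` is an empty set, whereas the row-level split typically leaves cases shared between train and validation, which can make the validation metric look better than performance on genuinely new cases.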

Configuration and Pipelines

  • Config: The same section as the corresponding training pipeline, plus n_trials, cv_folds, timeout, enable_feature_selection_before_tuning, and constraints (e.g. min_recall).
  • Script: run_optimize_lgbm_hyperparameters.py.
  • Detailed reference: See Hyperparameter optimization in the main Pipelines section.
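The tuning options listed above could be modelled roughly as follows. The key names come from this page, but the `TuningConfig` class, the default values, the unit of `timeout`, and the way `min_recall` is enforced are assumptions made for illustration; the Pipelines docs remain the reference for the actual behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TuningConfig:
    """Hypothetical container for the tuning keys; defaults are placeholders."""

    n_trials: int = 50
    cv_folds: int = 5
    timeout: Optional[float] = None  # assumed to be seconds; None = no limit
    enable_feature_selection_before_tuning: bool = False
    min_recall: Optional[float] = None  # assumed constraint on validation recall

    def __post_init__(self):
        if self.n_trials < 1:
            raise ValueError("n_trials must be at least 1")
        if self.cv_folds < 2:
            raise ValueError("cv_folds must be at least 2")
        if self.min_recall is not None and not 0.0 <= self.min_recall <= 1.0:
            raise ValueError("min_recall must be between 0 and 1")


def constrained_score(metric: float, recall: float, config: TuningConfig) -> float:
    """Return the metric, or -inf when the recall constraint is violated.

    This is one possible way to apply a constraint when maximizing a score;
    it is not necessarily how the optimization script handles min_recall.
    """
    if config.min_recall is not None and recall < config.min_recall:
        return float("-inf")
    return metric


config = TuningConfig(n_trials=20, timeout=600.0, min_recall=0.9)
feasible = constrained_score(metric=0.80, recall=0.95, config=config)
infeasible = constrained_score(metric=0.85, recall=0.70, config=config)
```

Here `feasible` keeps its metric value while `infeasible` is mapped to negative infinity, so a search that maximizes this score would not prefer a configuration that misses the recall floor, even if its raw metric is higher.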

This Engineering page gives the rationale; the full how-to and options are in the Pipelines docs.