Hyperparameter Optimization (Engineering View)
Role
Optuna searches LightGBM hyperparameters (and, optionally, the feature set) for the detection, multiclass, and regression pipelines. The objective is a validation metric, or its cross-validated value over cases, so the selected configuration is chosen for its performance on held-out data rather than on the training set.
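As a rough illustration of this role, the sketch below shows one way an Optuna objective can wrap a LightGBM model and return a validation metric. It is not the code in the project's script: the function names `suggest_lgbm_params` and `run_study`, the parameter ranges, the binary-classification setting, and the choice of ROC AUC are all assumptions made for the example. Optuna, LightGBM, and scikit-learn are imported inside `run_study` so the search-space function can be inspected without them.

```python
from typing import Any, Dict, Optional


def suggest_lgbm_params(trial: Any) -> Dict[str, Any]:
    """Sample one LightGBM configuration (ranges are illustrative assumptions)."""
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 8, 256, log=True),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
        "reg_lambda": trial.suggest_float("reg_lambda", 1e-3, 10.0, log=True),
    }


def run_study(
    X_train,
    y_train,
    X_val,
    y_val,
    n_trials: int = 50,
    timeout: Optional[float] = None,
) -> Dict[str, Any]:
    """Maximize validation ROC AUC for a binary classifier (hypothetical setup)."""
    import lightgbm as lgb
    import optuna
    from sklearn.metrics import roc_auc_score

    def objective(trial: "optuna.Trial") -> float:
        # Fit on the training split only; score on the held-out validation split.
        model = lgb.LGBMClassifier(**suggest_lgbm_params(trial), verbose=-1)
        model.fit(X_train, y_train)
        scores = model.predict_proba(X_val)[:, 1]
        return roc_auc_score(y_val, scores)

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials, timeout=timeout)
    return study.best_params
```

Calling `run_study(X_train, y_train, X_val, y_val)` returns the best sampled parameters, which could then be passed to a training run that uses the same splits.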
Why It Matters for Engineering
- Generalization: Tuning against validation data that is split by case reduces the risk of overfitting, because samples from one case never appear on both sides of the split (see Overfitting analysis and Reducing overfitting).
- Consistency: Optimization reuses the same data loading, schema, and split logic as training, so the optimized parameters can be used directly in the training scripts.
- Optional feature selection: When “pre–feature selection” is enabled, Optuna runs after a feature-selection run, so the search covers a reduced schema.
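The by-case split mentioned in the first point can be sketched in a few lines of standard-library Python. This is only an illustration of the idea, not the project's split logic: the function name `by_case_folds` and the round-robin assignment of cases to folds are assumptions. scikit-learn's `GroupKFold` provides a comparable grouped split.

```python
from collections import defaultdict
from typing import Hashable, Iterator, List, Sequence, Tuple


def by_case_folds(
    case_ids: Sequence[Hashable], n_folds: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs in which no case spans both sides."""
    rows_by_case = defaultdict(list)
    for row, case in enumerate(case_ids):
        rows_by_case[case].append(row)

    cases = sorted(rows_by_case, key=str)  # deterministic case order
    if n_folds < 2 or n_folds > len(cases):
        raise ValueError("n_folds must be between 2 and the number of cases")

    for fold in range(n_folds):
        val_cases = set(cases[fold::n_folds])  # round-robin assignment of cases
        train_idx: List[int] = []
        val_idx: List[int] = []
        for case in cases:
            target = val_idx if case in val_cases else train_idx
            target.extend(rows_by_case[case])
        yield sorted(train_idx), sorted(val_idx)


# Seven rows belonging to four cases, split into two folds.
case_ids = ["a", "a", "b", "c", "c", "c", "d"]
folds = list(by_case_folds(case_ids, n_folds=2))
```

Because whole cases are assigned to a fold, a model cannot score well on validation simply by recognising rows from a case it already saw during training.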
Configuration and Pipelines
- Config: Same section as the corresponding training pipeline, plus n_trials, cv_folds, timeout, enable_feature_selection_before_tuning, and constraints (e.g. min_recall).
- Script: run_optimize_lgbm_hyperparameters.py.
- Detailed reference: See Hyperparameter optimization in the main Pipelines section.
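To make the options above more concrete, here is a hypothetical sketch of how they might be represented and how a constraint such as min_recall could enter the objective. The field names come from the config keys listed above; the `TuningConfig` class, its default values, the assumption that timeout is in seconds, and the penalty scheme in `constrained_score` are illustrative assumptions rather than the script's actual behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TuningConfig:
    """Tuning options; defaults are placeholders, not the project's values."""

    n_trials: int = 100
    cv_folds: int = 5
    timeout: Optional[float] = None  # assumed to be seconds; None means no limit
    enable_feature_selection_before_tuning: bool = False
    min_recall: Optional[float] = None  # None disables the constraint


def constrained_score(
    metric: float, recall: float, min_recall: Optional[float]
) -> float:
    """Return the metric, or a penalty when the recall constraint is violated.

    Assumes the metric lies in [0, 1] and is maximized. A violating trial gets
    recall - 1.0, which is negative and therefore below any feasible trial,
    while still rewarding trials that come closer to the recall threshold.
    """
    if min_recall is not None and recall < min_recall:
        return recall - 1.0
    return metric


config = TuningConfig(n_trials=50, min_recall=0.7)
```

An objective function could return `constrained_score(metric, recall, config.min_recall)` so that configurations below the recall threshold are never selected as best.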
This Engineering page gives the rationale; the full how-to and options are in the Pipelines docs.