Scheduler Configuration Reference
Overview
The ETL scheduler is configured via a YAML file that contains a pipeline section (same structure as in pipelines_config.yml) and a scheduler section for timing and logging.
Scheduler section
| Key | Description |
|---|---|
| mode | One of interval, cron, daily, or multiple_daily. |
| interval_seconds | For interval: run every N seconds. |
| cron_expression | For cron: e.g. `0 */6 * * *` (every 6 hours). |
| daily_time | For daily: single time of day (e.g. "02:00"). |
| daily_times | For multiple_daily: list of times (e.g. ["02:00", "14:00"]). |
| timezone | Timezone for cron/daily (e.g. UTC). |
| logging | level, file, format, max_size_mb, backup_count. |
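
As a rough illustration of how these keys might fit together, here is a sketch of a scheduler section in multiple_daily mode. The top-level key name `scheduler`, the nesting of the logging keys, and every value shown (log level, file path, format string, sizes) are assumptions made for the example, not settings confirmed by the project; check one of the shipped example files for the real layout.

```yaml
# Hypothetical sketch: key nesting and all values are illustrative assumptions.
scheduler:
  mode: multiple_daily                # one of: interval, cron, daily, multiple_daily
  # interval_seconds: 3600            # only read when mode is interval
  # cron_expression: "0 */6 * * *"    # only read when mode is cron (every 6 hours)
  # daily_time: "02:00"               # only read when mode is daily
  daily_times: ["02:00", "14:00"]     # only read when mode is multiple_daily
  timezone: UTC                       # applies to cron/daily scheduling
  logging:
    level: INFO                       # assumed level name
    file: logs/etl_scheduler.log      # assumed path
    format: "%(asctime)s %(levelname)s %(message)s"   # assumed Python-style format
    max_size_mb: 10
    backup_count: 5
```

Quoting the time and cron values is a deliberate choice: some YAML 1.1 parsers read an unquoted 14:00 as a base-60 integer, and a cron expression that begins with * would otherwise be parsed as a YAML alias.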
Pipeline section
Copy the same keys you use for that pipeline in pipelines_config.yml (e.g. for TPL/GENKEY: source_folder, output_folder, selected_columns, batch_size, max_workers, etc.). The scheduler passes this section to the pipeline runner.
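
For a TPL/GENKEY-style pipeline, the copied section might look like the sketch below. The top-level key name `pipeline`, the value types (for instance, selected_columns as a list), and all values are assumptions for illustration; the authoritative keys are whatever that pipeline already uses in pipelines_config.yml.

```yaml
# Hypothetical sketch: paths, column names, and sizes are placeholders.
pipeline:
  source_folder: data/input           # assumed path
  output_folder: data/output          # assumed path
  selected_columns: [id, name]        # assumed column names
  batch_size: 1000
  max_workers: 4
```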
Example configs
The repo often includes example files such as:
- etl_scheduler_config.yaml: default (e.g. interval).
- etl_scheduler_interval.yaml: interval mode.
- etl_scheduler_daily.yaml: daily at one time.
- etl_scheduler_cron.yaml: cron expression.
Use them as templates and point the scheduler at the chosen file with --config. For running and signals, see ETL scheduler.