Airflow: retry until a specified time

Problem description:

I need to create an Airflow job that absolutely must run before 9h.

I currently have a job that starts at 7h with retries=8 and a 15-minute retry interval (8 * 15m = 2h). Unfortunately, the job itself takes additional time to run, so the task ends up failing after 9h, which is the hard deadline.

How can I make it retry every 15 minutes, but fail once it is past 9h so that a human can take a look at the issue?

Thanks for your help.

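You could use the execution_timeout argument when creating the task to control how long it runs before timing out. So if your task starts at 7AM and you want it to end by 9AM, set the timeout to 2 hours. Keep in mind that execution_timeout applies to each attempt separately, so with retries enabled it does not by itself cap the total time across all attempts. Below is an example from the Airflow documentation: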

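from datetime import timedelta
from airflow.operators.bash import BashOperator  # Airflow 1.x: airflow.operators.bash_operator

# dag, aggregate_db_message_job_cmd and wait_for_empty_queue are defined elsewhere in the DAG file.
aggregate_db_message_job = BashOperator(
    task_id='aggregate_db_message_job',
    execution_timeout=timedelta(hours=2),  # each attempt is killed after 2 hours
    pool='ep_data_pipeline_db_msg_agg',
    bash_command=aggregate_db_message_job_cmd,
    dag=dag)
# Run this task only after wait_for_empty_queue has finished.
aggregate_db_message_job.set_upstream(wait_for_empty_queue)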
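
Since the question is about a wall-clock deadline rather than a fixed duration, here is a rough sketch of how the remaining time until 9h could be computed and checked at the start of each attempt. It is plain Python and only an illustration: DEADLINE, DeadlinePassed, time_left and check_deadline are made-up names, not part of Airflow, and it assumes naive local timestamps with the deadline falling on the same day as the run.

```python
from datetime import datetime, time, timedelta

DEADLINE = time(9, 0)  # hard deadline from the question (9h), assumed local time


class DeadlinePassed(Exception):
    """Hypothetical marker exception; not part of Airflow."""


def time_left(now: datetime, deadline: time = DEADLINE) -> timedelta:
    """Return the time remaining today until the deadline (negative if already past)."""
    return datetime.combine(now.date(), deadline) - now


def check_deadline(now: datetime, deadline: time = DEADLINE) -> timedelta:
    """Raise if the deadline has passed, otherwise return the time still available."""
    left = time_left(now, deadline)
    if left <= timedelta(0):
        raise DeadlinePassed(f"deadline {deadline} already passed at {now}")
    return left


if __name__ == "__main__":
    # An attempt starting at 8:45 still has 15 minutes before the deadline.
    print(check_deadline(datetime(2024, 1, 1, 8, 45)))
```

One way to use this would be to call check_deadline(datetime.now()) at the top of the task's callable. In Airflow 2.x, raising airflow.exceptions.AirflowFailException at that point (instead of the hypothetical DeadlinePassed) fails the task without any further retries, so a human gets to look at it.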