@monthly cron job is not reliable

Time: 2022-09-15 16:34:15

Our customer wants us to create a report every month.

In the past, we used a @monthly cron job for this task.

But this is not reliable:

  1. The server could be down at that minute. Cron does not re-run missed jobs.
  2. If the server is up, the database could be unreachable at that moment.
  3. If the server and the database are both up, a third-party system could be unreachable.
  4. There could be a software bug.

What can I do to make sure that the report gets created every month?

It is a Django-based web application.

5 answers

#1


1  

You should write some script that will test conditions and perform all required operations.

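def create_monthly_report():
    # Nothing to do if a report was already finished less than a month ago.
    if is_work_finished_less_then_month_ago():
        return
    try:
        generate_normal_report()
    except some_error as e:
        # Tell someone about the failure instead of failing silently.
        report_about_error(e)


create_monthly_report()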

Then run it every hour or day.

If you are afraid of sending too many error reports, do the same thing in the report_about_error() method: check when you last sent an error report, and do not send another one if that was too recent.
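
Both checks can share one small state file that stores timestamps. The following is only a sketch under assumptions: the JSON state file, the 28-day and 24-hour thresholds, and the helper names `is_recent`, `mark_now` and `run` are made up for illustration and are not part of the answer above.

```python
import json
import time
from pathlib import Path

# Assumed thresholds, chosen only for illustration.
REPORT_INTERVAL = 28 * 24 * 3600   # treat "less than a month ago" as 28 days
ERROR_MAIL_INTERVAL = 24 * 3600    # send at most one error report per day


def _load(state_file: Path) -> dict:
    """Return the saved timestamps, or an empty dict if none exist yet."""
    if state_file.exists():
        return json.loads(state_file.read_text())
    return {}


def is_recent(state_file: Path, key: str, max_age: float, now: float) -> bool:
    """True if the event `key` was recorded less than `max_age` seconds ago."""
    last = _load(state_file).get(key)
    return last is not None and now - last < max_age


def mark_now(state_file: Path, key: str, now: float) -> None:
    """Record that the event `key` happened at time `now`."""
    state = _load(state_file)
    state[key] = now
    state_file.write_text(json.dumps(state))


def run(state_file: Path, generate, notify, now=None) -> str:
    """Generate the report unless it is recent; throttle error notifications."""
    now = time.time() if now is None else now
    if is_recent(state_file, "report_done", REPORT_INTERVAL, now):
        return "skipped"
    try:
        generate()
    except Exception as exc:  # narrow this to the errors you actually expect
        if not is_recent(state_file, "error_sent", ERROR_MAIL_INTERVAL, now):
            notify(exc)
            mark_now(state_file, "error_sent", now)
        return "failed"
    mark_now(state_file, "report_done", now)
    return "done"
```

Here `generate` and `notify` stand in for generate_normal_report and report_about_error. Running `run(...)` hourly from cron then retries until the report succeeds, while sending at most one error notification per day.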

#2


6  

Use a decent scheduler

Celery beat is a scheduler; it kicks off tasks at regular intervals, which are then executed by available worker nodes in the cluster.

You create a periodic task that runs the report function. If the job fails, Celery will retry it according to the retry policy that you have set.

Celery doc - Periodic Tasks

#3


1  

There are many moving pieces here, and consequently many options to consider. But problem 1 means you need some external way to track success (otherwise one option would have been for your server task, say a bash script, to keep retrying N times, sleeping between retries, until the report generation succeeds).
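
For reference, the in-process "retry N times and sleep in between" option could look like the sketch below. The function name `retry` and its parameters are assumptions made for illustration. As said above, this covers problems 2 and 3 but not problem 1, because the loop dies together with the server.

```python
import time


def retry(task, attempts=5, delay=60.0, sleep=time.sleep):
    """Call `task` until it succeeds, at most `attempts` times.

    Sleeps `delay` seconds between attempts and re-raises the last error
    if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
```

The `sleep` parameter exists only so that the waiting can be replaced in tests.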

If you want a full-blown solution that you can reuse for many future needs, look into third-party schedulers such as Jenkins or SOS Berlin.

If you are looking for a simpler solution, you can schedule the report script via cron to run many times (say every hour for a few days at the end of the month), then have it keep track of whether the report was generated and sent successfully (this could be as simple as creating a file and checking for its existence, or writing a value to a database).
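
The file-based variant could be sketched as follows. The marker file naming scheme and the function names `marker_path` and `run_once_per_month` are assumptions for illustration. The marker is written only after the report succeeds, so a failed run is simply retried by the next cron invocation.

```python
import datetime
from pathlib import Path


def marker_path(marker_dir: Path, day: datetime.date) -> Path:
    """Marker file for the month containing `day`, e.g. report-2022-09.done."""
    return marker_dir / f"report-{day:%Y-%m}.done"


def run_once_per_month(marker_dir: Path, generate, today=None) -> bool:
    """Run `generate` unless this month's marker exists. Returns True if it ran."""
    today = today or datetime.date.today()
    marker = marker_path(marker_dir, today)
    if marker.exists():
        return False   # an earlier cron run already succeeded
    generate()         # an exception here leaves no marker behind
    marker.touch()     # written only after success
    return True
```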

#4


1  

The easy way is to write a helper script that first checks whether the report is available and generates it only if it is not.

Then schedule the script to run every hour on that day of the month.

Change your script as follows:

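#!/bin/bash
# Skip this run if the report file already exists.
if [[ -f "<report_name>" ]]
then
    echo "report exists"
    exit 0  # not an error: an earlier run already produced the report
else
    echo "run generate report script"
    # call the report generation command here
fi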

Crontab entry (to run every hour on the 28th of every month):

0 0-23 28 * * <name_of_helper_script>

#5


1  

cron is not able to do what you're asking for. And kicking off cron-like tasks via your Django app is not what Django was created for, and will only work if you handle all the edge cases and make sure to pick up missed runs if your app is malfunctioning, etc. It will be a rabbit-hole of error handling, state management and concurrency considerations.

I would suggest one of two options:

  • Kick off the job with cron anyway, but write a Nagios check to ensure that you're notified if the report wasn't generated for some reason (and handle those rare cases manually).
  • Use Airflow or any other framework that was created specifically for the purpose of handling scheduling, monitoring, retrying and logging of scheduled tasks.

The former is what you want if this is really a one-off job. The latter is better if you have more tasks like this coming your way.
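
For the first option, a Nagios check is just a program that prints one status line and exits with 0 for OK or 2 for CRITICAL. A rough sketch, assuming the report is written to a file named after the month: the naming scheme, the `due_day` parameter and the function names are hypothetical and need adapting to your setup.

```python
import datetime
from pathlib import Path

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def check_report(report_dir: Path, today: datetime.date, due_day: int = 2):
    """Return (exit_code, message) for this month's report."""
    name = f"report-{today:%Y-%m}.pdf"  # hypothetical naming scheme
    if today.day < due_day:
        return OK, f"OK - {name} is not due yet"
    path = report_dir / name
    if path.is_file() and path.stat().st_size > 0:
        return OK, f"OK - {name} exists"
    return CRITICAL, f"CRITICAL - {name} is missing or empty"


def main(argv) -> int:
    """Print a one-line status and return the exit code Nagios expects."""
    code, message = check_report(Path(argv[1]), datetime.date.today())
    print(message)
    return code
```

In a real plugin script you would finish with `sys.exit(main(sys.argv))` and pass the report directory as the first argument.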
