
Date: 2021-11-28 15:31:22

I have an R analysis composed of three parts (partA, partB, and partC). I submit each part to SLURM (e.g. sbatch partA), and each part is parallelized via #SBATCH --array=1-1500. The parts are in serial, so I need to wait for one to finish before starting the next. Right now I'm manually starting each job, but that's not a great solution.


I would like to automate the three sbatch calls. For example:


  1. sbatch partA
  2. when partA is done, sbatch partB
  3. when partB is done, sbatch partC

I used this solution to get the job ID of partA, and pass that to strigger to accomplish step 2 above. However I'm stuck at that point, because I don't know how to get the job ID of partB from strigger. Here's what my code looks like:



# step 1: sbatch partA
partA_ID=$(sbatch --parsable partA.sh)

# step 2: sbatch partB
strigger --set --jobid=$partA_ID --fini --program=/path/to/partB.batch

# step 3: sbatch partC
... ?

How do I complete step 3?


1 Answer



strigger is not the proper tool for this; it is aimed more at administrators than at regular users. Only the slurm user can actually set triggers (see the "Important note" in the strigger manpage).


In your case, you should submit all three jobs at once, with dependencies set among them.


For instance:


$ partA_ID=$(sbatch --parsable partA.sh)
$ partB_ID=$(sbatch --parsable --dependency=afterany:${partA_ID} partB.sh)
$ partC_ID=$(sbatch --parsable --dependency=afterany:${partB_ID} partC.sh)

This will submit three job arrays, but the second one will start only when all jobs in the first one have finished (afterany does not care whether they succeeded or failed), and the third one will start only when all jobs in the second one have finished.
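If the chain grows beyond three parts, the same pattern can be wrapped in a small shell function. This is only a sketch: submit_chain is a made-up helper name, not part of SLURM, and it assumes sbatch is on the PATH and that every script can be submitted without extra arguments.

```shell
# Hypothetical helper (not a SLURM command): submit the given scripts in
# order, making each one depend on the previous one.
# Usage: submit_chain <dependency-type> <script>...
submit_chain() {
    local dep_type=$1; shift
    local prev_id="" script job_id
    for script in "$@"; do
        if [ -z "$prev_id" ]; then
            # First script: no dependency.
            job_id=$(sbatch --parsable "$script") || return 1
        else
            job_id=$(sbatch --parsable --dependency="${dep_type}:${prev_id}" "$script") || return 1
        fi
        # On multi-cluster setups --parsable prints "jobid;cluster"; keep only the ID.
        prev_id=${job_id%%;*}
        echo "$script -> $prev_id"
    done
}
```

Calling submit_chain afterany partA.sh partB.sh partC.sh should then issue the same three sbatch commands as the example, and print the job ID assigned to each script.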


An alternative is:


$ partA_ID=$(sbatch --parsable partA.sh)
$ partB_ID=$(sbatch --parsable --dependency=aftercorr:${partA_ID}  partB.sh)
$ partC_ID=$(sbatch --parsable --dependency=aftercorr:${partB_ID}  partC.sh)

This will submit three job arrays, but each job of the second one will start only once the corresponding job in the first one (i.e. the job that has the same $SLURM_ARRAY_TASK_ID) has completed successfully. Likewise, each job in the third one will start only once the corresponding job in the second one has completed successfully.
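aftercorr is mostly useful when task N of one part needs only the output of task N of the previous part. A rough sketch of what such a partB.sh might look like follows; the file names partB.R and partA_out_N.rds are invented for illustration, and the array range is assumed to match the one used in partA.sh.

```shell
#!/bin/bash
#SBATCH --array=1-1500

# Hypothetical layout: task N of partA wrote partA_out_N.rds,
# and task N of partB reads only that file.
partB_input() {
    echo "partA_out_${1}.rds"
}

# SLURM sets SLURM_ARRAY_TASK_ID for each array task; the guard keeps the
# script from doing any work when it is run outside of a job array.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    Rscript partB.R "$(partB_input "$SLURM_ARRAY_TASK_ID")"
fi
```

If instead every task of partB needs the results of all tasks of partA, the whole-array dependency from the first example is the safer choice.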


For more details, see the --dependency section in the sbatch manpage.
