Spark job fails at runtime with java.io.IOException: Mkdirs failed to create directory file:/home/tmp/catalog/example/

Date: 2024-04-03 11:22:15

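Today I uploaded a jar I had built myself to the company cluster and submitted the job with the spark2-submit script. When I checked back shortly before noon, the job had failed with this error: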

Job aborted due to stage failure: Task 10 in stage 6.0 failed 4 times, most recent failure: Lost task 10.3 in stage 6.0 (TID 123, feiwei01, executor 12): java.io.IOException: Mkdirs failed to create directory file:/home/tmp/catalog/example/20/part-r-00010-520256382734

The cluster showed the following:

[Screenshot: the job failure as displayed on the cluster]

A closer look on the web page:

[Screenshot: failure details on the web page]

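My first guess was that the job lacked permission to create files, but I was not sure. This was the submit command for the failing run: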

spark2-submit \
--class geotrellis.spark.etl.MultibandIngest \
--master yarn \
--deploy-mode client \
--num-executors 10 \
--executor-memory 2g \
--executor-cores 1 \
--driver-memory 1g \
--conf spark.default.parallelism=30 \
/home/tmp/geotrellis-spark-etl-assembly-2.0.0-M2.jar \
--input "file:///home/tmp/json/input.json" \
--output "file:///home/tmp/json/output.json" \
--backend-profiles "file:///home/tmp/json/backend-profiles.json"

Before that, I had tested locally in --master local[*] mode, with this submit command:

spark-submit --class geotrellis.spark.etl.MultibandIngest \
   --master 'local[*]' \
   --driver-memory 2G \
   /tmp/geotrellis-spark-etl-assembly-2.0.0-M2.jar \
   --input "file:///root/tmp/json/input.json" \
   --output "file:///root/tmp/json/output.json" \
   --backend-profiles "file:///root/tmp/json/backend-profiles.json"

Those local tests ran many times without a single error, so the failure on the cluster seemed contradictory.

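After I changed the permissions on the tmp directory, the job did run successfully. For now, my guess is that the code in geotrellis.spark.io.hadoop.HadoopRDDWriter$MultiMapWriter.getWriter performs a permission check. I will write more once I have worked through the source.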
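One likely reason the two runs differ: the output path uses the file:// scheme, so under --master yarn each executor tries to create the directory on the local disk of whichever node it lands on, under whatever account YARN runs containers as, whereas local[*] runs everything on one machine as the submitting user. A minimal sketch for checking this on a node, assuming a POSIX shell; check_mkdir_target is a hypothetical helper name, not part of Spark or GeoTrellis:

```shell
# Hypothetical helper: find the nearest existing ancestor of a path and
# report whether the current user could create directories beneath it.
check_mkdir_target() {
  dir="$1"
  # Walk up the path until we reach a directory that actually exists.
  while [ ! -d "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
    dir=$(dirname "$dir")
  done
  # Creating a subdirectory needs write and search (execute) permission.
  if [ -w "$dir" ] && [ -x "$dir" ]; then
    echo "OK $dir"
  else
    echo "DENIED $dir"
    return 1
  fi
}
```

To be meaningful, it would have to run on each worker node as the account the executors use (which account that is depends on the cluster configuration), for example check_mkdir_target /home/tmp/catalog/example/20. A DENIED result would point at the directory whose permissions need changing.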