Spark job using HiveContext failing in Oozie

Time: 2022-04-28 20:49:15

In one of our pipelines we are doing aggregation using Spark (Java), and it is orchestrated using Oozie. This pipeline writes the aggregated data to an ORC file using the following lines.

// Spark 1.x API (org.apache.spark.sql.hive.HiveContext): wrap the aggregated RDD
// as a DataFrame and write it out as ORC, partitioned by the given column
HiveContext hc = new HiveContext(sc);
DataFrame modifiedFrame = hc.createDataFrame(aggregateddatainrdd, schema);
modifiedFrame.write().format("org.apache.spark.sql.hive.orc")
        .partitionBy("partition_column_name")
        .save(output);

When the Spark action in the Oozie job gets triggered, it throws the following exception:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, org.apache.hadoop.hive.shims.HadoopShims.isSecurityEnabled()Z java.lang.NoSuchMethodError: org.apache.hadoop.hive.shims.HadoopShims.isSecurityEnabled()Z

But the same job succeeds after rerunning the workflow multiple times.

All the necessary JARs are in place at both compile time and run time.

This is my first Spark app and I am not able to understand the issue.

Could someone help me understand the issue better and suggest a possible solution?

1 Solution

#1


"the same is getting succeeded after rerunning the workflow multiple times"

“在多次重新运行工作流程后,同样成功”

Sounds like you have compiled/bundled your Spark job with a different version of the Hadoop client than the one running on the cluster; as a result there are conflicting JARs on the CLASSPATH, and your job fails randomly depending on which JAR is picked up first.

To confirm, choose one Oozie job that succeeded and one that failed, get the "external ID" of the action (it is labeled job_*******_**** but refers to the YARN ID application_******_****) and inspect the YARN logs for both jobs. You should see a difference in the actual order of JARs in the Java CLASSPATH.

If that's indeed the case, then try a combination of the following (a sketch of a Spark action carrying both settings is shown after the list):

  • in the Oozie action, set property oozie.launcher.mapreduce.user.classpath.first to true (for the Spark driver)

  • in the Spark config, set property spark.yarn.user.classpath.first to true (for the executors)
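
For illustration, here is a rough sketch of how both settings could be wired into the workflow's Spark action. The action name, class, JAR path and schema version below are hypothetical placeholders, not taken from the original post:

<action name="aggregate">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- put user JARs ahead of the system ones for the launcher / Spark driver -->
            <property>
                <name>oozie.launcher.mapreduce.user.classpath.first</name>
                <value>true</value>
            </property>
        </configuration>
        <master>yarn-cluster</master>
        <name>aggregation-job</name>
        <class>com.example.AggregationJob</class>
        <jar>${appJar}</jar>
        <!-- same idea for the executors, via the Spark configuration -->
        <spark-opts>--conf spark.yarn.user.classpath.first=true</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>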

You can guess what user.classpath.first implies...!


But it might not work if the conflicting JARs are actually not in the Hadoop client but in the Oozie ShareLib. From YARN's point of view, Oozie is the "client"; you cannot set a precedence between what Oozie ships from its ShareLib and what it ships from your Spark job.

In that case you would have to use the proper dependencies in your Java project, matching the Hadoop version you will be running against (a sketch follows below) -- that's just common sense, don't you think?!?
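
As a minimal sketch of the Maven side, assuming a Spark 1.x / Scala 2.10 build (the artifact suffix and the version properties are placeholders to be aligned with whatever the cluster actually runs), the idea is to depend on the cluster's own versions and mark them "provided" so the application JAR does not bundle a second, conflicting copy:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${cluster.hadoop.version}</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.10</artifactId>
    <version>${cluster.spark.version}</version>
    <scope>provided</scope>
</dependency>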
