How can I reload a table saved to the Hive metastore with .saveAsTable()?

Asked: 2020-12-19 19:14:05

I used .saveAsTable on my DataFrame and now it is stored in my Hive warehouse on HDFS. How can I load this back into Spark SQL? I have since deleted my cluster (Azure HDInsight) and created a new one, confirmed my Hive metastore location is the same, and the table's directory is still there.


I need to load this again as a persistent table, not as a temp table, because I am using the PowerBI/Spark connector. The only way I have found so far is to load the directory back into a DataFrame and then run .saveAsTable again, which rewrites all the data and takes a long time to process. I'm hopeful there is a better way!


1 Answer

#1



After you use .saveAsTable, you can query the table directly with SQL.


df.write.saveAsTable("tableName")  # persist the DataFrame to the Hive metastore (Spark 2.x API)
myOtherDf = spark.sql("select * from tableName")  # read it back via Spark SQL
