Spark: strange NullPointerException when extracting data from PostgreSQL

Time: 2022-10-14 15:37:46

I'm working with PostgreSQL 9.6 and Spark 2.0.0.

I want to create a DataFrame from a PostgreSQL table, as follows:

val query =
  """(SELECT events.event_facebook_id,
             places.placeid, places.likes as placelikes,
             artists.facebookId, artists.likes as artistlikes
      FROM events
      LEFT JOIN eventsplaces on eventsplaces.event_id = events.event_facebook_id
      LEFT JOIN places on eventsplaces.event_id = places.facebookid
      LEFT JOIN eventsartists on eventsartists.event_id = events.event_facebook_id
      LEFT JOIN artists on eventsartists.artistid = artists.facebookid) df"""

The query is valid (if I run it in psql, I don't get any error), but with Spark, if I execute the following code, I get a NullPointerException:

sqlContext
  .read
  .format("jdbc")
  .options(
    Map(
      "url" -> claudeDatabaseUrl,
      "dbtable" -> query))
  .load()
  .show()
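
For reference, a sketch of the same read with the JDBC driver class and credentials passed explicitly; the driver option is the standard PostgreSQL driver class, and the user/password values are placeholders, not anything from the question:

// More explicit variant of the read above. claudeDatabaseUrl and query come
// from the question; the driver class is the standard PostgreSQL JDBC driver
// (its jar must be on the Spark classpath), and user/password are placeholders.
val df = sqlContext
  .read
  .format("jdbc")
  .options(Map(
    "url"      -> claudeDatabaseUrl,
    "driver"   -> "org.postgresql.Driver",
    "user"     -> "spark_user",
    "password" -> "********",
    "dbtable"  -> query))
  .load()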

If, in the query, I replace artists.facebookId with another column such as artists.description (which, unlike facebookId, can be null), the exception disappears.
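
One way to narrow this down (not from the original post) is to check whether the NullPointerException is thrown while the schema is resolved (load()) or only when rows are fetched (show()), and to look at the nullability Spark infers from the JDBC metadata. A minimal sketch, reusing claudeDatabaseUrl and query from above:

// load() only resolves the schema from the JDBC metadata; show() actually
// fetches rows. Printing the schema shows which columns Spark marks nullable.
val df = sqlContext
  .read
  .format("jdbc")
  .options(Map("url" -> claudeDatabaseUrl, "dbtable" -> query))
  .load()

df.printSchema()  // column names, types and nullable flags
df.schema.fields.foreach(f => println(s"${f.name}: nullable=${f.nullable}"))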

I find this very strange. Any ideas?

1 Answer

#1

You have two different spellings of facebookId in your query: artists.facebookId (capital I) and artists.facebookid (lowercase i).

Please try to use the correct one.
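
For illustration, a sketch of the query with the artists column spelled consistently in lowercase. This assumes the physical column name is facebookid (PostgreSQL folds unquoted identifiers to lowercase, which is why psql accepts either spelling):

// Hypothetical corrected query string: artists.facebookid is now spelled the
// same way in the SELECT list and in the JOIN condition (assuming the physical
// column name is lowercase facebookid).
val fixedQuery =
  """(SELECT events.event_facebook_id,
             places.placeid, places.likes as placelikes,
             artists.facebookid, artists.likes as artistlikes
      FROM events
      LEFT JOIN eventsplaces on eventsplaces.event_id = events.event_facebook_id
      LEFT JOIN places on eventsplaces.event_id = places.facebookid
      LEFT JOIN eventsartists on eventsartists.event_id = events.event_facebook_id
      LEFT JOIN artists on eventsartists.artistid = artists.facebookid) df"""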
