Google Dataflow job and BigQuery failing in different regions

Date: 2021-07-30 18:04:29

I have a Google Dataflow job that is failing on:

BigQuery job ... finished with error(s): errorResult: 
Cannot read and write in different locations: source: EU, destination: US, error: Cannot read and write in different locations: source: EU, destination: US

I'm starting the job with --zone=europe-west1-b
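
For context, a minimal sketch of how such a job might be launched, assuming the Dataflow SDK for Java 1.x (the options class and flag names below come from that SDK; the project and bucket names are placeholders):

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

public class LaunchSketch {
  public static void main(String[] args) {
    // Typical invocation:
    //   --runner=DataflowPipelineRunner --project=my-project \
    //   --zone=europe-west1-b --stagingLocation=gs://my-bucket/staging
    DataflowPipelineOptions options = PipelineOptionsFactory
        .fromArgs(args).withValidation().as(DataflowPipelineOptions.class);

    // Note: --zone only chooses where the Dataflow *workers* run;
    // it does not control where BigQuery or GCS data is located.
    Pipeline p = Pipeline.create(options);
    p.run();
  }
}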

And this is the only part of the pipeline that does anything with BigQuery:

Pipeline p = Pipeline.create(options);
p.apply(BigQueryIO.Read.fromQuery(query));
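
Worth noting why a read involves a write at all: BigQueryIO.Read.fromQuery does not stream results straight out of the query. It first runs a BigQuery query job that materializes the results into a temporary table in a temporary dataset (the _dataflow_temporary_dataset_* names in the local error below are exactly these), and then exports that table. A hedged sketch of roughly what that query job looks like, using the BigQuery API model classes; the temporary names here are illustrative:

import com.google.api.services.bigquery.model.Job;
import com.google.api.services.bigquery.model.JobConfiguration;
import com.google.api.services.bigquery.model.JobConfigurationQuery;
import com.google.api.services.bigquery.model.TableReference;

class TempTableSketch {
  // Build the kind of query job that fromQuery() triggers. If the
  // temporary destination dataset is not created in the EU, the write
  // side defaults elsewhere and BigQuery rejects the job with
  // "Cannot read and write in different locations".
  static Job buildQueryJob(String projectId, String query) {
    TableReference tempTable = new TableReference()
        .setProjectId(projectId)
        .setDatasetId("_dataflow_temporary_dataset_NNN") // illustrative name
        .setTableId("dataflow_temporary_table_NNN");     // illustrative name
    JobConfigurationQuery queryConfig = new JobConfigurationQuery()
        .setQuery(query)
        .setDestinationTable(tempTable); // this is the "write" half
    return new Job().setConfiguration(
        new JobConfiguration().setQuery(queryConfig));
  }
}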

The BigQuery table I'm reading from has this in the details: Data Location EU

When I run the job locally, I get:

SEVERE: Error opening BigQuery table dataflow_temporary_table_339775 of dataset _dataflow_temporary_dataset_744662: 404 Not Found

I don't understand why it is trying to write to a different location if I'm only reading data. And even if it needs to create a temporary table, why is it being created in a different region?

Any ideas?

1 Answer

#1

I would suggest verifying:

  • Whether the staging location for the Google Dataflow job is in the same region.
  • Whether the Google Cloud Storage location used by Dataflow is also in the same region (see the sketch below).
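
Concretely, that usually means giving the job staging and temp locations in a bucket that lives in the EU, matching the BigQuery dataset. A minimal sketch, assuming the Dataflow SDK for Java 1.x option names; the project and bucket names are placeholders:

import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

public class EuLocationsSketch {
  public static void main(String[] args) {
    DataflowPipelineOptions options =
        PipelineOptionsFactory.as(DataflowPipelineOptions.class);
    options.setProject("my-project");      // placeholder
    options.setZone("europe-west1-b");     // worker zone only
    // Point staging/temp at a bucket created in the EU so GCS files
    // end up in the same location as the BigQuery data.
    options.setStagingLocation("gs://my-eu-bucket/staging"); // placeholder
    options.setTempLocation("gs://my-eu-bucket/temp");       // placeholder
  }
}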
