Unable to write to BigQuery - Permission denied: Apache Beam Python - Google Dataflow

Time: 2022-05-10 01:15:21

I have been using the Apache Beam Python SDK with the Google Cloud Dataflow service for quite some time now.

I was setting up Dataflow for a new project.

The Dataflow pipeline:

  1. Reads data from Google Datastore
  2. Processes it
  3. Writes to Google BigQuery

I have similar pipelines on other projects, and they run perfectly fine.

Today, when I started a Dataflow job, the pipeline started, read data from Datastore, and processed it. When it was about to write to BigQuery, it failed with:

apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException:
Dataflow pipeline failed. State: FAILED, Error:
Workflow failed. Causes: S04:read from datastore/GroupByKey/Read+read from datastore/GroupByKey/GroupByWindow+read from datastore/Values+read from datastore/Flatten+read from datastore/Read+convert to table rows+write to bq/NativeWrite failed., BigQuery import job "dataflow_job_8287310405217525944" failed., BigQuery creation of import job for table "TableABC" in dataset "DatasetABC" in project "devel-project-abc" failed., BigQuery execution failed., Error:
Message: Access Denied: Dataset devel-project-abc:DatasetABC: The user service-account-number-compute@developer.gserviceaccount.com does not have bigquery.tables.create permission for dataset devel-project-abc:DatasetABC: HTTP Code: 403
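
For anyone debugging the same failure, the three facts that matter in this error are which identity was denied, which permission it lacked, and on which dataset. The sketch below pulls them out with a regular expression. It assumes the message keeps the wording shown in this question; the names `DENIED`, `SAMPLE`, and `parse_access_denied` are made up for illustration and are not part of Beam or BigQuery.

```python
import re

# Matches the "Access Denied" sentence of the 403 message quoted in the question.
# Assumption: BigQuery keeps this exact wording; adjust the pattern if it differs.
DENIED = re.compile(
    r"The user\s+(?P<principal>\S+)\s+does not\s+have\s+"
    r"(?P<permission>[\w.]+)\s+permission for dataset\s+"
    r"(?P<project>[\w-]+):(?P<dataset>\w+)"
)

# The last lines of the error, including the hard line wraps from the log.
SAMPLE = (
    "Message: Access Denied: Dataset devel-project-abc:DatasetABC: The user \n"
    "service-account-number-compute@developer.gserviceaccount.com does not \n"
    "have bigquery.tables.create permission for dataset devel-project- \n"
    "abc:DatasetABC: HTTP Code: 403"
)


def parse_access_denied(message):
    """Return principal, permission, project and dataset as a dict, or None."""
    # Rejoin identifiers that a hard-wrapped log split right after a hyphen.
    joined = re.sub(r"-\s*\n\s*", "-", message)
    match = DENIED.search(joined)
    return match.groupdict() if match else None


info = parse_access_denied(SAMPLE)
print(info)
```

Here the denied principal is the Compute Engine default service account rather than a Dataflow-specific identity, so that is the account whose roles need checking.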

I made sure all the required APIs are enabled. As far as I can tell, the service account has the necessary permissions.

My question is: where might this be going wrong?

Update

From what I remember of previous projects (three different projects, to be precise), I didn't give the Dataflow service agent any specific permissions. The Compute Engine service agent had roles such as Dataflow Admin, Editor, and Dataflow Viewer. So before granting the service agent BigQuery-related permissions, I would like to know why this environment behaves differently from the previous projects.

Have any permission or policy changes gone live in the last few months that would now require a BigQuery writer permission?

2 Solutions

#1


1  

Please make sure your service account ('service-account-number-compute@developer.gserviceaccount.com') has the 'roles/bigquery.dataEditor' role on 'devel-project-abc:DatasetABC'. Also make sure the 'BigQuery Data Editor' role is enabled for your project.
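
If you would rather grant the dataset-level access from code than from the console, something along these lines should work. It is a sketch, not a tested recipe for this project: it assumes the `google-cloud-bigquery` package is installed and that your own credentials may update the dataset. The helper names `has_write_access` and `grant_data_editor` are made up for illustration. Dataset access entries use the legacy role name `WRITER`, which the BigQuery API documents as the equivalent of `roles/bigquery.dataEditor`.

```python
# Roles on a dataset access entry that allow creating tables in the dataset.
WRITE_ROLES = {
    "WRITER",
    "OWNER",
    "roles/bigquery.dataEditor",
    "roles/bigquery.dataOwner",
}


def has_write_access(entries, email):
    """True if some access entry already gives `email` write access."""
    return any(e.entity_id == email and e.role in WRITE_ROLES for e in entries)


def grant_data_editor(dataset_id, email):
    """Add a WRITER entry for `email` on `dataset_id` ("project.dataset")."""
    # Imported lazily so the helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    dataset = client.get_dataset(dataset_id)
    entries = list(dataset.access_entries)
    if has_write_access(entries, email):
        return dataset  # nothing to change
    entries.append(
        bigquery.AccessEntry(
            role="WRITER",
            entity_type="userByEmail",
            entity_id=email,
        )
    )
    dataset.access_entries = entries
    # Only the access list is sent in the update.
    return client.update_dataset(dataset, ["access_entries"])
```

For the case in the question the call would be `grant_data_editor("devel-project-abc.DatasetABC", "service-account-number-compute@developer.gserviceaccount.com")`.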

You can check both in GCP IAM.
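
To see why the new project behaves differently, it may help to compare the roles bound to the Compute Engine service account in an older project with those in the new one. The sketch below works on IAM policies exported as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`) and loaded into dictionaries. The two policies and the service account numbers in it are invented sample data, not the asker's real configuration, and `roles_for` and `missing_roles` are hypothetical helper names.

```python
def roles_for(policy, member):
    """Sorted roles that `policy` binds directly to `member`."""
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    )


def missing_roles(old_policy, old_member, new_policy, new_member):
    """Roles the old project's member has that the new project's member lacks."""
    old = set(roles_for(old_policy, old_member))
    new = set(roles_for(new_policy, new_member))
    return sorted(old - new)


# Invented sample data: the default service account name embeds the project
# number, so the member string differs between projects.
OLD_SA = "serviceAccount:111-compute@developer.gserviceaccount.com"
NEW_SA = "serviceAccount:222-compute@developer.gserviceaccount.com"

old_policy = {
    "bindings": [
        {"role": "roles/dataflow.admin", "members": [OLD_SA]},
        {"role": "roles/dataflow.viewer", "members": [OLD_SA]},
        {"role": "roles/editor", "members": [OLD_SA]},
    ]
}
new_policy = {
    "bindings": [
        {"role": "roles/dataflow.admin", "members": [NEW_SA]},
        {"role": "roles/dataflow.viewer", "members": [NEW_SA]},
    ]
}

print(missing_roles(old_policy, OLD_SA, new_policy, NEW_SA))
```

Any role that shows up in the difference is a candidate explanation for the 403. Note that this only compares project-level bindings; access granted directly on a dataset will not appear here.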

#2


0  

The BigQuery documentation lists the capabilities of each role. If your previous projects were using primitive IAM roles, you might need to set the roles correctly for this one. The IAM Release Notes page provides additional information on updates made to the system.
