How to use ValueProviders to set the table and columns for SpannerIO at runtime?

Time: 2022-10-01 15:33:50

I have a use case in which I need to read selected data from a Spanner table and load it into BigQuery.

The catch here is that for the batch job the name of the table and the columns to select will only be known at runtime.

It seems that Dataflow's SpannerIO doesn't accept the table and columns as runtime values. The code below shows what I mean:

p.apply(SpannerIO.read()
        .withSpannerConfig(spannerConfig)
        .withTable("tablename")           // plain String only
        .withColumns("col1", "col2"));    // list or array of column names, no ValueProvider

These methods only accept Strings, not ValueProviders. How can I make this work?

2 solutions

#1


Yes, the withColumns and withTable methods of SpannerIO.Read do not accept a ValueProvider.

Could you write code outside your function that resolves the table and column names, and then pass them to withTable and withColumns as strings (or a list of strings) at runtime?

If they can be passed in as command-line arguments, consider using PipelineOptions.

Here is a simple example. More docs on using the Dataflow connector for Cloud Spanner can be found here.
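
To make the command-line route a bit more concrete, here is a minimal sketch in plain Java of turning two flags into the table name and column list that withTable and withColumns expect. The flag names --table and --columns and the class RuntimeReadArgs are hypothetical, and the sample table and column names are placeholders. In a real pipeline PipelineOptions would do this parsing for you, so treat this only as an illustration of the shape of the values, written without Beam so it runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam or SpannerIO.
public class RuntimeReadArgs {
    final String table;
    final List<String> columns;

    RuntimeReadArgs(String table, List<String> columns) {
        this.table = table;
        this.columns = columns;
    }

    // Parses assumed flags of the form --table=NAME and --columns=a,b,c.
    static RuntimeReadArgs parse(String[] args) {
        String table = null;
        List<String> columns = new ArrayList<>();
        for (String arg : args) {
            if (arg.startsWith("--table=")) {
                table = arg.substring("--table=".length()).trim();
            } else if (arg.startsWith("--columns=")) {
                // Split on commas and drop empty entries and surrounding spaces.
                for (String c : arg.substring("--columns=".length()).split(",")) {
                    if (!c.trim().isEmpty()) {
                        columns.add(c.trim());
                    }
                }
            }
        }
        if (table == null || table.isEmpty() || columns.isEmpty()) {
            throw new IllegalArgumentException("Both --table and --columns are required");
        }
        return new RuntimeReadArgs(table, columns);
    }

    public static void main(String[] args) {
        // Fall back to placeholder values so the sketch runs without arguments.
        String[] effective = args.length > 0
                ? args
                : new String[] {"--table=Singers", "--columns=SingerId, FirstName"};
        RuntimeReadArgs parsed = parse(effective);
        // In the pipeline these values would be handed to
        // SpannerIO.read().withTable(parsed.table).withColumns(parsed.columns).
        System.out.println(parsed.table + " " + parsed.columns);
    }
}
```

This only works when the names are known at the moment the pipeline is constructed. If they arrive later, for example as template parameters, the values cannot be read at construction time and this approach does not apply.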

#2


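To access runtime values, use the SpannerIO.readAll() transform. Build a ReadOperation in the preceding step (for example with ReadOperation.create().withTable(...).withColumns(...)), where the ValueProvider can be resolved with get(), and pass the resulting PCollection<ReadOperation> to readAll().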

See "Reading data from all available tables" in the examples.
