BigQuery API - How to improve query read performance

Date: 2022-03-14 05:52:45

We're using BigQuery to retrieve the full content of a big table, the publicly available publicdata:samples.natality.


Our code follows Google's instructions as described in their Java API documentation.


We're able to retrieve this table at only around 1,300 rows/sec, which is surprisingly slow. Is there a faster way to retrieve the full result of a query, or is this as fast as it gets?

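For context, the question's original code isn't shown; the following is a rough sketch of that kind of paged read using the google-cloud-bigquery Java client library. The table reference is the public sample from the question, while the page size is just a placeholder.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQuery.TableDataListOption;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.TableId;
import com.google.cloud.bigquery.TableResult;

public class PagedTableRead {
  public static void main(String[] args) {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

    // The public natality sample table from the question.
    TableId tableId = TableId.of("publicdata", "samples", "natality");

    // Page through the table; iterateAll() fetches the next page
    // automatically. This is the slow path the answer below warns about.
    TableResult rows =
        bigquery.listTableData(tableId, TableDataListOption.pageSize(10_000));

    long count = 0;
    for (FieldValueList row : rows.iterateAll()) {
      count++; // process the row here
    }
    System.out.println("Read " + count + " rows");
  }
}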

1 Answer

#1


The recommended way to retrieve a large amount of data from a BigQuery table is not to use tabledata.list to page through the full table, as that example does. That approach is optimized for reading a small number of rows from the results of a query.


Instead, you should run an extract job that exports the entire content of the table to Google Cloud Storage, from which you can then download the full content.

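A minimal sketch of such an extract job with the google-cloud-bigquery Java client library; the destination bucket and file names are placeholders you would replace with your own.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.ExtractJobConfiguration;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.TableId;

public class ExportTableToGcs {
  public static void main(String[] args) throws InterruptedException {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

    // Source: the public natality sample table from the question.
    TableId sourceTable = TableId.of("publicdata", "samples", "natality");

    // Destination: a GCS bucket you own. The wildcard lets BigQuery shard
    // the export into multiple files, which is required for large tables.
    String destinationUri = "gs://your-bucket/natality-export-*.csv.gz";

    ExtractJobConfiguration config =
        ExtractJobConfiguration.newBuilder(sourceTable, destinationUri)
            .setFormat("CSV")
            .setCompression("GZIP")
            .build();

    // Run the extract job and block until it finishes.
    Job job = bigquery.create(JobInfo.of(config));
    job = job.waitFor();

    if (job == null || job.getStatus().getError() != null) {
      System.out.println("Extract job failed: "
          + (job == null ? "job no longer exists" : job.getStatus().getError()));
    } else {
      System.out.println("Exported " + sourceTable + " to " + destinationUri);
    }
  }
}

Once the export completes, the sharded files can be downloaded in parallel from the bucket (for example with gsutil), which is far faster than paging rows through the API.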

https://cloud.google.com/bigquery/exporting-data-from-bigquery

