How do I sort an array in BigQuery Standard SQL?

Time: 2021-11-05 01:14:02

I am wondering whether it is possible to sort (apply ORDER BY to) the values within each individual array in Google BigQuery.

I am able to achieve this by first applying ORDER BY to the whole transactional base table and then aggregating into arrays, but when the table is too large, ordering it fails with resource errors.

So I am wondering whether each individual array's values can be ordered using SQL or a UDF.

This was asked once before, in "Order of data in bigquery repeated records", but that was 4,5 years ago.

1 solution

#1


3  

Sure, you can use the ARRAY function. It takes a subquery, which can include an optional ORDER BY clause. You haven't provided sample data, but supposing that you have a top-level array column named arr, you can do something like this:

SELECT
  col1,
  col2,
  ARRAY(SELECT x FROM UNNEST(arr) AS x ORDER BY x) AS arr
FROM MyTable;
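
To try this without a real table, here is a minimal sketch that substitutes inline data for MyTable. The sample rows and the arr_desc alias are made up for illustration and are not from the question; the query also shows that the usual DESC modifier can be used inside the subquery.

```sql
-- Hypothetical inline rows standing in for MyTable (assumption, not real data).
WITH MyTable AS (
  SELECT 1 AS col1, 'first' AS col2, [3, 1, 2] AS arr
  UNION ALL
  SELECT 2, 'second', [9, 7, 8]
)
SELECT
  col1,
  col2,
  -- Each row's array is unnested, sorted, and rebuilt independently,
  -- so no ORDER BY over the whole table is involved.
  ARRAY(SELECT x FROM UNNEST(arr) AS x ORDER BY x DESC) AS arr_desc
FROM MyTable;
```

Because the sort happens inside a per-row subquery, only the elements of one array are ordered at a time.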

This sorts the elements of arr by their values. If you actually have an array of a struct type, such as ARRAY<STRUCT<a INT64, b STRING>>, you can sort by one of the struct fields:

SELECT
  col1,
  col2,
  ARRAY(SELECT x FROM UNNEST(arr) AS x ORDER BY a) AS arr
FROM MyTable;
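
The struct case can be checked the same way. This is a sketch under the same assumptions: the inline rows are invented, and the field is referenced as x.a rather than a purely to make explicit where it comes from.

```sql
-- Hypothetical inline row with an ARRAY<STRUCT<a INT64, b STRING>> column.
WITH MyTable AS (
  SELECT
    1 AS col1,
    'first' AS col2,
    [STRUCT(2 AS a, 'two' AS b), STRUCT(1 AS a, 'one' AS b)] AS arr
)
SELECT
  col1,
  col2,
  -- x is the whole struct element; x.a is the field used as the sort key.
  ARRAY(SELECT x FROM UNNEST(arr) AS x ORDER BY x.a) AS arr
FROM MyTable;
```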
