如何为PostgreSQL中的数组元素创建索引?

时间:2021-04-19 21:21:31

With this schema:

这个模式:

create table object (
   obj_id      serial      primary key,
   name        varchar(80) not null unique,
   description text,
   tag_arr     int[]
);

create table tag (
   tag_id      serial      primary key,
   label       varchar(20) not null unique
);

An object may have any number of tags attached. Instead of an object X tag table, I wish to keep tag_ids in an array so they can be easily fetched with the object record.

对象可以附加任意数量的标记。我希望将tag_id保存在数组中,而不是对象X标记表,这样就可以用对象记录轻松地获取它们。

How do I create an index on object so that each element of tar_arr is an index?

如何在对象上创建索引,使tar_arr的每个元素都成为索引?

That said, are there better ways to solve this problem?

也就是说,有更好的方法来解决这个问题吗?

Discussion

This could be achieved with:

这可以通过以下方式实现:

create table obj_x_tag(
   obj_id    references object,
   tag_id    references tag,
   constraint obj_x_tag_pk primary key( obj_id, tag_id )
);

select obj_id, name, description, array_agg( tag_id )
from object o
join obj_x_tag x using( obj_id )
group by 1, 2;

But to me it makes more sense to simply keep the array of tag_ids in a column and dispense with the cross table and array_agg()

但是对我来说,简单地将tag_id数组放在列中,并且不使用交叉表和array_agg()更有意义

It was suggested to use PostgresQL SQL: Converting results to array. The problem, as noted, is that "this doesn't actually index individual array values, but instead indexes the entire array"

建议使用PostgresQL:将结果转换为数组。问题在于,如前所述,“它实际上并没有索引单个数组值,而是索引整个数组”

It was also suggested to use pg's intarr and gist (or gin) index. The problem - to me - seems that the index is for the standard pg set-based array operators, not necessarily optimized for finding one element of an array, but rather where one array contains another, intersects with another - for me it's counter-intuitive that, size-wise and speed-wise, such a wide solution is correct for such a narrow problem. Also, the intarr extension seems limited to int, not covering int64 or char, limiting its usefulness.

还建议使用pg的intarr和gist(或gin)索引。问题——我——似乎,标准的指数是pg运营商基于集合的数组,不一定优化寻找一个元素的数组,而是在一个数组中包含了另一个,与另一个相交——对我来说,这是反直觉的,大小和速度明智,如此大的解决方案是正确的对于这样一个狭窄的问题。此外,intarr扩展似乎仅限于int,没有覆盖int64或char,这限制了它的可用性。

3 个解决方案

#1


3  

You can create GIN indexes on any 1-dimensional array with standard Postgres.
Details in the manual here (last chapter).

您可以使用标准Postgres在任何一维数组上创建GIN索引。手册的细节在这里(最后一章)。

While operating with integer arrays (plain int4, not int2 or int8 and no NULL values) the additional supplied module intarray provides a lot more operators and typically superior performance. Install it (once per database) with:

在使用整数数组(普通的int4,而不是int2或int8,没有空值)操作时,额外提供的模块intarray提供了更多的操作符和典型的优越性能。安装(每个数据库一次):

CREATE EXTENSION intarray;

You can create GIN or GIST indexes on integer arrays. There are examples in the manual.
CREATE EXTENSION requires PostgreSQL 9.1 or later. For older versions you need to run the supplied script.

您可以在整数数组上创建GIN或GIST索引。手册中有一些例子。创建扩展需要PostgreSQL 9.1或更高版本。对于旧版本,您需要运行所提供的脚本。

#2


3  

The traditional solution would be to use a table of tags and a many-many between tag and object. Then you can index the tag table and pull everything in a single select statement via a join. If you're not happy with the programming model, check out your local friendly ORM vendor.

传统的解决方案是使用标记表和标记和对象之间的多个标记。然后可以对标记表进行索引,并通过连接将所有内容提取到一个select语句中。如果您对编程模型不满意,请查看您的本地友好ORM供应商。

I'm not a PostgreSQL expert by any means, but this doesn't seem like a good use case for arrays.

无论如何,我都不是PostgreSQL专家,但这似乎不是一个很好的数组用例。

#3


0  

This is my workaround, because I see no PostgreSQL optimized internal function for do the same,

这是我的解决方案,因为我没有看到PostgreSQL优化的内部函数,

CREATE FUNCTION unnest_with_idx(anyarray) RETURNS 
table(idx integer, val anyelement) AS $$ 
   SELECT generate_series(1,array_upper($1,1)) as idx, unnest($1) as val;
$$ LANGUAGE SQL IMMUTABLE;
-- Test:
SELECT idx,val from unnest_with_idx(array[1,20,3,5]) as t;

For check if exists an internal function, see "How to acess array internal index with postgreSQL?" question.

有关检查是否存在内部函数,请参见“如何使用postgreSQL访问数组内部索引?”问题。


Edited after @JimNasby comment

编辑后@JimNasby评论

Solution for pg9.4+

SELECT * FROM unnest(array[20,11,3,5]) WITH ORDINALITY;

the WITH ORDINALITY produces a new column "ordinality" that is the array-index. See also this tutorial.

与序数生成一个新的列“序数”,即数组索引。参见本教程。

In pg9.5+, it works fine also for JSON arrays!

在pg9.5+中,它也适用于JSON数组!

 SELECT * FROM jsonb_array_elements( '[20,11,3,5]'::JSONB ) WITH ORDINALITY

#1


3  

You can create GIN indexes on any 1-dimensional array with standard Postgres.
Details in the manual here (last chapter).

您可以使用标准Postgres在任何一维数组上创建GIN索引。手册的细节在这里(最后一章)。

While operating with integer arrays (plain int4, not int2 or int8 and no NULL values) the additional supplied module intarray provides a lot more operators and typically superior performance. Install it (once per database) with:

在使用整数数组(普通的int4,而不是int2或int8,没有空值)操作时,额外提供的模块intarray提供了更多的操作符和典型的优越性能。安装(每个数据库一次):

CREATE EXTENSION intarray;

You can create GIN or GIST indexes on integer arrays. There are examples in the manual.
CREATE EXTENSION requires PostgreSQL 9.1 or later. For older versions you need to run the supplied script.

您可以在整数数组上创建GIN或GIST索引。手册中有一些例子。创建扩展需要PostgreSQL 9.1或更高版本。对于旧版本,您需要运行所提供的脚本。

#2


3  

The traditional solution would be to use a table of tags and a many-many between tag and object. Then you can index the tag table and pull everything in a single select statement via a join. If you're not happy with the programming model, check out your local friendly ORM vendor.

传统的解决方案是使用标记表和标记和对象之间的多个标记。然后可以对标记表进行索引,并通过连接将所有内容提取到一个select语句中。如果您对编程模型不满意,请查看您的本地友好ORM供应商。

I'm not a PostgreSQL expert by any means, but this doesn't seem like a good use case for arrays.

无论如何,我都不是PostgreSQL专家,但这似乎不是一个很好的数组用例。

#3


0  

This is my workaround, because I see no PostgreSQL optimized internal function for do the same,

这是我的解决方案,因为我没有看到PostgreSQL优化的内部函数,

CREATE FUNCTION unnest_with_idx(anyarray) RETURNS 
table(idx integer, val anyelement) AS $$ 
   SELECT generate_series(1,array_upper($1,1)) as idx, unnest($1) as val;
$$ LANGUAGE SQL IMMUTABLE;
-- Test:
SELECT idx,val from unnest_with_idx(array[1,20,3,5]) as t;

For check if exists an internal function, see "How to acess array internal index with postgreSQL?" question.

有关检查是否存在内部函数,请参见“如何使用postgreSQL访问数组内部索引?”问题。


Edited after @JimNasby comment

编辑后@JimNasby评论

Solution for pg9.4+

SELECT * FROM unnest(array[20,11,3,5]) WITH ORDINALITY;

the WITH ORDINALITY produces a new column "ordinality" that is the array-index. See also this tutorial.

与序数生成一个新的列“序数”,即数组索引。参见本教程。

In pg9.5+, it works fine also for JSON arrays!

在pg9.5+中,它也适用于JSON数组!

 SELECT * FROM jsonb_array_elements( '[20,11,3,5]'::JSONB ) WITH ORDINALITY