Presto function to check whether an array contains a subarray

Time: 2021-11-12 21:55:40

I have a Presto table with an array column that contains, for example:

  1. id1,[1,2,3,4]
  2. id2,[3,4,5,6]
  3. id3,[3,4,7,8]
  4. id4,[5,4,3,6]

I need to find which rows contain the array [3,4,5] in the correct order. For instance, the result should return only id2, not id4.

I can use array_intersect in combination with cardinality to find id2 and id4, but I don't know how to verify whether the elements in id2 or id4 appear in the correct order.
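
A minimal sketch of that membership-only check, assuming the sample rows are supplied inline and using the hypothetical column names id and arr. It matches both id2 and id4, because array_intersect ignores element order:

```sql
-- Membership-only check: true when every element of the target appears in arr.
-- The inline VALUES rows and the column names id/arr are illustrative assumptions.
SELECT id, arr
FROM (VALUES
    ('id1', ARRAY[1,2,3,4]),
    ('id2', ARRAY[3,4,5,6]),
    ('id3', ARRAY[3,4,7,8]),
    ('id4', ARRAY[5,4,3,6])
) AS t(id, arr)
-- The intersection has as many elements as the target only if all of them are present.
WHERE cardinality(array_intersect(arr, ARRAY[3,4,5])) = cardinality(ARRAY[3,4,5]);
```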

The only ugly solution I can think of is to convert the two arrays into strings and then do a string LIKE operation.
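
For reference, the string-based workaround could look roughly like the sketch below. It assumes the same inline rows and the hypothetical column names id and arr, and it uses strpos rather than LIKE for the substring test. Both strings are wrapped in commas so that, for example, 3,4,5 cannot match inside 13,4,5:

```sql
-- String-based containment check (a sketch, not a recommended approach).
-- Column names id/arr and the inline rows are illustrative assumptions.
SELECT id, arr
FROM (VALUES
    ('id1', ARRAY[1,2,3,4]),
    ('id2', ARRAY[3,4,5,6]),
    ('id3', ARRAY[3,4,7,8]),
    ('id4', ARRAY[5,4,3,6])
) AS t(id, arr)
-- Join each array into ',a,b,c,' and look for the target as a substring.
WHERE strpos(
    concat(',', array_join(arr, ','), ','),
    concat(',', array_join(ARRAY[3,4,5], ','), ',')
) > 0;
```

With the sample data only the id2 row matches, since ',3,4,5,' occurs in ',3,4,5,6,' but not in ',5,4,3,6,'.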

Any better ideas?

Following the suggestion below and using AWS Athena:

WITH dataset AS (
    (values array[1,2,3,4], 
    array[3,4,5,6], 
    array[3,4,7,8], 
    array[5,4,3,6])
)
SELECT ngrams FROM dataset t(ngrams) where reduce(
    transform(array[3,4,5], a -> array_position(ngrams, a)),
    0, 
    (s, n) -> if( s < 0, -1, if ( n > s, n, -1)),
    s -> s >= 0) ;

The error I get is:

SYNTAX_ERROR: line 7:44: Unexpected parameters (array(bigint), integer, com.facebook.presto.sql.analyzer.TypeSignatureProvider@1d8b3792, com.facebook.presto.sql.analyzer.TypeSignatureProvider@563900c2) for function reduce. Expected: reduce(array(T), S, function(S,T,S), function(S,R)) T, S, R
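
One plausible reading of this error: array_position returns bigint, so the transformed array is array(bigint), while the initial state 0 is an integer. The lambda then returns a bigint where the signature function(S,T,S) expects the state type S, which here is integer. Under that assumption, casting the initial state to bigint may be enough. The sketch below has not been verified on Athena:

```sql
-- Same query as above, with the initial reduce state cast to bigint (an assumed fix).
WITH dataset AS (
    SELECT * FROM (VALUES
        ARRAY[1,2,3,4],
        ARRAY[3,4,5,6],
        ARRAY[3,4,7,8],
        ARRAY[5,4,3,6]
    ) AS t(ngrams)
)
SELECT ngrams
FROM dataset
WHERE reduce(
    -- 1-based position of each target element in ngrams, or 0 when it is absent
    transform(ARRAY[3,4,5], a -> array_position(ngrams, a)),
    CAST(0 AS bigint),                          -- state type now matches the bigint positions
    (s, n) -> if(s < 0, -1, if(n > s, n, -1)),  -- -1 marks a position that did not increase
    s -> s >= 0
);
```

If this type-checks on your engine version, only the [3,4,5,6] row should pass the filter.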

1 solution

#1


Here comes the magic for you:

select x 
from (values 
    array[1,2,3,4], 
    array[3,4,5,6], 
    array[3,4,7,8], 
    array[5,4,3,6]) t(x)
where reduce(
    transform(array[3,4,5], a -> array_position(x, a)),
    0, 
    (s, n) -> if( s < 0, -1, if ( n > s, n, -1)),
    s -> s >= 0) 

The query above finds the position of each element of the queried array and returns true if those positions are strictly increasing. This still leaves a number of corner cases to solve (handling duplicates or gaps), but I hope it is something you can start working with.
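
As one possible way to close the gap case, the reducer can require each position to be exactly one more than the previous one. This is a sketch under assumptions, not part of the original answer: the extra row [3,9,4,5] is added only for illustration, and the initial state is cast to bigint to match the type returned by array_position. Duplicates are still not handled, because array_position only reports the first occurrence of a value.

```sql
-- Contiguous variant: target elements must sit at consecutive positions.
SELECT x
FROM (VALUES
    ARRAY[1,2,3,4],
    ARRAY[3,4,5,6],
    ARRAY[3,4,7,8],
    ARRAY[5,4,3,6],
    ARRAY[3,9,4,5]   -- illustrative extra row: elements in order, but with a gap
) AS t(x)
WHERE reduce(
    transform(ARRAY[3,4,5], a -> array_position(x, a)),
    CAST(0 AS bigint),
    -- n = 0 means the element is missing; s = 0 means this is the first element;
    -- every later element must sit immediately after the previous one
    (s, n) -> if(s < 0 OR n = 0, -1, if(s = 0 OR n = s + 1, n, -1)),
    s -> s >= 0
);
```

Tracing the positions by hand, [3,4,5,6] yields 1, 2, 3 and passes, while [3,9,4,5] yields 1, 3, 4 and is rejected at the second step.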

See https://prestodb.io/docs/current/functions/array.html for more details
