Using QUALIFY ROW_NUMBER() in Hive

Date: 2022-04-01 22:59:33

I'm converting a Teradata query to Hive (version 0.10.0).

Teradata query:

QUALIFY ROW_NUMBER() OVER (PARTITION BY ADJSTMNT, SRC_CMN, TYPE_CMD, IOD_TYPE_CD, ROE_PST, ORDR_SYC, SOR_CD, PROS_ED ORDER BY ADJSTMNT) = 1

I searched and found a Row_Sequence UDF for Hive. I also replaced OVER (PARTITION BY ...) with DISTRIBUTE BY and SORT BY, but I am stuck on QUALIFY.

Any ideas on how to convert the above to Hive would be greatly appreciated.

1 answer

#1


A QUALIFY clause on an analytic function (ROW_NUMBER(), SUM(), COUNT(), ... OVER (PARTITION BY ...)) is just a WHERE filter applied to a subquery that computes the analytic value.

For example:

select A,B,C
from X 
QUALIFY  ROW_NUMBER() over (...) = 1

is equivalent to:

select A,B,C
from (
   select A,B,C, ROW_NUMBER() over (...) as RNUM
   from X
) t
where RNUM = 1

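NB: analytic (windowing) functions are available in Hive 0.12 (they were introduced in 0.11), so this rewrite will not run on the 0.10.0 release mentioned in the question.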
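
Applied to the query in the question, the same pattern might look like the sketch below. The question shows only the QUALIFY clause, so the table name src_table and the aliases s, t and rnum are placeholders, not names from the original query.

```sql
-- Sketch only: src_table, s, t and rnum are hypothetical names;
-- the question does not show the SELECT list or the FROM clause.
-- Needs a Hive release with windowing functions (not 0.10.0).
SELECT t.*
FROM (
  SELECT s.*,
         ROW_NUMBER() OVER (
           PARTITION BY ADJSTMNT, SRC_CMN, TYPE_CMD, IOD_TYPE_CD,
                        ROE_PST, ORDR_SYC, SOR_CD, PROS_ED
           ORDER BY ADJSTMNT
         ) AS rnum
  FROM src_table s
) t
WHERE t.rnum = 1;
```

Note that t.* also returns the helper column rnum; list the wanted columns explicitly to leave it out. Because ADJSTMNT is itself one of the partition columns, ordering by it does not decide which row in a partition gets number 1, which is also true of the original Teradata query.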
