Hive实现交叉二维分析的小语句

时间:2023-03-09 04:08:25
Hive实现交叉二维分析的小语句

Hive实现交叉二维分析的小语句

1. 梳理出你要的列和行维度

列维度: 每一周

行维度: 年级 + 学科 + 班型

2. 对数据按周增序进行聚合 (即根据列维度) ,生成list

concat_ws 和 collect_list (collect_set 会去重后再聚合)  顺序随机

sort_array 只能增序,要倒序排的话在子查询里新增一个辅助列来排序即可。

3. 依次取list的元素

即为 按周增序的指标结果

  select
term,
kemu,
course_applicable_user_type,
split(hs,',')[0] lesson_order1,
split(hs,',')[1] lesson_order2,
split(hs,',')[2] lesson_order3,
split(hs,',')[3] lesson_order4,
split(hs,',')[4] lesson_order5
from
(
select
term,
kemu,
course_applicable_user_type,
-- concat_ws(',', collect_list(cast(lesson_order as string))) as lesson_order_set,
-- concat_ws(',', collect_list(cast(lesson_valid_rate as string))) as index_amount_set,
regexp_replace(
concat_ws(',',
sort_array (
collect_list(
concat_ws(':',
case when length(cast(lesson_order as string))=1 then concat('0',cast(lesson_order as string)) else cast(lesson_order as string) end,
cast(lesson_valid_rate as string)
)
)
)
),'\\d\\d\:',''
)hs
from
(
select
term,
kemu,
course_applicable_user_type,
lesson_order,
lesson_valid_rate
from tmp
)t
group by term,kemu,course_applicable_user_type
)t1