从非结构化表创建序列

时间:2022-08-04 15:30:59

How to split the below unstructured table into 500 records for each batch based on CREATED_DT < SYSDATE -2 considering there are more 90,000 records? I cannot just take the CREATED_DT as the date and the value will change

考虑到有更多的90,000条记录,如何根据CREATED_DT

NUMBERID | MODEL| NAME | VALUE | CREATED_DT | LOADED_DT
1000     | BOE  | AGT  | 300   | 02-AUG-2016| 02-AUG-2016
1001     | AIR  | ACT  | 400   | 01-AUG-2016| 01-AUG-2016
1000     | EMB  | ADT  | 500   | 02-AUG-2016| 01-AUG-2016
1004     | REL  | GTR  | 140   | 01-AUG-2016| 01-AUG-2016
1001     | SIM  | HJT  | 140   | 01-JUL-2016| 28-JUL-2016
1002     | SIM  | ACT  | 540   | 04-JUL-2016| 04-JUL-2016

I can't use the difference of MAX(NUMBER_ID) MIN(NUMBER_ID) AND CREATED_DT as there is high risk of some data being lost. Can I use Oracle ROWID column (system column) column to split the records without missing any records?

我不能使用MAX(NUMBER_ID)MIN(NUMBER_ID)和CREATED_DT的差异,因为有些数据丢失的风险很高。我可以使用Oracle ROWID列(系统列)列来拆分记录而不会丢失任何记录吗?

Thanks for your help !

谢谢你的帮助 !

1 个解决方案

#1


2  

The analytic function ROW_NUMBER can help uniquely order and chunk the rows.

分析函数ROW_NUMBER可以帮助对行进行唯一排序和分块。

Sample Schema

drop table test1;

create table test1(numberid number, created_dt date);

insert into test1
select level, sysdate - level * interval '1' second
from dual connect by level <= 100000;

Sample Query and Results

示例查询和结果

select
    numberid,
    created_dt,
    ceil(row_number() over (order by created_dt desc) / 500) batch_id
from test1
order by created_dt desc;


NUMBERID   CREATED_DT               BATCH_ID
--------   ----------               --------
1          2016-08-02 11:42:48 PM   1
2          2016-08-02 11:42:47 PM   1
3          2016-08-02 11:42:46 PM   1
...
499        2016-08-02 11:34:30 PM   1
500        2016-08-02 11:34:29 PM   1
501        2016-08-02 11:34:28 PM   2
...

#1


2  

The analytic function ROW_NUMBER can help uniquely order and chunk the rows.

分析函数ROW_NUMBER可以帮助对行进行唯一排序和分块。

Sample Schema

drop table test1;

create table test1(numberid number, created_dt date);

insert into test1
select level, sysdate - level * interval '1' second
from dual connect by level <= 100000;

Sample Query and Results

示例查询和结果

select
    numberid,
    created_dt,
    ceil(row_number() over (order by created_dt desc) / 500) batch_id
from test1
order by created_dt desc;


NUMBERID   CREATED_DT               BATCH_ID
--------   ----------               --------
1          2016-08-02 11:42:48 PM   1
2          2016-08-02 11:42:47 PM   1
3          2016-08-02 11:42:46 PM   1
...
499        2016-08-02 11:34:30 PM   1
500        2016-08-02 11:34:29 PM   1
501        2016-08-02 11:34:28 PM   2
...