Oracle SQL: Excluding rows whose timestamps are within a few minutes of each other

Time: 2021-11-23 01:30:36

So I have a table of transactions. I need to exclude any transactions that are within 15 minutes of the previous transaction for the same USER ID.

EXAMPLE

USERID          TRANS_TIME  
----------------------------------------  
00000001    24-FEB-17 15.13.51.713000000
00000001    16-MAR-17 10.10.20.781000000
00000001    16-MAR-17 10.10.32.659000000
00000001    16-MAR-17 10.13.04.070000000
00000001    16-MAR-17 10.13.49.339000000
00000001    16-MAR-17 10.22.33.467000000
00000001    16-MAR-17 10.23.09.755000000
00000001    16-MAR-17 10.25.51.994000000
00000001    16-MAR-17 10.26.08.130000000
00000001    29-MAR-17 10.23.01.665000000

So I would end up with 4 rows.

USERID          TRANS_TIME  
----------------------------------------  
00000001    24-FEB-17 15.13.51.713000000
00000001    16-MAR-17 10.10.20.781000000
00000001    16-MAR-17 10.25.51.994000000
00000001    29-MAR-17 10.23.01.665000000

Any ideas or tips on how to code for this? Ideally without creating a function or a procedure.

Cheers.

3 solutions

#1

Interpreting your required logic as follows:

Separately for each userid, include the row with the earliest transaction time. Then, for each row, look to see if it is within 15 minutes (<=) of the most recent included row, and if it is, then exclude this "current" row you are examining. If the new row is not within 15 minutes of the most recently included row, then include this new row.

In other words, there are 15-minute sessions. A row opens a new session if it does not fall within a session opened by an earlier row. In this arrangement, as your desired output demonstrates, it is not enough to compare a row to the one immediately preceding it.
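
To make the session rule concrete, here is a minimal Python sketch of the same logic for a single user. It is only an illustration of the rule described above, not part of the SQL solution; the function name `session_starts` and the variable names are made up for this example, and the timestamps are the ones from the question.

```python
from datetime import datetime, timedelta

def session_starts(times, window=timedelta(minutes=15)):
    """Return the timestamps that open a new session (one user's rows)."""
    kept = []
    for t in sorted(times):
        # Compare against the most recently KEPT row, not the previous row.
        if not kept or t > kept[-1] + window:
            kept.append(t)
    return kept

# Sample data from the question (fractional seconds given in microseconds).
times = [
    datetime(2017, 2, 24, 15, 13, 51, 713000),
    datetime(2017, 3, 16, 10, 10, 20, 781000),
    datetime(2017, 3, 16, 10, 10, 32, 659000),
    datetime(2017, 3, 16, 10, 13, 4, 70000),
    datetime(2017, 3, 16, 10, 13, 49, 339000),
    datetime(2017, 3, 16, 10, 22, 33, 467000),
    datetime(2017, 3, 16, 10, 23, 9, 755000),
    datetime(2017, 3, 16, 10, 25, 51, 994000),
    datetime(2017, 3, 16, 10, 26, 8, 130000),
    datetime(2017, 3, 29, 10, 23, 1, 665000),
]

kept = session_starts(times)
for t in kept:
    print(t)
```

On this data the sketch keeps four rows, matching the desired output in the question: the 10.25.51 row is kept because it is more than 15 minutes after 10.10.20, the row that opened the previous session.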

This problem can be solved very easily with the MATCH_RECOGNIZE clause in Oracle 12.1 and above. Alas, this is not available in Oracle 11 or earlier.

with
     test_data ( userid, trans_time ) as (
       select '00000001', to_timestamp('24-FEB-17 15.13.51.713000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.10.20.781000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.10.32.659000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.13.04.070000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.13.49.339000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.22.33.467000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.23.09.755000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.25.51.994000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.26.08.130000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('29-MAR-17 10.23.01.665000000', 'dd-MON-yy hh24.mi.ss.ff') from dual
     )
-- End of test data (not part of the solution). SQL query begins below this line.
select userid, session_start as trans_time
from   test_data
match_recognize (
  partition by userid
  order by     trans_time
  measures     a.trans_time as session_start
  pattern      ( a b* )
  define       b as b.trans_time <= a.trans_time + interval '15' minute
)
order by userid, trans_time    --   if needed
;

USERID    TRANS_TIME             
--------  ------------------------------
00000001  24-FEB-2017 15.13.51.713000000
00000001  16-MAR-2017 10.10.20.781000000
00000001  16-MAR-2017 10.25.51.994000000
00000001  29-MAR-2017 10.23.01.665000000

#2

Just use lag():

select t.*
from (select t.*,
             lag(trans_time) over (partition by userid order by trans_time) as prev_tt
      from t
     ) t
where prev_tt is null or
      trans_time > prev_tt + (15 / (24 * 60));

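Note: The where clause is better written with interval notation. When a number is added to a TIMESTAMP, Oracle converts the timestamp to a DATE and treats the number as days, which drops the fractional seconds; adding an INTERVAL keeps the result a TIMESTAMP: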

where prev_tt is null or
      trans_time > prev_tt + interval '15' minute;
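
For comparison, the following Python sketch mimics what a previous-row test like the lag() query does, again on the question's sample data. It is a rough model written for illustration (the function name `keep_by_previous_row` is hypothetical), and it assumes a single user so the partitioning is omitted.

```python
from datetime import datetime, timedelta

def keep_by_previous_row(times, window=timedelta(minutes=15)):
    """Keep a row only if it is more than `window` after the row just before it."""
    kept = []
    prev = None
    for t in sorted(times):
        # Mirrors: prev_tt is null or trans_time > prev_tt + interval '15' minute
        if prev is None or t > prev + window:
            kept.append(t)
        prev = t  # the previous row advances whether or not t was kept
    return kept

# Sample data from the question (fractional seconds given in microseconds).
times = [
    datetime(2017, 2, 24, 15, 13, 51, 713000),
    datetime(2017, 3, 16, 10, 10, 20, 781000),
    datetime(2017, 3, 16, 10, 10, 32, 659000),
    datetime(2017, 3, 16, 10, 13, 4, 70000),
    datetime(2017, 3, 16, 10, 13, 49, 339000),
    datetime(2017, 3, 16, 10, 22, 33, 467000),
    datetime(2017, 3, 16, 10, 23, 9, 755000),
    datetime(2017, 3, 16, 10, 25, 51, 994000),
    datetime(2017, 3, 16, 10, 26, 8, 130000),
    datetime(2017, 3, 29, 10, 23, 1, 665000),
]

kept_prev = keep_by_previous_row(times)
for t in kept_prev:
    print(t)
```

On this data the previous-row test keeps three rows (24-FEB, 16-MAR 10.10.20 and 29-MAR). The 16-MAR 10.25.51 row is dropped because it follows 10.23.09 by less than 15 minutes, so this approach fits the literal wording of the question but not the four-row output shown in it.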

#3

With the same assumptions I made in my other answer (using the MATCH_RECOGNIZE clause), here is another way to solve the problem.

This solution uses recursive subquery factoring (recursive CTE), and therefore will work in Oracle 11.2 (but, unfortunately, not in earlier versions).

with
-- Begin test data (not part of the solution)
     test_data ( userid, trans_time ) as (
       [     select ......    SAME AS IN THE OTHER ANSWER     ]
     ),
-- End of test data (not part of the solution). SQL query begins below this line.
     prep ( userid, trans_time, rn ) as (
       select userid, trans_time, 
              row_number() over (partition by userid order by trans_time)
       from   test_data
     ),
     rec ( userid, trans_time, rn, session_start ) as (
       select     userid, min(trans_time), 1, min(trans_time)
         from     prep
         group by userid
       union all
         select   p.userid, p.trans_time, p.rn,
                  case when p.trans_time > r.session_start + interval '15' minute
                       then p.trans_time
                       else r.session_start
                  end
         from     prep p join rec r on p.userid = r.userid and p.rn = r.rn + 1
     )
select   distinct userid, trans_time
from     rec
where    trans_time = session_start
order by userid, trans_time       --   if needed
;
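
The recursive step above carries session_start from one row to the next. As a rough illustration of that idea, here is a short Python sketch that builds the same (trans_time, session_start) pairs with a loop and then filters them the way the final select does. The function name and the five timestamps are hypothetical, chosen only to keep the example small.

```python
from datetime import datetime, timedelta

def with_session_start(times, window=timedelta(minutes=15)):
    """Pair each timestamp with the start of the session it belongs to."""
    rows = []
    session_start = None
    for t in sorted(times):
        # Same decision as the CASE expression in the recursive member.
        if session_start is None or t > session_start + window:
            session_start = t
        rows.append((t, session_start))
    return rows

# Hypothetical data: offsets of 0, 10, 16, 20 and 40 minutes from a base time.
base = datetime(2017, 3, 16, 10, 0, 0)
times = [base + timedelta(minutes=m) for m in (0, 10, 16, 20, 40)]

rows = with_session_start(times)
# Equivalent of: where trans_time = session_start
kept = [t for t, start in rows if t == start]
for t in kept:
    print(t)
```

Here the rows at 0, 16 and 40 minutes open sessions; the row at 20 minutes falls inside the session opened at 16 minutes even though it is more than 15 minutes after the first row.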
