Strange random behavior in the WHERE clause

Time: 2022-06-29 10:17:43

I have a table like this:

     Id | GroupId | Category
     ------------------------
     1  | 101     | A
     2  | 101     | B
     3  | 101     | C
     4  | 103     | B
     5  | 103     | D
     6  | 103     | A
     ........................

I need to select one GroupId at random. To do this, I used the following PL/SQL block:

declare v_group_count number;
  v_group_id number;
begin 
  select count(distinct GroupId) into v_group_count from MyTable;
  SELECT GroupId into v_group_id  FROM
  (
    SELECT GroupId, ROWNUM RN FROM 
    (SELECT DISTINCT GroupId FROM MyTable)
  )
  WHERE RN=Round(dbms_random.value(1, v_group_count));
end;

Because I round the random value, it is an integer, so the condition WHERE RN=Round(dbms_random.value(1, v_group_count)) should always return exactly one row. Usually it does return one row, as expected. But strangely, it sometimes returns no rows and sometimes returns two. That is why this part raises an error:

SELECT GroupId into v_group_id

Does anyone know the reason for this behaviour?

2 Answers

#1


23  

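round(dbms_random.value(1, v_group_count)) is evaluated again for every row, so each row is compared against its own fresh random number. Any row may or may not match, which is why the query can return zero, one, or several rows.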
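
One way to avoid the per-row evaluation is to draw the random position once, before the query runs, and compare every row against that single value. The block below is a minimal sketch of that idea rather than a definitive fix: the variable v_pick is a name introduced for this example, and the block assumes MyTable contains at least one row (an empty table would still raise NO_DATA_FOUND).

```sql
DECLARE
  v_group_count NUMBER;
  v_pick        NUMBER;  -- hypothetical helper variable, not in the original code
  v_group_id    NUMBER;
BEGIN
  SELECT COUNT(DISTINCT GroupId) INTO v_group_count FROM MyTable;

  -- Draw the random position exactly once, outside the query.
  -- dbms_random.value(low, high) returns a value >= low and < high,
  -- so FLOOR over [1, count + 1) yields an integer from 1 to count.
  v_pick := FLOOR(dbms_random.value(1, v_group_count + 1));

  SELECT GroupId INTO v_group_id
  FROM (
    SELECT GroupId, ROWNUM AS rn
    FROM (SELECT DISTINCT GroupId FROM MyTable)
  )
  WHERE rn = v_pick;  -- every row is compared against the same value
END;
/
```

Because v_pick is fixed before the SELECT starts, exactly one rn can equal it, so the SELECT ... INTO receives a single row.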


P.S.

ROUND is a bad choice.

dbms_random.value(1, 10) is spread evenly over [1, 10), a range of width 9, and ROUND maps only a half-width interval to each edge value. The probability of getting either edge value (1 or 10) is therefore half the probability of getting any other value (2 to 9).
It is 0.0555... (0.5/9 = 1/18) vs. 0.111... (1/9)

[  1,1.5) --> 1
[1.5,2.5) --> 2
.
.
.
[8.5,9.5) --> 9
[9.5, 10) --> 10

select          n,count(*)
from           (select          round(dbms_random.value(1, 10)) as n
                from            dual
                connect by      level <= 100000
                )
group by        n
order by        n
;

    N   COUNT(*)
    1   5488
    2   11239
    3   11236
    4   10981
    5   11205
    6   11114
    7   11211
    8   11048
    9   10959
    10  5519

My recommendation is to use FLOOR on dbms_random.value(1, N+1), which gives each of the values 1 to N an interval of width 1:

    select          n,count(*)
    from           (select          floor(dbms_random.value(1, 11)) as n
                    from            dual
                    connect by      level <= 100000
                    )
    group by        n
    order by        n   
    ;              

N   COUNT(*)
1   10091
2   10020
3   10020
4   10021
5   9908
6   10036
7   10054
8   9997
9   9846
10  10007              

#2


8  

If you want to select one randomly:

declare
  v_group_id number;
begin
  -- Sort the distinct groups in random order, then keep the first one.
  SELECT GroupId into v_group_id
  FROM (SELECT DISTINCT GroupId
        FROM MyTable
        ORDER BY dbms_random.value
       ) t
  WHERE rownum = 1;
end;
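
The same shuffle-and-take-first idea can also be written as a single SQL statement. The query below is only a sketch and rests on two assumptions not stated in the thread: that the database is Oracle 12c or later (for the FETCH FIRST row-limiting clause), and that MyTable has at least one row. DISTINCT is applied in its own inline view here, since Oracle may reject an ORDER BY expression that is not in the select list of a SELECT DISTINCT (ORA-01791).

```sql
-- Pick one GroupId at random: deduplicate first, shuffle, keep one row.
SELECT GroupId
FROM (SELECT DISTINCT GroupId FROM MyTable)
ORDER BY dbms_random.value       -- random sort key per distinct group
FETCH FIRST 1 ROW ONLY;          -- row-limiting clause, Oracle 12c+
```

Here the random value is used only as a sort key, not as a filter, so evaluating it once per row is exactly what is wanted, and each distinct GroupId is equally likely to come first.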
