[20180626]函数与标量子查询14.txt

时间:2023-03-09 03:15:41
[20180626]函数与标量子查询14.txt

[20180626]函数与标量子查询14.txt

--//前面看http://www.cnblogs.com/kerrycode/p/9099507.html链接,里面提到:

通俗来将,当使用标量子查询的时候,ORACLE会将子查询结果缓存在哈希表中, 如果后续的记录出现同样的值,优化器通过缓存在哈希
表中的值,判断重复值不用重复调用函数,直接使用上次计算结果即可。从而减少调用函数次数,从而达到优化性能的效果。另外在
ORACLE 10和11中, 哈希表只包含了255个Buckets,也就是说它能存储255个不同值,如果超过这个范围,就会出现散列冲突,那些出现
散列冲突的值就会重复调用函数,即便如此,依然能达到大幅改善性能的效果。

--//前几天测试11.2.0.4 for linux下,哈希表不止255个Buckets.
--//另外也测试再10g下到底有512个Buckets.11.2.0.4是1024.
--//昨天看链接http://www.cnblogs.com/kerrycode/p/9223093.html =>ORACLE中Scalar subquery Caching的hash table大小测试浅析
--//里面建立一个函数,很容易确定调用函数的次数.重复测试,另外前面的测试有点慢,看看我现在的测试是否快一点.^_^ .

1.环境:
SCOTT@book> @ &r/ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

create or replace function f( x in varchar2 ) return number
as
begin
        dbms_application_info.set_client_info(userenv('client_info')+1 );
        return length(x);
end;
/

2.建立测试脚本:
create table t as select rownum id1,mod(rownum-1,10000)+1 id2 from dual connect by level<=20000;
--//分析表略.
create table t1 ( a number ,b number);
--//字段a 表示调用函数次数.

--//建立脚本cy.txt
exec dbms_application_info.set_client_info(0);
set term off
exec :x := &&1;
select count(distinct f_id2) from (select id2,(select f(id2) from dual) as f_id2 from t where id2 <= :x );
set term on
insert into t1 values (userenv('client_info') ,:x) ;
commit ;

--//建立shell脚本cy.sh:
#! /bin/bash
sqlplus -s -l scott/book <<EOF >> hz.txt
variable x number;
$(seq 10000 | xargs -I{} echo @cy.txt {})
quit
EOF

3.另外的测试脚本:
--//自己也写PL/SQL脚本.仅仅为了学习...
declare
x number;
  begin
   for i in 1..10000 loop
     dbms_application_info.set_client_info(0);
     select count(distinct f_id2) into x from (select id2,(select f(id2) from dual) as f_id2 from t where id2 <= i );  
     insert into t1 values (userenv('client_info') ,i) ;
     commit ;
   end loop;
end;
/

--//在家里的12c for windows测试看看:
SCOTT@test01p> @ ver1
PORT_STRING                    VERSION        BANNER                                                                               CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0           12.1.0.1.0     Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0

select max(id2) from (
SELECT id2, r, rp
  FROM (  SELECT b id2, a r, LAG (a) OVER (ORDER BY b) rp
            FROM t1
        ORDER BY b)
 WHERE r - rp = 1 order by id2);

MAX(ID2)
----------
      9234
--//说明:我前面的测试id2 记录的数量加倍,而这次记录的是变量:X,这样不要在除以2.

SELECT id2, r, rp
  FROM (  SELECT b  id2, a r, LAG (a) OVER (ORDER BY b) rp
            FROM t1
        ORDER BY b)
 WHERE r - rp >= 2 and id2<=9234 ;
...
8210 rows selected.

--// 9234-8210 = 1024 ,与前面测试一样.

4.验证测试是否正确:
--//执行以上脚本.cy.sh,等....

select max(id2) from (
SELECT id2, r, rp
  FROM (  SELECT b id2, a r, LAG (a) OVER (ORDER BY b) rp
            FROM t1
        ORDER BY b)
 WHERE r - rp = 1 order by id2);
    MAX(ID2)
------------
        9234
--//9234还会进入buckupset,后面的数字带入都是出现hash 冲突的情况.

SELECT id2, r, rp
  FROM (  SELECT b id2, a r, LAG (a) OVER (ORDER BY b) rp
            FROM t1
        ORDER BY b)
 WHERE r - rp >= 2 and id2<=9234 ;
...

8210 rows selected.
 
--//输出太长,一共8210个值,略,这个结果就是在1-9234之间,出现hash冲突的值.
select count(*) from
(SELECT id2, r, rp
  FROM (  SELECT b id2, a r, LAG (a) OVER (ORDER BY b) rp
            FROM t1
        ORDER BY b)
 WHERE r - rp >= 2 and id2<=9234 );

COUNT(*)
------------
        8210

--//9234-8210 = 1024 ,可以看出11.2.0.4标量子查询的哈希表大小是1024个buckets.
--//后面的测试不再做了.