一、Dynamic Sampling概念
The purpose of dynamic sampling is to improve server performance by determining more accurate selectivity and cardinality estimates.
More accurate selectivity and cardinality estimates allow the optimizer to produce better performing plans.
(动态采样的目的是为了通过更精确的seletivity值cardinality值来提供服务器性能,更精确的seletivity值cardinality值可以让优化器提供更好的执行计划。)
Estimate single-table predicate selectivities when collected statistics cannot be used or are likely to lead to significant errors in estimation.
Estimate table cardinality for tables without statistics or for tables whose statistics are too out of date to trust.
(当没有使用statistics统计信息或者可能导致评估错误的时候,可以提前预估出来单表的selectivities值。当没有收集统计信息时,或者表的统计信息过期的时候,可以估算出表的cardinality值。)
二、optimizer_dynamic_sampling
该参数是用来设定Oracle动态采样的级别。
Oracle为动态采样划分了11个级别,在Oracle的官方文档上有详细的介绍,这里列举如下:
Level 0:dynamic sampling will not be done.
(动态采样不会收集)
Level 1:dynamic sampling will be performed if all of the following conditions are true:
(1)、There is more than one table in the query.
(2)、Some table has not been analyzed and has no indexes.
(3)、The optimizer determines that a relatively expensive table scan would be required for this unanalyzed table.
(如果下列条件全部满足的时候,那么动态采样将被执行:
(1)、有超过一个表的查询;
(2)、一些表没有被分析,而且没有index;
(3)、优化器认为这个没有被分析的表会消耗相当昂贵的表扫描资源。)
Level 2:Apply dynamic sampling to all unanalyzed tables.
The number of blocks sampled is the default number of dynamic sampling blocks.
(针对所有没有被分析的表应用动态采样,采样blocks的数量是默认的动态采样的数量)
Level 3:Apply dynamic sampling to all tables that meet Level 2 criteria,plus all tables for which standard selectivity estimation used a guess for some predicate that is a potential dynamic sampling predicate.
The number of blocks sampled is the default number of dynamic sampling blocks.
(根据level 2的标准,应用动态采样到所有的表,以及为一些标准selectivity值的表使用一些采样预测,采样blocks的数量是默认的动态采样的数量)
Level 4:Apply dynamic sampling to all tables that meet level 3 criteria,plus all tables that have sigle-table predicates that reference 2 or more columns.
The number of blocks sampled is the default number of dynamic sampling blocks.
(根据level 3标准,应用动态采样到所有的表,以及一些大于2列的单表的预测。采样blocks的数量是默认的动态采样的数量)
Level 5~9:采样的表包含满足Level 4定义的所有表,同时分别使用动态采样默认数据块数的2、4、8、32、128倍的数量来进行动态采样。
Level 10:Apply dynamic sampling to all tables that meet the level 9 criteria using all blocks in the table.
(根据Level 9的标准,应用动态采样到所有的表,并且采样表中所有的blocks)
在这里看一下Reference对该参数的说明:
三、关于该参数的例子
----创建jack表,进行一次全表扫描---- 1 SQL> create table jack as select * from dba_objects; 2 3 Table created. 4 5 SQL> set autotrace trace exp; 6 7 SQL> select count(*) from jack; 8 9 Execution Plan10 ----------------------------------------------------------11 Plan hash value: 120502350112 13 -------------------------------------------------------------------14 | Id | Operation | Name | Rows | Cost (%CPU)| Time |15 -------------------------------------------------------------------16 | 0 | SELECT STATEMENT | | 1 | 186 (1)| 00:00:03 |17 | 1 | SORT AGGREGATE | | 1 | | |18 | 2 | TABLE ACCESS FULL| JACK | 59354 | 186 (1)| 00:00:03 |19 -------------------------------------------------------------------20 21 Note22 -----23 - dynamic sampling used for this statement (level=2) ----数据库默认下level=224 ----将参数改成4,进行查询---- 25 SQL> alter session set optimizer_dynamic_sampling=4;26 27 Session altered.28 29 SQL> select count(*) from jack;30 31 Execution Plan32 ----------------------------------------------------------33 Plan hash value: 120502350134 35 -------------------------------------------------------------------36 | Id | Operation | Name | Rows | Cost (%CPU)| Time |37 -------------------------------------------------------------------38 | 0 | SELECT STATEMENT | | 1 | 186 (1)| 00:00:03 |39 | 1 | SORT AGGREGATE | | 1 | | |40 | 2 | TABLE ACCESS FULL| JACK | 59354 | 186 (1)| 00:00:03 |41 -------------------------------------------------------------------42 43 Note44 -----45 - dynamic sampling used for this statement (level=4) ----level=4的时候Rows值没有发生变化46 ----将参数改成8,---- 47 SQL> alter session set optimizer_dynamic_sampling=8;48 49 Session altered.50 51 SQL> select count(*) from jack;52 53 Execution Plan54 ----------------------------------------------------------55 Plan hash value: 120502350156 57 -------------------------------------------------------------------58 | Id | Operation | Name | Rows | Cost (%CPU)| Time |59 -------------------------------------------------------------------60 | 0 | SELECT STATEMENT | | 1 | 186 (1)| 00:00:03 |61 | 1 | SORT AGGREGATE | | 1 | | |62 | 2 | TABLE ACCESS FULL| JACK | 72320 | 186 (1)| 00:00:03 |63 -------------------------------------------------------------------64 65 Note66 -----67 - dynamic sampling used for this statement (level=8) ----当参数为8时,动态采样出来的Rows已经很接近实际值了。68 ----进行一次信息统计---- 69 SQL> exec dbms_stats.gather_table_stats(user,'jack');70 71 PL/SQL procedure successfully completed.72 73 SQL> select count(*) from jack;74 75 Execution Plan76 ----------------------------------------------------------77 Plan hash value: 120502350178 79 -------------------------------------------------------------------80 | Id | Operation | Name | Rows | Cost (%CPU)| Time |81 -------------------------------------------------------------------82 | 0 | SELECT STATEMENT | | 1 | 186 (1)| 00:00:03 |83 | 1 | SORT AGGREGATE | | 1 | | |84 | 2 | TABLE ACCESS FULL| JACK | 72542 | 186 (1)| 00:00:03 |85 ------------------------------------------------------------------- ----进行信息统计以后就不使用动态采样了,而且计算出来的值非常接近实际值 86 ----最后查看一下实际值是多少----87 SQL> set autotrace off;88 SQL> select count(*) from jack;89 90 COUNT(*)91 ----------92 72542
上面的情况是使用alter session的语句改变参数的大小,下面看一下使用hint的情况:
(备注:alter session 和alter system 的效果是一样的。)
----创建跟jack表一样的echo表---- 1 SQL> create table echo as select * from jack; 2 3 Table created. 4 5 SQL> set autotrace trace exp; 6 ----通过hint设置level为2---- 7 SQL> select /*+ dynamic_sampling(echo 2) */ count(*) from echo; 8 9 Execution Plan10 ----------------------------------------------------------11 Plan hash value: 9910917612 13 -------------------------------------------------------------------14 | Id | Operation | Name | Rows | Cost (%CPU)| Time |15 -------------------------------------------------------------------16 | 0 | SELECT STATEMENT | | 1 | 186 (1)| 00:00:03 |17 | 1 | SORT AGGREGATE | | 1 | | |18 | 2 | TABLE ACCESS FULL| ECHO | 63977 | 186 (1)| 00:00:03 |19 -------------------------------------------------------------------20 21 Note22 -----23 - dynamic sampling used for this statement (level=8)24 25 SQL> select /*+ dynamic_sampling(echo 4) */ count(*) from echo;26 27 Execution Plan28 ----------------------------------------------------------29 Plan hash value: 9910917630 31 -------------------------------------------------------------------32 | Id | Operation | Name | Rows | Cost (%CPU)| Time |33 -------------------------------------------------------------------34 | 0 | SELECT STATEMENT | | 1 | 185 (0)| 00:00:03 |35 | 1 | SORT AGGREGATE | | 1 | | |36 | 2 | TABLE ACCESS FULL| ECHO | 15806 | 185 (0)| 00:00:03 |37 -------------------------------------------------------------------38 39 Note40 -----41 - dynamic sampling used for this statement (level=8)42 43 SQL> select /*+ dynamic_sampling(echo 8) */ count(*) from echo;44 45 Execution Plan46 ----------------------------------------------------------47 Plan hash value: 9910917648 49 -------------------------------------------------------------------50 | Id | Operation | Name | Rows | Cost (%CPU)| Time |51 -------------------------------------------------------------------52 | 0 | SELECT STATEMENT | | 1 | 186 (1)| 00:00:03 |53 | 1 | SORT AGGREGATE | | 1 | | |54 | 2 | TABLE ACCESS FULL| ECHO | 72542 | 186 (1)| 00:00:03 |55 -------------------------------------------------------------------56 57 Note58 -----59 - dynamic sampling used for this statement (level=8) 总结: 当level=4时,Rows的值居然变小了,这是一个很奇怪的问题。不过当level=8时,Rows值已经接近实际值了。
由此我们可以看出OPTIMIZER_DYNAMIC_SAMPLING parameter 效果不同于 DYNAMIC_SAMPLING query hint的,在这里就不继续讨论了,如果想了解的话可以看Loucs的
总结:
动态采样也需要额外的消耗数据库资源,所以,在一个这样的环境中,SQL被反复的执行,变量被绑定,硬分析很少----是不宜使用动态采样的,就像OLTP系统。动态采样发生的硬分析时候,如果很少有硬分析发生,动态采样的意义不大。
而OLAP或者数据仓库环境下,SQL的执行消耗的资源要远远大于SQL解析,那么让解析再消耗多一点资源做一些动态采样分析,从而做出一个最优的执行计划是非常值得的。建议,在OLAP或者数据仓库环境中,将动态采样的Level设置为3或者4是比较好,相反,在一个OLTP系统下,不应该使用动态采样。