Treedump – 2

Filed under: Indexing, Infrastructure, Oracle, trace files, Troubleshooting — Jonathan Lewis @ 6:32 pm GMT Mar 7, 2010

In an earlier article about investigating the state of an index in detail I supplied a piece of SQL that would analyse an index (no, not using the Analyze command) and summarise the number of entries in each leaf block that currently held any entries at all. Here's a sample of the type of output it produced:
ROWS_PER_BLOCK     BLOCKS TOTAL_ROWS RUNNING_TOTAL
-------------- ---------- ---------- -------------
             1       8706       8706          8706
             2       4830       9660         13536
             3       2953       8859         16489
             4       1648       6592         18137
             5        997       4985         19134
             6        628       3768         19762
             7        300       2100         20062
             8        162       1296         20224
             9         87        783         20311
            10         52        520         20363
...
           227        100      22700         31871
           228        111      25308         31982
           229         84      19236         32066
           230         97      22310         32163
           231         77      17787         32240
           232         61      14152         32301
           233         54      12582         32355
           234        529     123786         32884
               ---------- ----------
sum                 32884    1887595
This is useful of course as it tells you a lot about the way that space is used in the index – but it only tells you how many blocks are in any given state, it doesn't tell you whereabouts in the index those blocks are. It would be really nice if you could "draw a picture" of the index, showing where the gaps are and where it is densely packed.
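The script from that linked article isn't reproduced here, but for readers who want the flavour of it: summaries like the one above are commonly generated with the undocumented sys_op_lbid() function, which labels each index entry with the address of the leaf block holding it. What follows is a minimal sketch of that technique (not the original script), written against the t1 / t1_i1 demo objects created further down; &index_object_id stands for the object_id of the index, and the hint plus the not null predicate are there to force the query through the index:

select
        rows_per_block,
        count(*)                                        blocks,
        rows_per_block * count(*)                       total_rows,
        sum(count(*)) over (order by rows_per_block)    running_total
from    (
        select  /*+ index(t t1_i1) */
                sys_op_lbid(&index_object_id, 'L', t.rowid)     block_id,
                count(*)                                        rows_per_block
        from    t1 t
        where   vc_id is not null
        group by
                sys_op_lbid(&index_object_id, 'L', t.rowid)
        )
group by
        rows_per_block
order by
        rows_per_block
;

Being undocumented, sys_op_lbid() is best treated as an investigative tool for test systems rather than something to build production code on.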
I was thinking about this a few weeks ago, while waiting for a taxi-cab to take me from a client site to the local train station, and suddenly realised that it was a simple problem: Oracle gives you exactly the tools you need. My train ride back to London was only an hour long, and it gave me just enough time to generate a quick test to demonstrate the principle. ***
We start with an article on the treedump command, which I published a little while ago. Treedump writes a trace file describing your index, one line per block of the index, in correct index order.
----- begin tree dump
branch: 0x40035a 4195162 (0: nrow: 3, level: 2)
branch: 0x40e2c0 4252352 (-1: nrow: 312, level: 1)
leaf: 0x40035b 4195163 (-1: nrow: 240 rrow: 240)
leaf: 0x40215f 4202847 (0: nrow: 218 rrow: 218)
leaf: 0x40035c 4195164 (1: nrow: 444 rrow: 444)
leaf: 0x40035d 4195165 (2: nrow: 444 rrow: 444)
... etc ...
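For reference, the dump is produced by passing the object_id of the index as the "level" of the treedump event; the script below looks the object_id up in user_objects before doing exactly this:

alter session set events 'immediate trace name treedump level &m_object_id';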
This is the key to the problem; all you have to do is read the trace file and you've got all you need to know – except that it's one line per leaf block, and that's a lot of reading for a large index, which will make it hard to spot the points you ought to be interested in.
But it is a trace file – and we can read trace files from SQL by making use of external tables; and we can collapse the data down to a more reasonable size by using analytic functions such as ntile(), which distributes the ordered rows into a requested number of buckets of as nearly equal size as possible. So here's a little script that puts a few simple pieces together and comes up with a fairly useful tool.
rem
rem Script: read_treedump.sql
rem Dated: Feb 2010
rem Author: Jonathan Lewis
rem Purpose: Using external files to analyze an index.
rem
rem Last tested:
rem 10.2.0.3
rem Not tested
rem 11.2.0.1
rem 11.1.0.7
rem 9.2.0.8
rem Not relevant
rem 8.1.7.4 -- no external tables
rem
rem Notes:
rem See read_trace.sql for notes on reading trace files.
rem
rem Required Privileges:
rem grant create any directory to the user.
rem
connect test_user/test
start setenv
set timing off
execute dbms_random.seed(0)
set echo on
drop table t1;
set echo off
begin
begin execute immediate 'purge recyclebin';
exception when others then null;
end;
begin
dbms_stats.set_system_stats('MBRC',8);
dbms_stats.set_system_stats('MREADTIM',26);
dbms_stats.set_system_stats('SREADTIM',12);
dbms_stats.set_system_stats('CPUSPEED',800);
exception
when others then null;
end;
begin execute immediate 'begin dbms_stats.delete_system_stats; end;';
exception when others then null;
end;
begin execute immediate 'alter session set "_optimizer_cost_model"=io';
exception when others then null;
end;
end;
/
-- ------------------------
-- Create a table and index
-- ------------------------
define m_size=100000
create table t1
nologging
as
with generator as (
select --+ materialize
rownum id
from dual
connect by
rownum <= 10000
)
select
rownum id,
lpad(rownum,10,'0') vc_id
from
generator v1,
generator v2
where
rownum <= &m_size
;
create index t1_i1 on t1(vc_id) nologging;
-- ---------------------------------------------------------
-- Create a "FIFO" pattern of space in the index leaf blocks
-- ---------------------------------------------------------
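-- Each successive delete covers half of the remaining key range but keeps
-- a larger fraction of its rows (1 in 100, then 1 in 50, 1 in 25, 1 in 12,
-- 1 in 6), so the low end of the index ends up almost empty while the
-- leaf blocks get progressively fuller towards the high end.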
delete from t1
where
id between 1 and 0.5 * &m_size
and mod(id,100) != 0
;
commit;
delete from t1
where
id between 0.5 * &m_size and 0.75 * &m_size
and mod(id,50) != 0
;
commit;
delete from t1
where
id between 0.75 * &m_size and 0.875 * &m_size
and mod(id,25) != 0
;
commit;
delete from t1
where
id between 0.875 * &m_size and 0.9375 * &m_size
and mod(id,12) != 0
;
commit;
delete from t1
where
id between 0.9375 * &m_size and 0.96875 * &m_size
and mod(id,6) != 0
;
commit;
-- -------------------
-- Dump the index tree
-- -------------------
column object_id new_value m_object_id
select
object_id
from
user_objects
where
object_name = 'T1_I1'
;
alter session set events 'immediate trace name treedump level &m_object_id';
-- -------------------------------
-- Now sort out the trace filename
-- It's easier in 11g
-- -------------------------------
column spid new_value m_spid
select
trim(pro.spid) spid
from
v$process pro,
v$session ses
where
pro.addr = ses.paddr
and ses.sid = (
select sid
from v$mystat
where rownum = 1
)
;
define m_filename = 'd10g_ora_&m_spid..trc'
prompt Trace file name: &m_filename
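-- (In 11g this step is easier: the full path of the session's trace file
-- can be read directly, e.g.
--      select value from v$diag_info where name = 'Default Trace File';
-- v$diag_info doesn't exist in 10g, hence the v$process approach above.)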
-- ------------------------------------------
-- Set up an external table to read the trace
-- ------------------------------------------
define m_file_dir = 'c:\oracle\admin\d10g\udump'
define m_log_dir = 'c:\working';
create or replace directory ext_tab as '&m_file_dir';
create or replace directory log_dir as '&m_log_dir';
drop table ext;
create table ext(text_line varchar(4000))
organization external
(
type oracle_loader
default directory ext_tab
access parameters (
records delimited by newline
characterset us7ascii
discardfile log_dir:'read_trace_dis.txt'
badfile log_dir:'read_trace_bad.txt'
logfile log_dir:'read_trace_log.txt'
fields
rtrim
reject rows with all null fields
(
text_line (1:4000) char
)
)
location ('&m_filename')
) reject limit unlimited
;
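-- Optional sanity check: confirm that the external table can actually see
-- the trace file before running the main query, e.g.
--      select count(*) from ext;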
-- -----------------------------------------------------
--
-- Read the treedump in order, selecting only lines with
-- "leaf" in them.
-- branch: 0x140020a 20972042 (0: nrow: 307, level: 1)
-- leaf: 0x140020b 20972043 (-1: nrow: 326 rrow: 3)
-- leaf: 0x140020c 20972044 (0: nrow: 326 rrow: 3)
--
-- Use instr() to get the text following the last ": ",
-- and select rownum to get an ordering on the lines.
--
-- Use substr() and length() to trim off the final ")"
-- then turn the result into a number and select ntile()
-- to collect the numbers into ordered batches.
--
-- Finally sum the batches - I've used 50 tiles because
-- that's a page length of numbers, you could use 100 or
-- more if you selected the result into a spreadsheet to
-- graph the results.
--
-- -----------------------------------------------------
spool read_treedump
set pagesize 60
select
section,
sum(leaf_row_count) row_count
from
(
select
ntile(50) over (order by rn) section,
to_number(
substr(
text_line,
1,
length(text_line)-1
)
) leaf_row_count
from
(
select
rownum rn,
substr(
text_line,
instr(text_line,':',-1) + 2
) text_line
from
ext
where
instr(text_line,'leaf') != 0
)
)
group by
section
order by
section
;
spool off
If you feel like making more of it, you could probably turn it into a pipelined function taking the index schema and name as input, together with the number of tiles, and returning the data set. You'd need to customise the code to identify the trace file location, and probably have a fixed setup for directories rather than creating them on the fly in the function. Once you've done this you could set up a spreadsheet with a 'select from table_function()' call and pull the data straight into a chart. A sketch of the idea follows.
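By way of illustration only, here is a minimal sketch of what such a pipelined function might look like. None of this comes from the original article: the type and function names are invented, it assumes the directories and external table ext from the script above are already in place, and it glosses over the awkward part (re-pointing ext at the current session's trace file, which you would handle with an 'alter table ext location (...)' once you know the file name):

create or replace type tree_section_row as object (
        section         number,
        row_count       number
);
/

create or replace type tree_section_tab as table of tree_section_row;
/

create or replace function index_picture(
        i_index_name    in varchar2,
        i_tiles         in number default 50
) return tree_section_tab pipelined
as
        m_object_id     number;
begin
        select  object_id
        into    m_object_id
        from    user_objects
        where   object_name = upper(i_index_name)
        and     object_type = 'INDEX';

        -- dump the index tree from this session
        execute immediate
                'alter session set events ' ||
                '''immediate trace name treedump level ' || m_object_id || '''';

        -- assumes ext is already pointing at this session's trace file
        for r in (
                select  section, sum(leaf_row_count) row_count
                from    (
                        select
                                ntile(i_tiles) over (order by rn)  section,
                                to_number(
                                        substr(text_line, 1, length(text_line) - 1)
                                )                                  leaf_row_count
                        from    (
                                select
                                        rownum  rn,
                                        substr(
                                                text_line,
                                                instr(text_line, ':', -1) + 2
                                        )       text_line
                                from    ext
                                where   instr(text_line, 'leaf') != 0
                                )
                        )
                group by section
                order by section
        ) loop
                pipe row (tree_section_row(r.section, r.row_count));
        end loop;

        return;
end;
/

A spreadsheet (or SQL*Plus session) could then pull the picture with something like: select * from table(index_picture('T1_I1', 100));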
Here's the output I got from this demonstration case. As you can see, it shows that the index starts with a great long tail that is thinly populated, and that a large fraction of the data is packed into the last 10% of the index, with most of that being in the right-hand 4% of the index.
   SECTION  ROW_COUNT
---------- ----------
         1         22
         2         23
         3         23
         4         23
         5         23
         6         22
         7         23
         8         20
         9         19
        10         20
        11         19
        12         20
        13         20
        14         19
        15         20
        16         19
        17         20
        18         19
        19         20
        20         20
        21         19
        22         20
        23         19
        24         20
        25         31
        26         39
        27         39
        28         40
        29         39
        30         39
        31         39
        32         39
        33         39
        34         39
        35         39
        36         39
        37         40
        38         71
        39         78
        40         78
        41         79
        42         78
        43         78
        44        115
        45        163
        46        163
        47        201
        48        326
        49       1368
        50       1874
           ----------
sum              5665
Reminder: a treedump walks the index in order, reading one block at a time – this can be a lot of work and take a lot of time. There are also versions of Oracle where it does a full symbolic block dump for every leaf block in the index, so test it on a very small index before trying to dump a large one.
*** Footnote: like many of the scripts I've published on this blog, the SQL shown here is "good enough" for me to use because I know how it works and I won't have to use it often. It's not intended to be a bullet-proof program in a "productised" state.
Update Dec 2010 (see comment 7 below): each line of the treedump is about 75 bytes long, so the whole trace file will be in the region of (75 * leaf_blocks / 1048576) MB in size. Make sure that you can dump a trace file of that size before you start; check the parameter max_dump_file_size.
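For example, an index with 1,000,000 leaf blocks implies a trace file of roughly 75 * 1,000,000 / 1,048,576, i.e. about 72MB.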