Oracle 在大表上按时间戳 (6) 搜索 (几种方法)

ASKTOM 2019-01-22

1759

问题描述

你好，汤姆，

阅读您对Oracle数据库中不同主题的意见总是很有启发性的。

我们有一个大表，大约一个月增加了2亿条记录。在这个表上，有一些选择每天做一些计算，一个月有一个在整个月运行一些脚本。

数据库: 12.2.0.1

经过一些讨论 (并在互联网上搜索)，我们找到了几种快速选择的解决方案:

Solution one: index on a timestamp(6) column

在此解决方案中，我们在timestamp(6) 列上创建默认索引。在准备好的语句中，我们必须始终传递时间戳。如果我们传递日期，Oracle将不会使用索引。

create table solution1 (
    ts timestamp(6) not null
);
create index idx_solution1 ON solution1 (ts);

-- example select
select * from solution1
where ts >= TIMESTAMP '2019-01-01 00:00:00' and ts < TIMESTAMP '2019-02-01 00:00:00'
order by ts;
复制

Solution two: functional index to_number

此解决方案使用函数索引，该索引将时间戳转换为整数。如果我们使用准备好的语句，则必须以显示方式转换通过的日期。

create table solution2 (
    ts timestamp(6) not null
);
create index idx_solution2 ON solution2 (to_number(to_char(ts,'YYYYMMDD')));

-- example select
select * from solution2
where to_number(to_char(ts,'YYYYMMDD'))
      between to_number(to_char(DATE '2019-01-01','YYYYMMDD'))
          and to_number(to_char(DATE '2019-01-31','YYYYMMDD'));
复制

Solution three: functional index trunc

在net的一些示例中，我们找到了带有trunc() 的函数索引。如果我们使用prepare语句，如果我们传递时间戳，我们必须使用trunc，如果我们使用日期，我们不需要转换它。

create table solution3 (
    ts timestamp(6) not null
);
create index idx_solution3 ON solution3 (trunc(ts));

-- example select
select * from solution3
where trunc(ts)
      between trunc(TIMESTAMP '2019-01-01 00:00:00')
          and trunc(TIMESTAMP '2019-01-31 00:00:00');
复制

Solution four: use additional number field

在解决方案中，我们向表中添加了一个附加列，该列将包含ts字段的日期为YYYYMMDD格式的整数-因此我们可以在数字上构建索引 (8,0)

create table solution4 (
    ts timestamp(6) not null,
    ts_yyyymmdd number(8,0) not null
);
create index idx_solution4 ON solution4 (ts_yyyymmdd);

-- example select
select * from solution4
where ts_yyyymmdd >= 20190101 and ts_yyyymmdd <= 20190131;
复制

Conclusion

我尝试了一些性能测试，但没有找到总是比其他解决方案快得多的解决方案。在文档中，我也找不到提示，时间戳 (6) 上的索引是一个坏主意，即使对于大表，所以我认为这不是问题 (有些人，仍在祈祷其他事实)
我已经搜索了互联网，找到了一个和另一个解决方案 (在不同的上下文中)，但是没有地方有人比较不同的方法。所以我最后在你的办公桌上问了我的问题。

Questions

a) 您会推荐哪种解决方案和wy (或者是否有我们未提及的解决方案)？
b) 有人说，解决方案在很大的表上表现不佳，
因为时间戳 (6) 上的索引将变得非常大，并且可能不适合内存 (如果我们选择一个具有20.000.000行的月份)。他们建议使用解决方案二，这将创建一个更小的索引，并且在大表上会更快-您对此有何看法？

问候，

延斯·米伦霍夫

专家解答

a) 第一种方法-一个普通的B树-是我的首选方法。

这些之间的性能差异将是最小的。毕竟，他们都在读取相同数量的索引条目并访问相同的行。指数的大小可能会生效-但下面会有更多内容。

那么为什么要使用常规索引呢？

这是最一般的。所以将来会支持更多的查询。假设您要搜索较小的时间范围。例如，给定小时内的那些行。常规指数将 “正常工作”。

其他方法可能会退回到全表扫描。至少，他们会读取给定日期的所有索引条目。然后，您必须转到表以过滤掉不需要的行。

也就是说，我会考虑在这个时间戳上分割表。如果您想制定数据归档策略，这也可能对您有所帮助。

不过，您需要先查看对表的其他查询。以及分区将对您的其他索引产生什么影响。

谁说的？他们有什么证据支持这一论点？

让我们比较一下!

因此，我们将创建表格，在其中插入一行，并比较列的大小:

create table solution4 ( 
    ts timestamp(6) not null, 
    ts_yyyymmdd number(8,0) not null 
);

insert into solution4 values ( 
  systimestamp, to_number(to_char(systimestamp,'YYYYMMDD'))
);

select vsize ( ts ), vsize ( ts_yyyymmdd ) 
from   solution4;

VSIZE(TS)   VSIZE(TS_YYYYMMDD)   
         11                    5 复制

所以对于这一行，时间戳是11个字节，而数字表示为5个字节。所以大2倍。

当我们加载数据时会发生什么？

让我们再坚持100万行并创建索引:

insert into solution4
  with rws as (
    select systimestamp + numtodsinterval ( level, 'second' ) ts
    from   dual
    connect by level <= 1000
  )
    select r1.ts,  
           to_number(to_char(r1.ts,'YYYYMMDD'))
    from   rws r1
    cross  join rws;
  
commit;

create index idx_solution4 ON solution4 (ts_yyyymmdd);
create index idx_solution4ts ON solution4 (ts);

exec dbms_stats.gather_table_stats ( user, 'solution4' ) ;
  
select count(*) from solution4;

COUNT(*)   
   1000001复制

那么时间戳索引到底要大多少呢？

select index_name, blevel, leaf_blocks 
from   user_indexes
where  table_name = 'SOLUTION4';

INDEX_NAME        BLEVEL   LEAF_BLOCKS   
IDX_SOLUTION4            2          2370 
IDX_SOLUTION4TS          2          3206 

select segment_name, bytes / 1024 / 1024 
from   dba_segments
where  segment_name in (
  select index_name from user_indexes
  where  table_name = 'SOLUTION4'
);

SEGMENT_NAME      BYTES/1024/1024   
IDX_SOLUTION4                    19 
IDX_SOLUTION4TS                  26复制

因此，就叶块和段大小而言，它仅大约1.3倍。

好的，它更大。但7Mb大100万行？你真的在乎吗？

显然，如果你真的在乎，测试和验证你自己的数据。但是我怀疑您会从首先调整系统的其他部分中获得更大的收益。

oracle asktom

文章转载自ASKTOM，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。

Oracle 在大表上按时间戳 (6) 搜索 (几种方法)

问题描述

专家解答

评论

相关阅读