暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Oracle DBMS_COMPARISON: ora-23626: 'schema.indexname' 不符合条件的索引错误

askTom 2017-03-03
281

问题描述

嗨,汤姆,

我有一个非常大的表,有超过8.5亿行的数据。我们正在使用CDC将数据从源系统提取到发布目标,并将etl提取到数据仓库和ODS。我需要运行定期检查,以确保两个系统之间的数据同步,如果不同步,则使它们同步。

由于数据规模庞大,我认为DBMS_COMPARISON.CREATE_COMPARISON将是一个很好的候选人。但是,当我尝试运行它时,出现错误: ora-23626: 'schema.indexname' 不合格索引错误。通过文档,我发现表的主键是由7列组成的组合,具有数字和nchar数据类型的组合。结果,它没有通过DBMS_COMPARISON的要求。

你还能推荐什么?减号查询是我唯一的选择吗?

非常感谢你的帮助。
邦妮

专家解答

这里有一些你可以探索的选择

SQL> create table t1 as select * from dba_objects where object_id is not null;

Table created.

SQL> create table t2 as select * from t1;

Table created.

--
-- Now I'll 'manipulate' t2 so its slightly different from t1
--
SQL> delete from t2 where owner = 'SCOTT' and object_name = 'EMP';

1 row deleted.

SQL> update t2 set owner = 'SCOTT2' where object_name = 'DEPT' and owner = 'SCOTT';

1 row updated.
复制


所以我们从传统的减号开始

SQL> set autotrace on
SQL> select *
  2  from
  3  (
  4  select * from t1
  5  minus
  6  select * from t2
  7  union all
  8  select * from t2
  9  minus
 10  select * from t1
 11  );

OWNER                          OBJECT_NAME
------------------------------ ----------------------------------------
SUBOBJECT_NAME
----------------------------------------------------------------------------------------------------
 OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE             CREATED   LAST_DDL_ TIMESTAMP           STATUS  T
---------- -------------- ----------------------- --------- --------- ------------------- ------- -
G S  NAMESPACE
- - ----------
EDITION_NAME
----------------------------------------------------------------------------------------------------
SHARING       E O
------------- - -
SCOTT2                         DEPT

     92609         170186 TABLE                   20-NOV-15 02-MAR-17 2015-11-20:15:05:06 VALID   N
N N          1

NONE            N


1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 633555309

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |   208K|    73M|       | 12003   (1)| 00:00:01 |
|   1 |  VIEW                   |      |   208K|    73M|       | 12003   (1)| 00:00:01 |
|   2 |   MINUS                 |      |       |       |       |            |          |
|   3 |    SORT UNIQUE          |      |   208K|    45M|    62M| 12003   (1)| 00:00:01 |
|   4 |     UNION-ALL           |      |       |       |       |            |          |
|   5 |      MINUS              |      |       |       |       |            |          |
|   6 |       SORT UNIQUE       |      |   104K|    11M|    15M|  3001   (1)| 00:00:01 |
|   7 |        TABLE ACCESS FULL| T1   |   104K|    11M|       |   313   (1)| 00:00:01 |
|   8 |       SORT UNIQUE       |      |   104K|    11M|    15M|  3001   (1)| 00:00:01 |
|   9 |        TABLE ACCESS FULL| T2   |   104K|    11M|       |   313   (1)| 00:00:01 |
|  10 |      TABLE ACCESS FULL  | T2   |   104K|    11M|       |   313   (1)| 00:00:01 |
|  11 |    SORT UNIQUE          |      |   104K|    11M|    15M|  3001   (1)| 00:00:01 |
|  12 |     TABLE ACCESS FULL   | T1   |   104K|    11M|       |   313   (1)| 00:00:01 |
----------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7020  consistent gets
          0  physical reads
          0  redo size
       1856  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
          1  rows processed
复制


问题是-我们能做得更好吗?数据库可以是最好的事情之一是哈希连接。因此 (以更多的SQL代码为代价),我们可以使用完整的外部连接语法进行类似的比较。在下面的示例中,我们假设表的逻辑主键是 “object_id”。

SQL> select *
  2  from (
  3    select t1.object_id t1r, t2.object_id t2r
  4    from t1 full outer join t2
  5    on t1.object_id = t2.object_id
  6  )
  7  where t1r is null or t2r is null;

       T1R        T2R
---------- ----------
     92608

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 53297166

------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |   104K|  2653K|       |   793   (1)| 00:00:01 |
|*  1 |  VIEW                 | VW_FOJ_0 |   104K|  2653K|       |   793   (1)| 00:00:01 |
|*  2 |   HASH JOIN FULL OUTER|          |   104K|  1020K|  1736K|   793   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL  | T1       |   104K|   510K|       |   312   (1)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | T2       |   104K|   510K|       |   312   (1)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."OBJECT_ID" IS NULL OR "T2"."OBJECT_ID" IS NULL)
   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3510  consistent gets
          0  physical reads
          0  redo size
        605  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
复制


现在,这是更快的方式,并且 “工作”,因为它发现缺少 * 行 *,但它没有找到在源和目标中都存在键但数据不同的行。如果这是一部戏剧,那么我们可以使用非键列来增强查询,并使用一些解码进行比较。

SQL>
SQL> select *
  2  from (
  3    select t1.object_id t1r, t2.object_id t2r, t1.owner t1o, t2.owner t2o
  4    from t1 full outer join t2
  5    on t1.object_id = t2.object_id
  6  )
  7  where
  8    t1r is null or
  9    t2r is null or
 10    decode(t1o,t2o,1,0)=0;

       T1R        T2R
---------- ----------
T1O
----------------------------------------------------------------------------------------------------
T2O
----------------------------------------------------------------------------------------------------
     92609      92609
SCOTT
SCOTT2

     92608
SCOTT



2 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 53297166

------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |   104K|    15M|       |   863   (1)| 00:00:01 |
|*  1 |  VIEW                 | VW_FOJ_0 |   104K|    15M|       |   863   (1)| 00:00:01 |
|*  2 |   HASH JOIN FULL OUTER|          |   104K|  2449K|  2456K|   863   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL  | T1       |   104K|  1224K|       |   312   (1)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | T2       |   104K|  1224K|       |   312   (1)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."OBJECT_ID" IS NULL OR "T2"."OBJECT_ID" IS NULL OR
              DECODE("T1"."OWNER","T2"."OWNER",1,0)=0)
   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3511  consistent gets
          0  physical reads
          0  redo size
        816  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed

复制


所以我们可以实现要求,仍然使用联接。

需要注意的另一件事是,真正的 “源” 与 “目标” 比较的一大成本是,如果它们在不同的数据库上,那么我们必须在网络上拖动这些表中的一个。在这种情况下,您可能需要考虑一个潜在的折衷方案,即我们自己进行一些散列,因此我们仅将hashkey拖到电线上,例如

SQL>
SQL> select * from
  2    ( select rowid t1rid, ora_hash(t1.owner||t1.object_name||t1.object_id) t1colhash from t1) t1
  3    full outer join
  4    ( select rowid t2rid, ora_hash(t2.owner||t2.object_name||t2.object_id) t2colhash from t2) t2
  5    on t1.t1colhash = t2.t2colhash
  6  where   t1colhash is null
  7  or t2colhash is null;

T1RID               T1COLHASH T2RID               T2COLHASH
------------------ ---------- ------------------ ----------
                              AAAyZkAAGAAAXOeAA0 2029513192
AAAyZjAAGAAAOCeAA0 1656564078
AAAyZjAAGAAAOCeAAz  583804402

3 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1042339821

------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |   109M|  5216M|       |  1275  (23)| 00:00:01 |
|*  1 |  VIEW                 | VW_FOJ_0 |   109M|  5216M|       |  1275  (23)| 00:00:01 |
|*  2 |   HASH JOIN FULL OUTER|          |   109M|  5216M|  3776K|  1275  (23)| 00:00:01 |
|   3 |    VIEW               |          |   104K|  2551K|       |   312   (1)| 00:00:01 |
|   4 |     TABLE ACCESS FULL | T1       |   104K|  5000K|       |   312   (1)| 00:00:01 |
|   5 |    VIEW               |          |   104K|  2551K|       |   312   (1)| 00:00:01 |
|   6 |     TABLE ACCESS FULL | T2       |   104K|  5000K|       |   312   (1)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1COLHASH" IS NULL OR "T2COLHASH" IS NULL)
   2 - access("T1"."T1COLHASH"="T2"."T2COLHASH")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3511  consistent gets
          0  physical reads
          0  redo size
        877  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          3  rows processed

SQL>
SQL>
SQL>
复制


最后,有时 “分层” 方法也许是进行 “足够好” 检查的一种手段。例如,您可能会做一些由所有者雕刻的事情,例如

SQL> select * from
  2    ( select owner t1own, sum(ora_hash(owner||object_name||object_id)) t1colhash , count(*) t1cnt from t1 group by owner)
  3    full outer join
  4    ( select owner t2own, sum(ora_hash(owner||object_name||object_id)) t2colhash , count(*) t2cnt from t2 group by owner)
  5    on t1.t1own = t2.t2own
  6  where t1own is null
  7  or t1own is null
  8  or t1colhash != t2colhash
  9  or t1cnt != t2cnt;

T1OWN
----------------------------------------------------------------------------------------------------
 T1COLHASH      T1CNT
---------- ----------
T2OWN
----------------------------------------------------------------------------------------------------
 T2COLHASH      T2CNT
---------- ----------


SCOTT2
2029513192          1

SCOTT
4.6253E+10         21
SCOTT
4.4013E+10         19


2 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 632540738

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |    42 |  7728 |   630   (2)| 00:00:01 |
|*  1 |  VIEW                 | VW_FOJ_0 |    42 |  7728 |   630   (2)| 00:00:01 |
|*  2 |   HASH JOIN FULL OUTER|          |    42 |  7728 |   630   (2)| 00:00:01 |
|   3 |    VIEW               |          |    42 |  3864 |   315   (2)| 00:00:01 |
|   4 |     HASH GROUP BY     |          |    42 |  1554 |   315   (2)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| T1       |   104K|  3775K|   312   (1)| 00:00:01 |
|   6 |    VIEW               |          |    42 |  3864 |   315   (2)| 00:00:01 |
|   7 |     HASH GROUP BY     |          |    42 |  1554 |   315   (2)| 00:00:01 |
|   8 |      TABLE ACCESS FULL| T2       |   104K|  3775K|   312   (1)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1OWN" IS NULL OR "T1COLHASH"<>"T2COLHASH" OR
              "T1CNT"<>"T2CNT")
   2 - access("T1"."T1OWN"="T2"."T2OWN")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       3510  consistent gets
          0  physical reads
          0  redo size
        984  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed
复制


这意味着网络上的数据最少,但是如果您 * 确实 * 发现差异,则需要进一步的工作。

希望这有所帮助。

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论