暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Oracle 带有映射的PL SQL批量插入

askTom 2017-04-04
255

问题描述

我有一个表Trans,其中包含大约100亿条记录在列 “customer_id” 上具有索引。

表结构: FCT_Trans

customer_id varchar2(20),trans_date日期,trans_type varchar2(10),Cust_name varchar2(100) 等

我还有另一个表dim_accts,其中包含大约600万条记录,并且在 “customer_id” 上还有一个索引

表结构: dim_acct

customer_id varchar2(100),acct_stats_date日期等。

现在,我想考虑基于customer_id的表 “DIM_ACCTS” 中的 “acct_stats_date” 之前的表 “FCT_Trans” 中的最后四个事务作为加联键。

请帮助我设计此逻辑,因为简单的光标花费了太多时间。

请分享一个示例脚本,该脚本可以后缀此类要求


提问的最新情况:

我想提取每个客户的最后四笔交易。

专家解答

所以这里有两个潜在的选择需要考虑。

选项1) 客户有定期活动,因此他们的所有数据都是最近的。

在这种情况下,对事实表进行分区,然后简单地蛮力扫描最近的分区将是最有效的。我不会进一步追求这一点,因为你会做什么相对明显

选项2) 客户有零星的活动。

这是一个更艰难的主张,因为我们不知道需要多少时间跨度来扫描整个表。所以我们可以用不同的方式来解决它。这是我的设置

SQL> create table fct_trans ( customer_id int not null, trans_date date not null, trans_type varchar2(10), Cust_name varchar2(100), numcol int not null);

Table created.

SQL>
SQL> create table dim_acct ( customer_id int not null, acct_stats_date date not null );

Table created.

SQL>
SQL> insert /*+ append */ into fct_trans
  2  select mod(rownum,500),
  3         date '2016-01-01'+
  4             case when mod(rownum,500) < 100 then r
  5                  when mod(rownum,500) < 300 then r*2 + rownum/30000
  6                  else r*3+ rownum/30000
  7             end,
  8         'T', 'CUST', rownum
  9  from
 10  ( select 1 from dual connect by level <= 10000 ),
 11  ( select dbms_random.value(1,200) r from dual connect by level <= 1000 )
 12  /

10000000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> insert into dim_acct
  2  select rownum-1,
  3         date '2017-04-01'+ dbms_random.value(1,10)
  4  from dual
  5  connect by level <= 500;

500 rows created.

SQL>
SQL> commit;

Commit complete.

SQL> create index fct_ix on fct_trans ( customer_id, trans_date ) nologging;

Index created.

SQL> create index dim_ix on dim_acct ( customer_id );

Index created.

SQL>
SQL> exec dbms_stats.gather_table_stats('','dim_acct');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('','fct_trans');

PL/SQL procedure successfully completed.



我们正在使用的逻辑取决于以下内容。让我们找到仅1个客户的最后4个txns

SQL> set autotrace traceonly explain
SQL> select *
  2  from (
  3  select  *
  4  from fct_trans t
  5  where customer_Id = 123
  6  and   trans_date <= date '2017-01-01'
  7  order by trans_date desc
  8  )
  9  where rownum <= 4;

Execution Plan
----------------------------------------------------------
Plan hash value: 4118863517

--------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |     4 |   376 |     8   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |           |       |       |            |          |
|   2 |   VIEW                         |           |     4 |   376 |     8   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | FCT_TRANS |  7838 |   191K|     8   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| FCT_IX    |     4 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------


我们可以看到这将是有效的,因为我们简单地 “向后走” 向下的索引最后4行,我们就完成了。因此,我们现在需要将其抽象为所有客户的情况。在12c中,我们可以使用横向,但是在此之前,我们需要对cast/multisset进行一些欺骗,以具有相关的视图

SQL> create or replace type fct_obj as object ( customer_id int , trans_date date , trans_type varchar2(10), Cust_name varchar2(100), numcol int )
  2  /

Type created.

SQL>
SQL> create or replace type fct_list as table of fct_obj
  2  /

Type created.

SQL>
SQL> set autotrace traceonly explain
SQL> select *
  2  from dim_acct a,
  3  table(cast(multiset(
  4        select *
  5            from (
  6              select /*+ index_desc(t) first_rows(4) */ *
  7              from fct_trans t
  8              where customer_Id = a.customer_id
  9              and   trans_date <= a.acct_stats_date
 10              order by trans_date desc
 11            )
 12      where rownum <= 4
 13      ) as fct_list )
 14    ) rws;

Execution Plan
----------------------------------------------------------
Plan hash value: 3809679229

------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                 |     4 |    56 |     4   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                            |                 |     4 |    56 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL                      | DIM_ACCT        |     1 |    12 |     2   (0)| 00:00:01 |
|   3 |   COLLECTION ITERATOR SUBQUERY FETCH     |                 |     4 |     8 |     2   (0)| 00:00:01 |
|*  4 |    COUNT STOPKEY                         |                 |       |       |            |          |
|   5 |     VIEW                                 | VW_LAT_1BBF5C63 |  9888 |   907K| 10449   (1)| 00:00:01 |
|*  6 |      SORT ORDER BY STOPKEY               |                 |  9888 |   241K| 10449   (1)| 00:00:01 |
|   7 |       TABLE ACCESS BY INDEX ROWID BATCHED| FCT_TRANS       |  9888 |   241K| 10448   (1)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN DESCENDING       | FCT_IX          |  9888 |       |    36   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(ROWNUM<=4)
   6 - filter(ROWNUM<=4)
   8 - access("CUSTOMER_ID"="A"."CUSTOMER_ID" AND "TRANS_DATE"<="A"."ACCT_STATS_DATE")


现在 * 几乎 * 看起来不错,直到我们看到 “排序顺序” 和估计的行,这表明我们不会只得到行,我们会得到很多行,然后对它们进行排序。性能测试确认了大量一致的get

SQL> set autotrace traceonly stat
SQL> select *
  2  from dim_acct a,
  3  table(cast(multiset(
  4        select *
  5            from (
  6              select /*+ index_desc(t) first_rows(4) */ *
  7              from fct_trans t
  8              where customer_Id = a.customer_id
  9              and   trans_date <= a.acct_stats_date
 10              order by trans_date desc
 11            )
 12      where rownum <= 4
 13      ) as fct_list )
 14    ) rws;

1960 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
    6880133  consistent gets
          0  physical reads
          0  redo size
      68088  bytes sent via SQL*Net to client
       2038  bytes received via SQL*Net from client
        132  SQL*Net roundtrips to/from client
        500  sorts (memory)
          0  sorts (disk)
       1960  rows processed


因此,我将使用管道功能来解决此问题。可能还有其他方法。

SQL> create or replace
  2  function last_4_trans(p_cust int, p_stats date) return fct_list pipelined is
  3  begin
  4    for i in (
  5    select * from (
  6      select /*+ index_desc(t) first_rows(4) */ *
  7                from fct_trans t
  8                where customer_Id = p_cust
  9                and   trans_date <= p_stats
 10              order by trans_date desc
 11         ) where rownum <= 4
 12    ) loop
 13       pipe row ( fct_obj(i.customer_id  , i.trans_date  , i.trans_type , i.Cust_name, i.numcol ));
 14     end loop;
 15     return;
 16  end;
 17  /

Function created.

SQL>
SQL> set autotrace traceonly stat
SQL> select *
  2  from dim_acct a,
  3       table(last_4_trans(a.customer_Id,a.acct_stats_date));

1960 rows selected.


Statistics
----------------------------------------------------------
        549  recursive calls
          0  db block gets
       3743  consistent gets
          0  physical reads
          0  redo size
      68088  bytes sent via SQL*Net to client
       2038  bytes received via SQL*Net from client
        132  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1960  rows processed



「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论