暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Oracle 根据上次事件日期检查是否存在的最佳方法

ASKTOM 2020-02-25
235

问题描述

你好,

我试图找出在sql中包含条件的最佳方法,以检查用户在特定日期之前发生的最新事件。

以下是我的表格设置和简要说明
USER_NP => 列出所有用户的父表 (在原始情况下,它将被连接到多个表以构建结果集)
USER_LOGONS_NP => 记录每个用户登录时间的表; 用于存储有限时间的数据
USER_LOGON_HIST_NP => 存储存档登录详细信息的表 (来自USER_LOGONS_NP)

testcase sql背后的要求是生成上次登录时间早于 (或早于) 特定日期的用户列表。

请你指教好吗?

下面是我的测试案例

drop table user_np purge ;  
drop table user_logons_np purge ;  
drop table user_logon_hist_np purge ;  
  
  
create table user_np as select * from all_users ;  
create table user_logons_np as select user_id, logon_date, dbms_random.string('a', 50) pad_data from all_users, (select sysdate - level as logon_date from dual connect by level <= 50) ;  
create table user_logon_hist_np as select user_id, logon_date, dbms_random.string('a', 50) pad_data from all_users, (select sysdate - (level - 51) as logon_date from dual connect by level <= 500) ;  
  
  
drop index uhn_idx1 ;  
drop index uln_idx1 ;  
  
  
create index uhn_idx1 on user_logon_hist_np(user_id, logon_date) ;  
create index uln_idx1 on user_logons_np(user_id, logon_date) ;  
  
  
exec dbms_stats.gather_table_stats(user, 'USER_NP', cascade => true) ;  
exec dbms_stats.gather_table_stats(user, 'USER_LOGONS_NP', cascade => true) ;  
exec dbms_stats.gather_table_stats(user, 'USER_LOGON_HIST_NP', cascade => true) ;  
  
  
explain plan for SELECT   
   *  
FROM  
   user_np  
WHERE  
   EXISTS (  
      SELECT  
         user_id  
      FROM  
         (  
            SELECT  
               user_id,  
               MAX(logon_date)  
            FROM  
               user_logons_np  
            GROUP BY user_id  
            HAVING  
               MAX(logon_date) <= SYSDATE  
            UNION  
            SELECT  
               user_id,  
               MAX(logon_date)  
            FROM  
               user_logon_hist_np  
            GROUP BY user_id  
            HAVING  
               MAX(logon_date) <= SYSDATE  
         ) a  
      WHERE  
         a.user_id = user_np.user_id  
   );  
  
  
Plan hash value: 1460566721  
  
----------------------------------------------------------------------------------  
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |  
----------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT      |          |     1 |    34 |    68   (8)| 00:00:01 |  
|*  1 |  HASH JOIN SEMI       |          |     1 |    34 |    68   (8)| 00:00:01 |  
|   2 |   TABLE ACCESS FULL   | USER_NP  |    32 |   672 |     6   (0)| 00:00:01 |  
|   3 |   VIEW                |          |     4 |    52 |    62   (9)| 00:00:01 |  
|   4 |    SORT UNIQUE        |          |     4 |    48 |    62   (9)| 00:00:01 |  
|   5 |     UNION-ALL         |          |       |       |            |          |  
|*  6 |      FILTER           |          |       |       |            |          |  
|   7 |       HASH GROUP BY   |          |     2 |    24 |     7  (15)| 00:00:01 |  
|   8 |        INDEX FULL SCAN| ULN_IDX1 |  1600 | 19200 |     6   (0)| 00:00:01 |  
|*  9 |      FILTER           |          |       |       |            |          |  
|  10 |       HASH GROUP BY   |          |     2 |    24 |    55   (8)| 00:00:01 |  
|  11 |        INDEX FULL SCAN| UHN_IDX1 | 16000 |   187K|    52   (2)| 00:00:01 |  
----------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
  
   1 - access("A"."USER_ID"="USER_NP"."USER_ID")  
   6 - filter(MAX("LOGON_DATE")<=SYSDATE@!)  
   9 - filter(MAX("LOGON_DATE")<=SYSDATE@!)  
     
  
  
explain plan for SELECT  
   *  
FROM  
   user_np usn  
WHERE  
   EXISTS (  
            SELECT  
               user_id,  
               MAX(logon_date)  
            FROM  
               user_logons_np uln  
            WHERE uln.user_id = usn.user_id  
            GROUP BY user_id  
            HAVING  
               MAX(logon_date) <= SYSDATE  
            UNION ALL  
            SELECT  
               user_id,  
               MAX(logon_date)  
            FROM  
               user_logon_hist_np uhp  
            WHERE uhp.user_id = usn.user_id  
            GROUP BY user_id  
            HAVING  
               MAX(logon_date) <= SYSDATE  
   );  
  
  
Plan hash value: 1665332333  
  
------------------------------------------------------------------------------------  
| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |  
------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT        |          |     1 |    21 |    86   (0)| 00:00:01 |  
|*  1 |  FILTER                 |          |       |       |            |          |  
|   2 |   TABLE ACCESS FULL     | USER_NP  |    32 |   672 |     6   (0)| 00:00:01 |  
|   3 |   UNION-ALL             |          |       |       |            |          |  
|*  4 |    FILTER               |          |       |       |            |          |  
|   5 |     SORT GROUP BY NOSORT|          |     1 |    12 |     2   (0)| 00:00:01 |  
|*  6 |      INDEX RANGE SCAN   | ULN_IDX1 |    50 |   600 |     2   (0)| 00:00:01 |  
|*  7 |    FILTER               |          |       |       |            |          |  
|   8 |     SORT GROUP BY NOSORT|          |     1 |    12 |     3   (0)| 00:00:01 |  
|*  9 |      INDEX RANGE SCAN   | UHN_IDX1 |   500 |  6000 |     3   (0)| 00:00:01 |  
------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
  
   1 - filter( EXISTS ( (SELECT "USER_ID",MAX("LOGON_DATE") FROM   
              "USER_LOGONS_NP" "ULN" WHERE "ULN"."USER_ID"=:B1 GROUP BY "USER_ID" HAVING   
              MAX("LOGON_DATE")<=SYSDATE@!) UNION ALL  (SELECT   
              "USER_ID",MAX("LOGON_DATE") FROM "USER_LOGON_HIST_NP" "UHP" WHERE   
              "UHP"."USER_ID"=:B2 GROUP BY "USER_ID" HAVING   
              MAX("LOGON_DATE")<=SYSDATE@!)))  
   4 - filter(MAX("LOGON_DATE")<=SYSDATE@!)  
   6 - access("ULN"."USER_ID"=:B1)  
   7 - filter(MAX("LOGON_DATE")<=SYSDATE@!)  
   9 - access("UHP"."USER_ID"=:B1)  
     
explain plan for SELECT  
   *  
FROM  
   user_np usn  
WHERE  
 CASE WHEN EXISTS (  
            SELECT  
               null  
            FROM  
               user_logons_np uln  
            WHERE uln.user_id = usn.user_id  
            HAVING  
               MAX(logon_date) <= SYSDATE ) THEN 1  
      ELSE  
        CASE WHEN EXISTS (  
            SELECT  
               null  
            FROM  
               user_logon_hist_np uhp  
            WHERE uhp.user_id = usn.user_id  
            HAVING  
               MAX(logon_date) <= SYSDATE ) THEN 1  
        ELSE 0 END  
      END = 1 ;  
  
  
Plan hash value: 339077023  
  
--------------------------------------------------------------------------------  
| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |  
--------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT    |          |    32 |   672 |    38   (0)| 00:00:01 |  
|*  1 |  FILTER             |          |       |       |            |          |  
|   2 |   TABLE ACCESS FULL | USER_NP  |    32 |   672 |     6   (0)| 00:00:01 |  
|*  3 |   FILTER            |          |       |       |            |          |  
|   4 |    SORT AGGREGATE   |          |     1 |    12 |            |          |  
|*  5 |     INDEX RANGE SCAN| ULN_IDX1 |    50 |   600 |     2   (0)| 00:00:01 |  
|*  6 |   FILTER            |          |       |       |            |          |  
|   7 |    SORT AGGREGATE   |          |     1 |    12 |            |          |  
|*  8 |     INDEX RANGE SCAN| UHN_IDX1 |   500 |  6000 |     3   (0)| 00:00:01 |  
--------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
  
   1 - filter(CASE  WHEN  EXISTS (SELECT MAX("LOGON_DATE") FROM   
              "USER_LOGONS_NP" "ULN" WHERE "ULN"."USER_ID"=:B1 HAVING   
              MAX("LOGON_DATE")<=SYSDATE@!) THEN 1 ELSE CASE  WHEN  EXISTS (SELECT   
              MAX("LOGON_DATE") FROM "USER_LOGON_HIST_NP" "UHP" WHERE   
              "UHP"."USER_ID"=:B2 HAVING MAX("LOGON_DATE")<=SYSDATE@!) THEN 1 ELSE 0   
              END  END =1)  
   3 - filter(MAX("LOGON_DATE")<=SYSDATE@!)  
   5 - access("ULN"."USER_ID"=:B1)  
   6 - filter(MAX("LOGON_DATE")<=SYSDATE@!)  
   8 - access("UHP"."USER_ID"=:B1)  
复制

专家解答

以上都不是;)

有了有条款,你就

1分组
2过滤

真的你想在where子句中首先过滤。然后分组。

或者在这种情况下,根本不是团体!

对于您的测试数据,以下最适合我:

SELECT *  
FROM   user_np usn  
WHERE  EXISTS (  
  SELECT null
  FROM   user_logons_np uln  
  WHERE uln.user_id = usn.user_id  
  AND   logon_date <= SYSDATE  
  UNION ALL  
  SELECT null
  FROM  user_logon_hist_np uhp  
  WHERE uhp.user_id = usn.user_id  
  AND   logon_date <= SYSDATE  
);  
复制


为了衡量这一点,你应该比较执行计划,而不是解释计划。这包括关键度量,例如每一步处理的行数、缓冲区 (I/O) 完成、花费的时间等。

通过运行:
set serveroutput off
alter session set statistics_level = all;

select 

select * 
from   table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
复制


这样做,我发现上面最好的替代方案 (通过执行最少数量的缓冲区的操作来衡量) 是最后一个。有这个计划:

------------------------------------------------------------------------------------------    
| Id  | Operation           | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
------------------------------------------------------------------------------------------    
|   0 | SELECT STATEMENT    |          |      1 |        |     46 |00:00:00.01 |      97 |    
|*  1 |  FILTER             |          |      1 |        |     46 |00:00:00.01 |      97 |    
|   2 |   TABLE ACCESS FULL | USER_NP  |      1 |     46 |     46 |00:00:00.01 |       3 |    
|*  3 |   FILTER            |          |     46 |        |     46 |00:00:00.01 |      94 |    
|   4 |    SORT AGGREGATE   |          |     46 |      1 |     46 |00:00:00.01 |      94 |    
|*  5 |     INDEX RANGE SCAN| ULN_IDX1 |     46 |     50 |   2300 |00:00:00.01 |      94 |    
|*  6 |   FILTER            |          |      0 |        |      0 |00:00:00.01 |       0 |    
|   7 |    SORT AGGREGATE   |          |      0 |      1 |      0 |00:00:00.01 |       0 |    
|*  8 |     INDEX RANGE SCAN| UHN_IDX1 |      0 |    500 |      0 |00:00:00.01 |       0 |    
------------------------------------------------------------------------------------------ 
复制


注意97个缓冲区 (I/O)

而where子句中的过滤给出:

----------------------------------------------------------------------------------------------    
| Id  | Operation               | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
----------------------------------------------------------------------------------------------    
|   0 | SELECT STATEMENT        |          |      1 |        |     46 |00:00:00.02 |      15 |    
|*  1 |  HASH JOIN SEMI         |          |      1 |     46 |     46 |00:00:00.02 |      15 |    
|   2 |   TABLE ACCESS FULL     | USER_NP  |      1 |     46 |     46 |00:00:00.01 |       2 |    
|   3 |   VIEW                  | VW_SQ_1  |      1 |  23042 |   2251 |00:00:00.01 |      13 |    
|   4 |    UNION-ALL            |          |      1 |        |   2251 |00:00:00.01 |      13 |    
|*  5 |     INDEX FAST FULL SCAN| ULN_IDX1 |      1 |   2300 |   2251 |00:00:00.01 |      13 |    
|*  6 |     INDEX FAST FULL SCAN| UHN_IDX1 |      0 |  20742 |      0 |00:00:00.01 |       0 |    
---------------------------------------------------------------------------------------------- 
复制


将近四分之一的工作!只有15个缓冲区,而不是 ~ 100。

也...

因为我希望每个用户都有很多行,所以我还会压缩这些索引的前导列:

alter index uhn_idx1 
  rebuild compress 1;
alter index uln_idx1 
  rebuild compress 1;
复制


这应该会让它们更小 => 阅读效率更高。
文章转载自ASKTOM,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论