暂无图片
暂无图片
暂无图片
暂无图片
2
暂无图片

12c rac-rac dg broker和fast-failover及客户端TAF配置详细步骤

原创 肖杰 2020-10-14
2721

集群及DG环境:

主库:

host ip vip scan-ip
prirac1 192.168.11.156 192.168.11.158 192.168.11.160
prirac2 192.168.11.157 192.168.11.159

备库:

host ip vip scan-ip
stdrac1 192.168.11.166 192.168.11.170 192.168.11.172
stdrac2 192.168.11.169 192.168.11.171

一、RAC安装及DG搭建

二、DG BROKER配置:

1,创建DG BROKER CONFIG FILE存储目录

RAC环境需要将目录建立在共享存储里面,比如ASM。
主库:

[grid@prirac1 ~]$ asmcmd
ASMCMD> cd data/ora12c
ASMCMD> pwd
+data/ora12c
ASMCMD> mkdir DATAGUARDCONFIG

复制

备库:

[grid@prirac1 ~]$ asmcmd
ASMCMD> cd data/standby
ASMCMD> pwd
+data/standby
ASMCMD> mkdir DATAGUARDCONFIG
复制
2,取消备库应用

备库:

SQL> alter database recover managed standby database cancel;
复制
3,配置主备库DG_BROKER_CONFIG_FILEn及dg_broker_start参数

注:12c需要先清除LOG_ARCHIVE_DEST_2参数,否则报错
主库:

SQL> alter system set LOG_ARCHIVE_DEST_2='' scope=both;
SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE1='+DATA/ora12c/DATAGUARDCONFIG/dg_conf1.ora' SCOPE=both sid='*';
SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE2='+DATA/ora12c/DATAGUARDCONFIG/dg_conf2.ora' SCOPE=both sid='*';
SQL> alter system set dg_broker_start=true scope=both sid='*';
复制

备库:

SQL> alter system set LOG_ARCHIVE_DEST_2='' scope=both;
SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE1='+DATA/standby/DATAGUARDCONFIG/dg_conf1.ora' SCOPE=both sid='*';
SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE2='+DATA/standby/DATAGUARDCONFIG/dg_conf2.ora' SCOPE=both sid='*';
SQL> alter system set dg_broker_start=true scope=both sid='*';
复制
4,TNSNAMES.ORA配置

tnsnames内容如下,主备各节点一样即可。

primary =
   (DESCRIPTION =
     (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.160)(PORT = 1521)))
     (CONNECT_DATA =
        (SERVER = DEDICATED)
        (SERVICE_NAME = ora12c)))

primary_all =
   (DESCRIPTION =
     (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.158)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.159)(PORT = 1521)))
     (CONNECT_DATA =
        (SERVER = DEDICATED)
        (SERVICE_NAME = ora12c)))

standby =
   (DESCRIPTION =
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.172)(PORT = 1521)))
     (CONNECT_DATA =
        (SERVER = DEDICATED)
        (SERVICE_NAME = standby)))

standby_all = 
  (DESCRIPTION =
     (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.170)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.171)(PORT = 1521)))
     (CONNECT_DATA =
        (SERVER = DEDICATED)
        (SERVICE_NAME = standby)))
复制
5,DGMGRL配置

主库:

[oracle@prirac1 ~]$ dgmgrl
DGMGRL> connect sys
Password:
Connected to "ora12c"
Connected as SYSDG.

DGMGRL> CREATE CONFIGURATION dg AS PRIMARY DATABASE IS ora12c CONNECT IDENTIFIER IS primary;
DGMGRL> ADD DATABASE standby AS CONNECT IDENTIFIER IS standby;
DGMGRL> enable configuration;
DGMGRL> show configuration:
Configuration - dg

  Protection Mode: MaxPerformance
  Members:
  ora12c  - Primary database
    standby - (*) Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 4 seconds ago)
复制
6,主备切换测试:
DGMGRL> switchover to standby;
复制

二、fast-start failover配置:

1,主备启用flashback database

主库:

SQL> select flashback_on,force_logging from v$database; 
FLASHBACK_ON       FORCE_LOGGING
------------------ ---------------------------------------
NO                YES

SQL> ALTER SYSTEM SET db_recovery_file_dest_size=10g SCOPE=BOTH; 
System altered.

SQL> alter system set  db_recovery_file_dest='+data'; 
System altered.

SQL>  alter database flashback on; 
Database altered.

SQL> select flashback_on,force_logging from v$database; 
FLASHBACK_ON       FORCE_LOGGING
------------------ ---------------------------------------
YES                YES

SQL> show parameter log_archive_dest_2
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2                   string      service="standby", ASYNC NOAFF
                                                 IRM delay=0 optional compressi
                                                 on=disable max_failure=0 max_c
                                                 onnections=1 reopen=300 db_uni
                                                 que_name="standby" net_timeout
                                                 =30, valid_for=(online_logfile
                                                 ,all_roles)
复制

备库:

SQL> ALTER SYSTEM SET db_recovery_file_dest_size=10g SCOPE=BOTH; 
System altered.

SQL> alter system set  db_recovery_file_dest='+data'; 
System altered.

SQL>  alter database flashback on; 
 alter database flashback on
*
ERROR at line 1:
ORA-01153: an incompatible media recovery is active


SQL> alter database recover managed standby database cancel ;
Database altered.

SQL>  alter database flashback on; 
Database altered.

SQL> select flashback_on,force_logging from v$database; 
FLASHBACK_ON       FORCE_LOGGING
------------------ ---------------------------------------
YES                YES
复制
2,启用fast-start failover

确保broker已经配置,同时运行模式为最大可用模式或者最大性能模式,如果数据库运行模式为最大可用模式,确保参数LogXptMode 配置为SYNC,如果是最大性能模式,则参数LogXptMode 应该为ASYNC

DGMGRL> show configuration;
Configuration - dg

  Protection Mode: MaxPerformance
  Members:
  ora12c  - Primary database
    standby - (*) Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 9 seconds ago)

DGMGRL> show resource verbose ora12c logxptmode on site ora12c
  LogXptMode = 'ASYNC'
DGMGRL> show resource verbose standby logxptmode on site standby
  LogXptMode = 'ASYNC'
复制
附:

1,如果想改为最大可用模式:
DGMGRL> alter resource ora12c set property logxptmode=‘ASYNC’;
Property “logxptmode” updated
DGMGRL> alter resource standby set property logxptmode=‘ASYNC’;
Property “logxptmode” updated
DGMGRL> edit configuration set protection mode as maxavailability;

2,如果有多个物理备库,需要指定自动failover目标(本次测试只有一主一备,不需要修改)
DGMGRL> edit database ‘ora12c’ set property ‘FastStartFailoverTarget’=‘standby’;
DGMGRL> edit database ‘standby’ set property ‘FastStartFailoverTarget’=‘ora12c’;

启动fsfo:

DGMGRL> enable fast_start failover; 
DGMGRL> show configuration verbose

Configuration - dg

  Protection Mode: MaxPerformance
  Members:
  ora12c  - Primary database
    standby - (*) Physical standby database 

  (*) Fast-Start Failover target

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'USER'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'
    ConfigurationWideServiceName    = 'ora12c_CFG'

Fast-Start Failover: ENABLED

  Threshold:          30 seconds
  Target:             standby
  Observer:           prirac1
  Lag Limit:          30 seconds
  Shutdown Primary:   TRUE
  Auto-reinstate:     TRUE
  Observer Reconnect: (none)
  Observer Override:  FALSE

Configuration Status:
SUCCESS
复制
2,启动observer

因为observer窗口会一直挂起,可用如下方式后台运行

nohup dgmgrl  sys/oracle@db2 "start observer">>observer.log 2>&1 & 
复制

本次测试直接新开一个窗口启动observer:

DGMGRL> start observer
[W000 10/13 22:20:26.86] FSFO target standby is standby
[W000 10/13 22:20:30.57] Observer trace level is set to USER
[W000 10/13 22:20:30.57] Try to connect to the primary.
[W000 10/13 22:20:30.57] Try to connect to the primary primary.
[W000 10/13 22:20:30.59] The standby standby is ready to be a FSFO target
[W000 10/13 22:20:32.59] Connection to the primary restored!
[W000 10/13 22:20:34.60] Disconnecting from database primary.
DGMGRL> show configuration;

Configuration - dg

  Protection Mode: MaxPerformance
  Members:
  ora12c  - Primary database
    standby - (*) Physical standby database 

Fast-Start Failover: ENABLED

Configuration Status:
SUCCESS   (status updated 4 seconds ago)
复制

至此fast-start failover配置完成。

3,FSFO测试

abort关闭主库:

[oracle@prirac1 ~]$ srvctl stop  database -d ora12c -stopoption abort
复制

OBSERVER 日志:

[W000 10/13 22:39:24.66] Primary database cannot be reached.
[W000 10/13 22:39:24.66] Fast-Start Failover threshold has not exceeded. Retry for the next 30 seconds
[W000 10/13 22:39:25.66] Try to connect to the primary.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:39:25.68] Primary database cannot be reached.
[W000 10/13 22:39:26.68] Try to connect to the primary.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:39:26.69] Primary database cannot be reached.
[W000 10/13 22:39:27.69] Try to connect to the primary.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:39:27.69] Primary database cannot be reached.
[W000 10/13 22:39:28.70] Try to connect to the primary.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:39:51.79] Primary database cannot be reached.
[W000 10/13 22:39:51.79] Fast-Start Failover threshold has not exceeded. Retry for the next 3 seconds
[W000 10/13 22:39:52.79] Try to connect to the primary.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:39:52.80] Primary database cannot be reached.
[W000 10/13 22:39:52.80] Fast-Start Failover threshold has not exceeded. Retry for the next 2 seconds
[W000 10/13 22:39:53.80] Try to connect to the primary.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:39:53.80] Primary database cannot be reached.
[W000 10/13 22:39:53.80] Fast-Start Failover threshold has not exceeded. Retry for the next 1 second
[W000 10/13 22:39:54.81] Try to connect to the primary.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:39:54.81] Primary database cannot be reached.
[W000 10/13 22:39:54.81] Fast-Start Failover threshold has expired.
[W000 10/13 22:39:54.81] Try to connect to the standby.
[W000 10/13 22:39:54.81] Making a last connection attempt to primary database before proceeding with Fast-Start Failover.
[W000 10/13 22:39:54.81] Check if the standby is ready for failover.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[S002 10/13 22:39:54.83] Fast-Start Failover started...

22:39:54.84  Tuesday, October 13, 2020
Initiating Fast-Start Failover to database "standby"...
[S002 10/13 22:39:54.84] Initiating Fast-start Failover.
Performing failover NOW, please wait...
Failover succeeded, new primary is "standby"
22:40:09.79  Tuesday, October 13, 2020
[S002 10/13 22:40:09.79] Fast-Start Failover finished...
[W000 10/13 22:40:09.79] Failover succeeded. Restart pinging.
[W000 10/13 22:40:10.15] Primary database has changed to standby.
[W000 10/13 22:40:10.19] Try to connect to the primary.
[W000 10/13 22:40:10.19] Try to connect to the primary standby.
[W000 10/13 22:40:12.57] The standby ora12c needs to be reinstated
[W000 10/13 22:40:12.57] Try to connect to the new standby ora12c.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:40:13.57] Connection to the primary restored!
[W000 10/13 22:40:15.58] Disconnecting from database standby.
[W000 10/13 22:40:42.66] Try to connect to the new standby ora12c.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:41:12.73] Try to connect to the new standby ora12c.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:41:42.79] Try to connect to the new standby ora12c.
Unable to connect to database using primary
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

[W000 10/13 22:42:12.86] Try to connect to the new standby ora12c.
Unable to connect to database using primary
ORA-12528: TNS:listener: all appropriate instances are blocking new connections
复制

主备库验证:

[oracle@stdrac1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.2.0.1.0 Production on Tue Oct 13 22:54:23 2020
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select open_mode,database_role from v$database;
OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
READ WRITE           PRIMARY
复制

原备库已经切换成新的主库。

启动原主库并验证:

[oracle@prirac1 ~]$ srvctl start database -d ora12c

[oracle@prirac1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.2.0.1.0 Production on Tue Oct 13 22:43:21 2020
Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> select open_mode,database_role from v$database;

OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
READ ONLY WITH APPLY PHYSICAL STANDBY
复制

可以看到原来的主库已经变成备库。

observer日志:

[W000 10/13 22:42:42.94] Try to connect to the new standby ora12c.
[W000 10/13 22:42:45.94] Connection to the new standby restored!
[W000 10/13 22:42:47.95] Try to connect to the primary standby.
[W000 10/13 22:42:49.96] Connection to the primary restored!
[W000 10/13 22:42:50.96] Wait for new primary to be ready to reinstate.
[W000 10/13 22:42:51.96] New primary is now ready to reinstate.
[W000 10/13 22:42:51.96] Issuing REINSTATE command.

22:42:51.96  Tuesday, October 13, 2020
Initiating reinstatement for database "ora12c"...
Reinstating database "ora12c", please wait...
[W000 10/13 22:43:18.06] The standby ora12c is ready to be a FSFO target
Reinstatement of database "ora12c" succeeded
22:43:26.40  Tuesday, October 13, 2020
[W000 10/13 22:43:27.08] Successfully reinstated database ora12c.
[W000 10/13 22:43:27.09] The reinstatement of standby ora12c was just done
复制

切换回原来的主备角色:

[oracle@stdrac1 admin]$ srvctl stop database -d ora12c -stopoption abort
[oracle@stdrac1 admin]$ srvctl start database -d ora12c
复制

三、配置客户端TAF:

1,主库配置TAF service:

SQL> begin
  2  DBMS_SERVICE.CREATE_SERVICE(service_name => 'dgtaf.com',
  3  network_name        => 'dgtaf.com',
  4  aq_ha_notifications => TRUE,
  5  failover_method     => 'BASIC',
  6  failover_type       => 'SELECT',
  7  failover_retries    => 30,
  8  failover_delay      => 5);
  9  end;
 10  /
PL/SQL procedure successfully completed.
复制

2,创建存储过程来调用taf服务,并确保只会在主库调用

SQL> create or replace procedure proc_dgtaf is
  2  v_role VARCHAR(30);
  3  begin
  4     select DATABASE_ROLE into v_role from V$DATABASE;
  5     if v_role = 'PRIMARY' then
  6             DBMS_SERVICE.START_SERVICE('dgtaf.com');
  7     else
  8             DBMS_SERVICE.STOP_SERVICE('dgtaf.com');
  9     end if;
 10  end;
 11  /
Procedure created.
复制

3,创建触发器,用于在数据库启动或者角色切换的时候,调用上面的存储过程:

SQL> create or replace TRIGGER trg_dgtaf
  2  after startup or db_role_change on database
  3  begin
  4    proc_dgtaf;
  5  end;
  6  /
Trigger created.
复制

4,在主库手动执行存储过程,启动taf服务(也可以通过重启数据库,通过触发器来执行存储过程,这里手动执行一下):

SQL> exec proc_dgtaf;
PL/SQL procedure successfully completed.
复制

5,备库确认刚创建的对象已经同步:

SQL> select trigger_name from dba_triggers where trigger_name = 'TRG_DGTAF';
TRIGGER_NAME
------------------------------------
TRG_DGTAF

SQL> select owner,object_name from dba_objects where object_name = 'PROC_DGTAF';
OWNER OBJECT_NAME
----- ------------------------------
SYS   PROC_DGTAF
复制

6,配置客户端TNSNAMES:

dgtaf =
    (DESCRIPTION =
        (ADDRESS = (PROTOCOL = tcp)(HOST = 192.168.11.160)(PORT = 1521))
        (ADDRESS = (PROTOCOL = tcp)(HOST = 192.168.11.172)(PORT = 1521))
            (LOAD_BALANCE = yes)
                (CONNECT_DATA =
                    (SERVER = DEDICATED)
                    (SERVICE_NAME = dgtaf.com)
                (FAILOVER_MODE =
                    (TYPE = session)
                    (METHOD = basic)
                    (RETRIES = 180)
                    (DELAY = 5)
               )
        )
)
复制

7,查询测试(本例用python简单测试一下)

import cx_Oracle
import time
db = cx_Oracle.connect('devin','devin','dgtaf')
cursor = db.cursor()

for i in range(10000000):
    cursor.execute('''select sysdate from dual''')
    result=cursor.fetchall()
    print(result)
    time.sleep(1)
复制

运行过程中abort关闭主库,运行结果如下图,程序会夯住40多秒,之后继续正常运行,确认主备切换正常,服务正常:
1.png

注:故障转移只适用于空闲session和select语句,DML,DDL等会报错:ORA-25408
如图(insert测试):
2.png
相关解释:

ORA-25408
One may receive ORA-25408: can not safely replay call when using failover. This will depend on the failover type and on what you are executing.

AF only works for idle sessions and SELECT statements. The following operations will give an error (user program must restart the operation after fail-over):

    PL/SQL program units - stored procedures, functions, packages
    DML - INSERT, UPDATE, DELETE, SELECT … FOR UPDATE
    DDL - CREATE, ALTER, DROP, TRUNCATE, GRANT, REVOKE, etc.

For example if you are doing an insert and the database goes down and failover occurs, “ORA-25408 can not safely replay call” is expected and the application should handle this exception and re-execute the insert.
If you are in a transaction and failover occurs you will also receive an error message and you have to handle this and issue a rollback.
In case you have configured failover to be of type SESSION, you will not able able to recover your selects either and you will need to replay them.
复制
最后修改时间:2020-10-14 15:37:20
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
1人已赞赏
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

章芋文
暂无图片
4年前
评论
暂无图片 0
😆,第6步有个小错误,LOAD_BALANCE = no
4年前
暂无图片 点赞
1
肖杰
暂无图片
4年前
回复
暂无图片 0
膜拜一下大佬~~~~~
4年前
暂无图片 点赞
回复
目录
  • 集群及DG环境:
    • 主库:
    • 备库:
  • 一、RAC安装及DG搭建
  • 二、DG BROKER配置:
    • 1,创建DG BROKER CONFIG FILE存储目录
    • 2,取消备库应用
    • 3,配置主备库DG_BROKER_CONFIG_FILEn及dg_broker_start参数
    • 4,TNSNAMES.ORA配置
    • 5,DGMGRL配置
    • 6,主备切换测试:
  • 二、fast-start failover配置:
    • 1,主备启用flashback database
    • 2,启用fast-start failover
    • 2,启动observer
    • 3,FSFO测试
  • 三、配置客户端TAF: