集群及DG环境:
主库:
host | ip | vip | scan-ip |
---|---|---|---|
prirac1 | 192.168.11.156 | 192.168.11.158 | 192.168.11.160 |
prirac2 | 192.168.11.157 | 192.168.11.159 |
备库:
host | ip | vip | scan-ip |
---|---|---|---|
stdrac1 | 192.168.11.166 | 192.168.11.170 | 192.168.11.172 |
stdrac2 | 192.168.11.169 | 192.168.11.171 |
一、RAC安装及DG搭建
略
二、DG BROKER配置:
1,创建DG BROKER CONFIG FILE存储目录
RAC环境需要将目录建立在共享存储里面,比如ASM。
主库:
[grid@prirac1 ~]$ asmcmd ASMCMD> cd data/ora12c ASMCMD> pwd +data/ora12c ASMCMD> mkdir DATAGUARDCONFIG
复制
备库:
[grid@prirac1 ~]$ asmcmd ASMCMD> cd data/standby ASMCMD> pwd +data/standby ASMCMD> mkdir DATAGUARDCONFIG
复制
2,取消备库应用
备库:
SQL> alter database recover managed standby database cancel;
复制
3,配置主备库DG_BROKER_CONFIG_FILEn及dg_broker_start参数
注:12c需要先清除LOG_ARCHIVE_DEST_2参数,否则报错
主库:
SQL> alter system set LOG_ARCHIVE_DEST_2='' scope=both; SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE1='+DATA/ora12c/DATAGUARDCONFIG/dg_conf1.ora' SCOPE=both sid='*'; SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE2='+DATA/ora12c/DATAGUARDCONFIG/dg_conf2.ora' SCOPE=both sid='*'; SQL> alter system set dg_broker_start=true scope=both sid='*';
复制
备库:
SQL> alter system set LOG_ARCHIVE_DEST_2='' scope=both; SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE1='+DATA/standby/DATAGUARDCONFIG/dg_conf1.ora' SCOPE=both sid='*'; SQL> ALTER SYSTEM SET DG_BROKER_CONFIG_FILE2='+DATA/standby/DATAGUARDCONFIG/dg_conf2.ora' SCOPE=both sid='*'; SQL> alter system set dg_broker_start=true scope=both sid='*';
复制
4,TNSNAMES.ORA配置
tnsnames内容如下,主备各节点一样即可。
primary = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.160)(PORT = 1521))) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = ora12c))) primary_all = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.158)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.159)(PORT = 1521))) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = ora12c))) standby = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.172)(PORT = 1521))) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = standby))) standby_all = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.170)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.171)(PORT = 1521))) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = standby)))
复制
5,DGMGRL配置
主库:
[oracle@prirac1 ~]$ dgmgrl DGMGRL> connect sys Password: Connected to "ora12c" Connected as SYSDG. DGMGRL> CREATE CONFIGURATION dg AS PRIMARY DATABASE IS ora12c CONNECT IDENTIFIER IS primary; DGMGRL> ADD DATABASE standby AS CONNECT IDENTIFIER IS standby; DGMGRL> enable configuration; DGMGRL> show configuration: Configuration - dg Protection Mode: MaxPerformance Members: ora12c - Primary database standby - (*) Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS (status updated 4 seconds ago)
复制
6,主备切换测试:
DGMGRL> switchover to standby;
复制
二、fast-start failover配置:
1,主备启用flashback database
主库:
SQL> select flashback_on,force_logging from v$database; FLASHBACK_ON FORCE_LOGGING ------------------ --------------------------------------- NO YES SQL> ALTER SYSTEM SET db_recovery_file_dest_size=10g SCOPE=BOTH; System altered. SQL> alter system set db_recovery_file_dest='+data'; System altered. SQL> alter database flashback on; Database altered. SQL> select flashback_on,force_logging from v$database; FLASHBACK_ON FORCE_LOGGING ------------------ --------------------------------------- YES YES SQL> show parameter log_archive_dest_2 NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_dest_2 string service="standby", ASYNC NOAFF IRM delay=0 optional compressi on=disable max_failure=0 max_c onnections=1 reopen=300 db_uni que_name="standby" net_timeout =30, valid_for=(online_logfile ,all_roles)
复制
备库:
SQL> ALTER SYSTEM SET db_recovery_file_dest_size=10g SCOPE=BOTH; System altered. SQL> alter system set db_recovery_file_dest='+data'; System altered. SQL> alter database flashback on; alter database flashback on * ERROR at line 1: ORA-01153: an incompatible media recovery is active SQL> alter database recover managed standby database cancel ; Database altered. SQL> alter database flashback on; Database altered. SQL> select flashback_on,force_logging from v$database; FLASHBACK_ON FORCE_LOGGING ------------------ --------------------------------------- YES YES
复制
2,启用fast-start failover
确保broker已经配置,同时运行模式为最大可用模式或者最大性能模式,如果数据库运行模式为最大可用模式,确保参数LogXptMode 配置为SYNC,如果是最大性能模式,则参数LogXptMode 应该为ASYNC
DGMGRL> show configuration; Configuration - dg Protection Mode: MaxPerformance Members: ora12c - Primary database standby - (*) Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS (status updated 9 seconds ago) DGMGRL> show resource verbose ora12c logxptmode on site ora12c LogXptMode = 'ASYNC' DGMGRL> show resource verbose standby logxptmode on site standby LogXptMode = 'ASYNC'
复制
附:
1,如果想改为最大可用模式: 2,如果有多个物理备库,需要指定自动failover目标(本次测试只有一主一备,不需要修改) |
启动fsfo:
DGMGRL> enable fast_start failover; DGMGRL> show configuration verbose Configuration - dg Protection Mode: MaxPerformance Members: ora12c - Primary database standby - (*) Physical standby database (*) Fast-Start Failover target Properties: FastStartFailoverThreshold = '30' OperationTimeout = '30' TraceLevel = 'USER' FastStartFailoverLagLimit = '30' CommunicationTimeout = '180' ObserverReconnect = '0' FastStartFailoverAutoReinstate = 'TRUE' FastStartFailoverPmyShutdown = 'TRUE' BystandersFollowRoleChange = 'ALL' ObserverOverride = 'FALSE' ExternalDestination1 = '' ExternalDestination2 = '' PrimaryLostWriteAction = 'CONTINUE' ConfigurationWideServiceName = 'ora12c_CFG' Fast-Start Failover: ENABLED Threshold: 30 seconds Target: standby Observer: prirac1 Lag Limit: 30 seconds Shutdown Primary: TRUE Auto-reinstate: TRUE Observer Reconnect: (none) Observer Override: FALSE Configuration Status: SUCCESS
复制
2,启动observer
因为observer窗口会一直挂起,可用如下方式后台运行
nohup dgmgrl sys/oracle@db2 "start observer">>observer.log 2>&1 &
复制
本次测试直接新开一个窗口启动observer:
DGMGRL> start observer [W000 10/13 22:20:26.86] FSFO target standby is standby [W000 10/13 22:20:30.57] Observer trace level is set to USER [W000 10/13 22:20:30.57] Try to connect to the primary. [W000 10/13 22:20:30.57] Try to connect to the primary primary. [W000 10/13 22:20:30.59] The standby standby is ready to be a FSFO target [W000 10/13 22:20:32.59] Connection to the primary restored! [W000 10/13 22:20:34.60] Disconnecting from database primary. DGMGRL> show configuration; Configuration - dg Protection Mode: MaxPerformance Members: ora12c - Primary database standby - (*) Physical standby database Fast-Start Failover: ENABLED Configuration Status: SUCCESS (status updated 4 seconds ago)
复制
至此fast-start failover配置完成。
3,FSFO测试
abort关闭主库:
[oracle@prirac1 ~]$ srvctl stop database -d ora12c -stopoption abort
复制
OBSERVER 日志:
[W000 10/13 22:39:24.66] Primary database cannot be reached. [W000 10/13 22:39:24.66] Fast-Start Failover threshold has not exceeded. Retry for the next 30 seconds [W000 10/13 22:39:25.66] Try to connect to the primary. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:39:25.68] Primary database cannot be reached. [W000 10/13 22:39:26.68] Try to connect to the primary. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:39:26.69] Primary database cannot be reached. [W000 10/13 22:39:27.69] Try to connect to the primary. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:39:27.69] Primary database cannot be reached. [W000 10/13 22:39:28.70] Try to connect to the primary. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:39:51.79] Primary database cannot be reached. [W000 10/13 22:39:51.79] Fast-Start Failover threshold has not exceeded. Retry for the next 3 seconds [W000 10/13 22:39:52.79] Try to connect to the primary. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:39:52.80] Primary database cannot be reached. [W000 10/13 22:39:52.80] Fast-Start Failover threshold has not exceeded. Retry for the next 2 seconds [W000 10/13 22:39:53.80] Try to connect to the primary. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:39:53.80] Primary database cannot be reached. [W000 10/13 22:39:53.80] Fast-Start Failover threshold has not exceeded. Retry for the next 1 second [W000 10/13 22:39:54.81] Try to connect to the primary. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:39:54.81] Primary database cannot be reached. [W000 10/13 22:39:54.81] Fast-Start Failover threshold has expired. [W000 10/13 22:39:54.81] Try to connect to the standby. [W000 10/13 22:39:54.81] Making a last connection attempt to primary database before proceeding with Fast-Start Failover. [W000 10/13 22:39:54.81] Check if the standby is ready for failover. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [S002 10/13 22:39:54.83] Fast-Start Failover started... 22:39:54.84 Tuesday, October 13, 2020 Initiating Fast-Start Failover to database "standby"... [S002 10/13 22:39:54.84] Initiating Fast-start Failover. Performing failover NOW, please wait... Failover succeeded, new primary is "standby" 22:40:09.79 Tuesday, October 13, 2020 [S002 10/13 22:40:09.79] Fast-Start Failover finished... [W000 10/13 22:40:09.79] Failover succeeded. Restart pinging. [W000 10/13 22:40:10.15] Primary database has changed to standby. [W000 10/13 22:40:10.19] Try to connect to the primary. [W000 10/13 22:40:10.19] Try to connect to the primary standby. [W000 10/13 22:40:12.57] The standby ora12c needs to be reinstated [W000 10/13 22:40:12.57] Try to connect to the new standby ora12c. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:40:13.57] Connection to the primary restored! [W000 10/13 22:40:15.58] Disconnecting from database standby. [W000 10/13 22:40:42.66] Try to connect to the new standby ora12c. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:41:12.73] Try to connect to the new standby ora12c. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:41:42.79] Try to connect to the new standby ora12c. Unable to connect to database using primary ORA-12514: TNS:listener does not currently know of service requested in connect descriptor [W000 10/13 22:42:12.86] Try to connect to the new standby ora12c. Unable to connect to database using primary ORA-12528: TNS:listener: all appropriate instances are blocking new connections
复制
主备库验证:
[oracle@stdrac1 ~]$ sqlplus / as sysdba SQL*Plus: Release 12.2.0.1.0 Production on Tue Oct 13 22:54:23 2020 Copyright (c) 1982, 2016, Oracle. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production SQL> select open_mode,database_role from v$database; OPEN_MODE DATABASE_ROLE -------------------- ---------------- READ WRITE PRIMARY
复制
原备库已经切换成新的主库。
启动原主库并验证:
[oracle@prirac1 ~]$ srvctl start database -d ora12c [oracle@prirac1 ~]$ sqlplus / as sysdba SQL*Plus: Release 12.2.0.1.0 Production on Tue Oct 13 22:43:21 2020 Copyright (c) 1982, 2016, Oracle. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production SQL> select open_mode,database_role from v$database; OPEN_MODE DATABASE_ROLE -------------------- ---------------- READ ONLY WITH APPLY PHYSICAL STANDBY
复制
可以看到原来的主库已经变成备库。
observer日志:
[W000 10/13 22:42:42.94] Try to connect to the new standby ora12c. [W000 10/13 22:42:45.94] Connection to the new standby restored! [W000 10/13 22:42:47.95] Try to connect to the primary standby. [W000 10/13 22:42:49.96] Connection to the primary restored! [W000 10/13 22:42:50.96] Wait for new primary to be ready to reinstate. [W000 10/13 22:42:51.96] New primary is now ready to reinstate. [W000 10/13 22:42:51.96] Issuing REINSTATE command. 22:42:51.96 Tuesday, October 13, 2020 Initiating reinstatement for database "ora12c"... Reinstating database "ora12c", please wait... [W000 10/13 22:43:18.06] The standby ora12c is ready to be a FSFO target Reinstatement of database "ora12c" succeeded 22:43:26.40 Tuesday, October 13, 2020 [W000 10/13 22:43:27.08] Successfully reinstated database ora12c. [W000 10/13 22:43:27.09] The reinstatement of standby ora12c was just done
复制
切换回原来的主备角色:
[oracle@stdrac1 admin]$ srvctl stop database -d ora12c -stopoption abort [oracle@stdrac1 admin]$ srvctl start database -d ora12c
复制
三、配置客户端TAF:
1,主库配置TAF service:
SQL> begin 2 DBMS_SERVICE.CREATE_SERVICE(service_name => 'dgtaf.com', 3 network_name => 'dgtaf.com', 4 aq_ha_notifications => TRUE, 5 failover_method => 'BASIC', 6 failover_type => 'SELECT', 7 failover_retries => 30, 8 failover_delay => 5); 9 end; 10 / PL/SQL procedure successfully completed.
复制
2,创建存储过程来调用taf服务,并确保只会在主库调用
SQL> create or replace procedure proc_dgtaf is 2 v_role VARCHAR(30); 3 begin 4 select DATABASE_ROLE into v_role from V$DATABASE; 5 if v_role = 'PRIMARY' then 6 DBMS_SERVICE.START_SERVICE('dgtaf.com'); 7 else 8 DBMS_SERVICE.STOP_SERVICE('dgtaf.com'); 9 end if; 10 end; 11 / Procedure created.
复制
3,创建触发器,用于在数据库启动或者角色切换的时候,调用上面的存储过程:
SQL> create or replace TRIGGER trg_dgtaf 2 after startup or db_role_change on database 3 begin 4 proc_dgtaf; 5 end; 6 / Trigger created.
复制
4,在主库手动执行存储过程,启动taf服务(也可以通过重启数据库,通过触发器来执行存储过程,这里手动执行一下):
SQL> exec proc_dgtaf; PL/SQL procedure successfully completed.
复制
5,备库确认刚创建的对象已经同步:
SQL> select trigger_name from dba_triggers where trigger_name = 'TRG_DGTAF'; TRIGGER_NAME ------------------------------------ TRG_DGTAF SQL> select owner,object_name from dba_objects where object_name = 'PROC_DGTAF'; OWNER OBJECT_NAME ----- ------------------------------ SYS PROC_DGTAF
复制
6,配置客户端TNSNAMES:
dgtaf = (DESCRIPTION = (ADDRESS = (PROTOCOL = tcp)(HOST = 192.168.11.160)(PORT = 1521)) (ADDRESS = (PROTOCOL = tcp)(HOST = 192.168.11.172)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = dgtaf.com) (FAILOVER_MODE = (TYPE = session) (METHOD = basic) (RETRIES = 180) (DELAY = 5) ) ) )
复制
7,查询测试(本例用python简单测试一下)
import cx_Oracle import time db = cx_Oracle.connect('devin','devin','dgtaf') cursor = db.cursor() for i in range(10000000): cursor.execute('''select sysdate from dual''') result=cursor.fetchall() print(result) time.sleep(1)
复制
运行过程中abort关闭主库,运行结果如下图,程序会夯住40多秒,之后继续正常运行,确认主备切换正常,服务正常:
注:故障转移只适用于空闲session和select语句,DML,DDL等会报错:ORA-25408
如图(insert测试):
相关解释:
ORA-25408 One may receive ORA-25408: can not safely replay call when using failover. This will depend on the failover type and on what you are executing. AF only works for idle sessions and SELECT statements. The following operations will give an error (user program must restart the operation after fail-over): PL/SQL program units - stored procedures, functions, packages DML - INSERT, UPDATE, DELETE, SELECT … FOR UPDATE DDL - CREATE, ALTER, DROP, TRUNCATE, GRANT, REVOKE, etc. For example if you are doing an insert and the database goes down and failover occurs, “ORA-25408 can not safely replay call” is expected and the application should handle this exception and re-execute the insert. If you are in a transaction and failover occurs you will also receive an error message and you have to handle this and issue a rollback. In case you have configured failover to be of type SESSION, you will not able able to recover your selects either and you will need to replay them.
复制
评论
