感兴趣的还是记得去看原帖子,笔记有删减,原作者微信公众号:
这是Oracle异常恢复课程第4讲;前面已经大致讲了控制文件、SCN等基本知识,今天我们开始解析Oracle 实例的启动过程。
我们知道对于Oracle而言,数据库启动主要分为3个环节,分别是nomount,mount和open阶段。
首先我们将实例启动到nomount状态,来观察一下相关进程的启动情况:
SQL> startup nomount
ORACLE instance started.
Total System Global Area 2505338880 bytes
Fixed Size 2255832 bytes
Variable Size 620758056 bytes
Database Buffers 1862270976 bytes
Redo Buffers 20054016 bytes
我们来看下此时的数据库后台进程有哪些?
[oracle@test:/home/oracle]$ ps -ef|grep -v grep|grep ora_
oracle 12333 1 0 09:06 ? 00:00:00 ora_pmon_orcl
oracle 12335 1 0 09:06 ? 00:00:00 ora_psp0_orcl
oracle 12341 1 0 09:06 ? 00:00:00 ora_vktm_orcl
oracle 12346 1 0 09:06 ? 00:00:00 ora_gen0_orcl
oracle 12348 1 0 09:06 ? 00:00:00 ora_diag_orcl
oracle 12350 1 0 09:06 ? 00:00:00 ora_dbrm_orcl
oracle 12352 1 0 09:06 ? 00:00:00 ora_dia0_orcl
oracle 12354 1 2 09:06 ? 00:00:01 ora_mman_orcl
oracle 12356 1 0 09:06 ? 00:00:00 ora_dbw0_orcl
oracle 12358 1 0 09:06 ? 00:00:00 ora_lgwr_orcl
oracle 12360 1 0 09:06 ? 00:00:00 ora_ckpt_orcl
oracle 12362 1 0 09:06 ? 00:00:00 ora_smon_orcl
oracle 12364 1 0 09:06 ? 00:00:00 ora_reco_orcl
oracle 12366 1 1 09:06 ? 00:00:00 ora_mmon_orcl
oracle 12368 1 0 09:06 ? 00:00:00 ora_mmnl_orcl
oracle 12370 1 0 09:06 ? 00:00:00 ora_d000_orcl
oracle 12372 1 0 09:06 ? 00:00:00 ora_s000_orcl
oracle 12384 1 0 09:06 ? 00:00:00 ora_arc0_orcl
oracle 12396 1 0 09:06 ? 00:00:00 ora_arc1_orcl
oracle 12398 1 0 09:06 ? 00:00:00 ora_arc2_orcl
oracle 12400 1 0 09:06 ? 00:00:00 ora_arc3_orcl
oracle 12402 1 0 09:06 ? 00:00:00 ora_qmnc_orcl
oracle 12416 1 0 09:06 ? 00:00:00 ora_cjq0_orcl
oracle 12441 1 0 09:06 ? 00:00:00 ora_q000_orcl
oracle 12443 1 0 09:06 ? 00:00:00 ora_q001_orcl
我们如何去判断一个数据库实例的哪些进程是核心进程?所谓的核心进程就是不能kill的,否则你一旦不小心kill,数据库实例就会crash崩溃。可以通过如下脚本来进行查询:
col KSUPRPNM for a20
set linesize 500
select indx,ksuprpnm,to_char(ksuprflg,'xxxxxxx') ksuprflg,KSUPRPID
from x$ksupr
where bitand(ksuprflg,4)=4
order by indx;
INDX KSUPRPNM KSUPRFLG KSUPRPID
---------- -------------------- ---------------- ------------------------------------------------
2 oracle@test (PMON) e 14303
3 oracle@test (PSP0) 6 14305
4 oracle@test (VKTM) 6 14307
5 oracle@test (GEN0) 6 14312
7 oracle@test (DBRM) 6 14316
9 oracle@test (MMAN) 6 14320
10 oracle@test (DBW0) 6 14322
11 oracle@test (LGWR) 6 14324
12 oracle@test (CKPT) 6 14326
13 oracle@test (SMON) 16 14328
10 rows selected.
这里我们要同时准备一些strace 命令,单独开个窗口,准备跟踪一下相关进程的执行情况,或许能发现一些有用的信息:
strace -T -tt -fr -o /tmp/14303.log -p 14303 --pmon
......
strace -T -tt -fr -o /tmp/14322.log -p 14322 --dbw0
接下来我们就准备开始启动数据库实例到mount状态了:
SQL> startup nomount
ORACLE instance started.
Total System Global Area 2505338880 bytes
Fixed Size 2255832 bytes
Variable Size 620758056 bytes
Database Buffers 1862270976 bytes
Redo Buffers 20054016 bytes
SQL> oradebug setmypid
Statement processed.
SQL> oradebug event 10046 trace name context forever,level 12;
Statement processed.
SQL> alter database mount;
Database altered.
SQL> oradebug event 10046 trace name context off
Statement processed.
SQL> oradebug tracefile_name
/home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_17403.trc
SQL>
此时我们首先来看下dbw进程的strace log情况:
strace -T -tt -fr -o /tmp/17696.log -p 17696
17696 0.000098 statfs("/oradata/orcl/control02.ctl", {f_type=0x58465342, f_bsize=4096, f_blocks=29353473, f_bfree=7073452, f_bavail=7073452, f_files=57103968, f_ffree=56592187, f_fsid={64768, 0}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0 <0.000042>
17696 0.000091 open("/oradata/orcl/control02.ctl", O_RDWR|O_DSYNC) = 11 <0.000041>
17696 0.000091 getrlimit(RLIMIT_NOFILE, {rlim_cur=65535, rlim_max=65535}) = 0 <0.000038>
17696 0.000087 fcntl(11, F_DUPFD, 256) = 257 <0.000039>
17696 0.000082 close(11) = 0 <0.000038>
17696 0.000085 fcntl(257, F_SETFD, FD_CLOEXEC) = 0 <0.000058>
17696 0.000106 fstatfs(257, {f_type=0x58465342, f_bsize=4096, f_blocks=29353473, f_bfree=7073452, f_bavail=7073452, f_files=57103968, f_ffree=56592187, f_fsid={64768, 0}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0 <0.000040>
17696 0.000111 pread64(256, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\0046N\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384 <0.000043>
17696 0.000107 pread64(257, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\0046N\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384 <0.000063>
17696 0.000124 pread64(256, "\25\302\0\0\20\0\0\0\201\5\0\0\377\377\1\4z\355\0\0\200B\0\0\0\0\0\0\0\0\0\301"..., 16384, 262144) = 16384 <0.000042>
17696 0.000092 pread64(256, "\25\302\0\0\22\0\0\0\201\5\0\0\377\377\1\4\345\227\0\0\0\0\0\0\0\0\0\0\300\10;F"..., 16384, 294912) = 16384 <0.000040>
17696 0.000093 times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 429732262 <0.000038>
17696 0.000091 pread64(256, "\25\302\0\0\27\0\0\0t\5\0\0\377\377\1\4\236]\0\0\0\0\0\0\0 \0\0\16\0\0\0"..., 16384, 376832) = 16384 <0.000039>
17696 0.000125 pread64(256, "\25\302\0\0\265\0\0\0z\2\0\0\377\377\1\4}\227\0\0\0\n\0\0\0 \0\0\36\0\10\0"..., 16384, 2965504) = 16384 <0.000043>
17696 0.000102 times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 429732262 <0.000037>
17696 0.000100 semctl(24, 23, SETVAL, 0x1) = 0 <0.000049>
17696 0.000119 times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 429732262 <0.000053>
17696 0.000106 semtimedop(24, [{14, -1, 0}], 1, {1, 690000000}) = -1 EAGAIN (Resource temporarily unavailable) <1.689407>
17696 1.689520 getrusage(RUSAGE_SELF, {ru_utime={0, 9393}, ru_stime={0, 9393}, ...}) = 0 <0.000042>
17696 0.000096 getrusage(RUSAGE_SELF, {ru_utime={0, 9422}, ru_stime={0, 9422}, ...}) = 0 <0.000040>
17696 0.000095 times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 429732431 <0.000041>
17696 0.000100 times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 429732431 <0.000041>
17696 0.000098 times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 429732431 <0.000041>
根据上面的strace 内容,我们可以进一步看下相关的文件操作符分别是对应什么文件:
[oracle@test:/home/oracle]$ cd /proc/17696/fd
[oracle@test:/proc/17696/fd]$
[oracle@test:/proc/17696/fd]$ ls -l
total 0
lr-x------ 1 oracle oinstall 64 Sep 5 09:50 0 -> /dev/null
l-wx------ 1 oracle oinstall 64 Sep 5 09:50 1 -> /dev/null
lrwx------ 1 oracle oinstall 64 Sep 5 09:50 10 -> /home/oracle/app/oracle/product/11.2.0/db/dbs/lkORCL
l-wx------ 1 oracle oinstall 64 Sep 5 09:50 2 -> /dev/null
lrwx------ 1 oracle oinstall 64 Sep 5 09:50 256 -> /oradata/orcl/control01.ctl
lrwx------ 1 oracle oinstall 64 Sep 5 09:50 257 -> /oradata/orcl/control02.ctl
lr-x------ 1 oracle oinstall 64 Sep 5 09:50 3 -> /dev/null
lr-x------ 1 oracle oinstall 64 Sep 5 09:50 4 -> /dev/null
lr-x------ 1 oracle oinstall 64 Sep 5 09:50 5 -> /dev/null
lr-x------ 1 oracle oinstall 64 Sep 5 09:50 6 -> /home/oracle/app/oracle/product/11.2.0/db/rdbms/mesg/oraus.msb
lr-x------ 1 oracle oinstall 64 Sep 5 09:50 7 -> /proc/17696/fd
lr-x------ 1 oracle oinstall 64 Sep 5 09:50 8 -> /dev/zero
lrwx------ 1 oracle oinstall 64 Sep 5 09:50 9 -> /home/oracle/app/oracle/product/11.2.0/db/dbs/hc_orcl.dat
[oracle@test:/proc/17696/fd]$
从上面内容看,fd 256/257正是我们的controlfile. 我们将相关的offset转换一下:
17696 0.000111 pread64(256, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\0046N\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384 <0.000043>
17696 0.000107 pread64(257, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\0046N\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384 <0.000063>
17696 0.000124 pread64(256, "\25\302\0\0\20\0\0\0\201\5\0\0\377\377\1\4z\355\0\0\200B\0\0\0\0\0\0\0\0\0\301"..., 16384, 262144) = 16384 <0.000042>
17696 0.000092 pread64(256, "\25\302\0\0\22\0\0\0\201\5\0\0\377\377\1\4\345\227\0\0\0\0\0\0\0\0\0\0\300\10;F"..., 16384, 294912) = 16384 <0.000040>
17696 0.000091 pread64(256, "\25\302\0\0\27\0\0\0t\5\0\0\377\377\1\4\236]\0\0\0\0\0\0\0 \0\0\16\0\0\0"..., 16384, 376832) = 16384 <0.000039>
17696 0.000125 pread64(256, "\25\302\0\0\265\0\0\0z\2\0\0\377\377\1\4}\227\0\0\0\n\0\0\0 \0\0\36\0\10\0"..., 16384, 2965504) = 16384 <0.000043>
从上面来看我们可以得到如下顺序:
1) dbw0进程首先分别读取2个controlfile,顺序分别是control01、control02.读取的是控制文件的头部块,也就是第1个block(block 0是os block,存的操作系统,没oracle数据).
pread64(256, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\0046N\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384
-- 第一个16384表示读取的字节数,第二个16384表示offset
2) 根据后面几次的pread64读取的offset,然后计算下,分别得到如下的block:
select (262144/16384) from dual; --16
select (294912/16384 ) from dual; --18
select (376832/16384) from dual; --23
select (2965504/16384) from dual; --181
首先分别去读control01和control02的第2个block,然后依次读取control01的第16、18、23、181个block
接下来我们看另外一个关键进程lgwr的strace log情况:
[oracle@test:/proc/17696/fd]$ ps -ef | grep lgwr
oracle 20060 1 0 10:07 ? 00:00:00 ora_lgwr_orcl
strace -T -tt -fr -o /tmp/20060.log -p 20060
20060 0.000101 pread64(256, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\346\234\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384 <0.000045>
20060 0.000110 pread64(257, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\346\234\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384 <0.000045>
20060 0.000095 pread64(256, "\25\302\0\0\17\0\0\0\210\5\0\0\377\377\1\4\345\255\0\0\0B\0\0\0\0\0\0\0\0\0\301"..., 16384, 245760) = 16384 <0.000043>
20060 0.000099 pread64(256, "\25\302\0\0\21\0\0\0\210\5\0\0\377\377\1\4\346\227\0\0\0\0\0\0\0\0\0\0\300\10;F"..., 16384, 278528) = 16384 <0.000043>
20060 0.000258 times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 429854892 <0.000223>
20060 0.000370 pread64(256, "\25\302\0\0?\1\0\0\202\5\0\0\377\377\1\4\250\34\0\0orcl\0\0\0\0\3\0\0\0"..., 16384, 5226496) = 16384 <0.000104>
20060 0.000175 pread64(256, "\25\302\0\0\23\0\0\0t\5\0\0\377\377\1\4\23{\0\0\16\0\0\0\30\335\22\0\0\0\0\0"..., 16384, 311296) = 16384 <0.000072>
20060 0.000144 pread64(256, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4V\321\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000046>
20060 0.000112 times({tms_utime=0, tms_stime=1, tms_cutime=0, tms_cstime=0}) = 429854892 <0.000049>
20060 0.000117 pwrite64(256, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\323\213\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000192>
20060 0.000250 pwrite64(257, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\323\213\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000132>
20060 0.000197 pread64(256, "\25\302\0\0A\1\0\0q\5\0\0\377\377\1\4u\246\0\0\3\0\0\0\4\0\0\0M\365z\0"..., 16384, 5259264) = 16384 <0.000043>
20060 0.000099 pread64(256, "\25\302\0\0?\1\0\0\202\5\0\0\377\377\1\4\250\34\0\0orcl\0\0\0\0\3\0\0\0"..., 16384, 5226496) = 16384 <0.000040>
20060 0.000101 pwrite64(256, "\25\302\0\0@\1\0\0\211\5\0\0\377\377\1\4\343!\0\0orcl\0\0\0\0\3\0\0\0"..., 16384, 5242880) = 16384 <0.000133>
20060 0.000188 pwrite64(257, "\25\302\0\0@\1\0\0\211\5\0\0\377\377\1\4\343!\0\0orcl\0\0\0\0\3\0\0\0"..., 16384, 5242880) = 16384 <0.000122>
20060 0.000173 pwrite64(256, "\25\302\0\0\22\0\0\0\211\5\0\0\377\377\1\4\345\227\0\0\0\0\0\0\0\0\0\0\300\10;F"..., 16384, 294912) = 16384 <0.000114>
20060 0.000163 pwrite64(257, "\25\302\0\0\22\0\0\0\211\5\0\0\377\377\1\4\345\227\0\0\0\0\0\0\0\0\0\0\300\10;F"..., 16384, 294912) = 16384 <0.000113>
20060 0.000167 pwrite64(256, "\25\302\0\0\20\0\0\0\211\5\0\0\377\377\1\4z\355\0\0\200B\0\0\0\0\0\0\0\0\0\301"..., 16384, 262144) = 16384 <0.000113>
20060 0.000163 pwrite64(257, "\25\302\0\0\20\0\0\0\211\5\0\0\377\377\1\4z\355\0\0\200B\0\0\0\0\0\0\0\0\0\301"..., 16384, 262144) = 16384 <0.000112>
20060 0.000165 pwrite64(256, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\344\234\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384 <0.000111>
20060 0.000163 pwrite64(257, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\344\234\0\0\0\0\0\0\0\4 \v@T\260e"..., 16384, 16384) = 16384 <0.000154>
这里我们也该进程下fd的信息列出来:
[oracle@test:/proc/17696/fd]$ cd /proc/20060/fd
[oracle@test:/proc/20060/fd]$ ls -ltr
total 0
lrwx------ 1 oracle oinstall 64 Sep 5 10:09 9 -> /home/oracle/app/oracle/product/11.2.0/db/dbs/hc_orcl.dat
lr-x------ 1 oracle oinstall 64 Sep 5 10:09 8 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Sep 5 10:09 7 -> /proc/20060/fd
lr-x------ 1 oracle oinstall 64 Sep 5 10:09 6 -> /home/oracle/app/oracle/product/11.2.0/db/rdbms/mesg/oraus.msb
lr-x------ 1 oracle oinstall 64 Sep 5 10:09 5 -> /dev/null
lr-x------ 1 oracle oinstall 64 Sep 5 10:09 4 -> /dev/null
lr-x------ 1 oracle oinstall 64 Sep 5 10:09 3 -> /dev/null
lrwx------ 1 oracle oinstall 64 Sep 5 10:09 257 -> /oradata/orcl/control02.ctl
lrwx------ 1 oracle oinstall 64 Sep 5 10:09 256 -> /oradata/orcl/control01.ctl
l-wx------ 1 oracle oinstall 64 Sep 5 10:09 2 -> /dev/null
lrwx------ 1 oracle oinstall 64 Sep 5 10:09 10 -> /home/oracle/app/oracle/product/11.2.0/db/dbs/lkORCL
l-wx------ 1 oracle oinstall 64 Sep 5 10:09 1 -> /dev/null
lr-x------ 1 oracle oinstall 6
从上面信息来看,我们可以得到如下2点:
1) lgwr进程首先读取的是control01、control02. 读取的也是第3个block,首先就读取了一个block.
2) 我们根据上面的几次pread64/pwrite64函数操作的offset进行换算:
pwrite64(256, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\323\213\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000192>
pwrite64(257, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\323\213\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000132>
我们可以看到在mount的过程中,lgwr是需要写controlfile的第3个block,那么这个block中的内容是什么呢?
lgwr也是检查点相关的一个关键进程。实际上我想很多人都不知道,数据库在mount的过程中,居然lgwr会去写controlfile吧。
最好我们再来看直接相关的一个进程ckpt的情况:
[oracle@test:/home/oracle]$ ps -ef | grep ckpt
oracle 27516 1 0 09:04 ? 00:00:00 ora_ckpt_orcl
strace -T -tt -fr -o /tmp/27516.log -p 27516
从上面ckpt进程的strace log,我们可以得到如下内容:
1) ckpt进程会读取control01、02的前2个block
cat /tmp/27516.log | grep read
22645 0.000142 pread64(256, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\6>\0\0\0\0\0\0\0\4 \v@T\260e"..., 32768, 16384) = 32768 <0.000078>
22645 0.000133 pread64(257, "\25\302\0\0\1\0\0\0\0\0\0\0\0\0\1\4\6>\0\0\0\0\0\0\0\4 \v@T\260e"..., 32768, 16384) = 32768 <0.000099>
2) ckpt进程会写control01、02的第3个block.
[oracle@test:/proc/27516/fd]$ cat /tmp/27516.log | grep write
27516 0.000316 pwrite64(256, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\342\16\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000483>
27516 0.000645 pwrite64(257, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\342\16\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000264>
27516 0.000136 pwrite64(256, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\343\16\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000281>
27516 0.000384 pwrite64(257, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\343\16\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 49152) = 16384 <0.000194>
大家可以strace 跟踪一下相关进程去观察一下。最后面我们来看下10046 trace的情况:
cat /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_17403.trc
alter database mount
END OF STMT
PARSE #46912550761720:c=381,e=381,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=1725500746148891
WAIT #46912550761720: nam='rdbms ipc reply' ela= 124 from_process=10 timeout=60 p3=0 obj#=-1 tim=1725500746149211
WAIT #46912550761720: nam='reliable message' ela= 146 channel context=4079187360 channel handle=4079081040 broadcast message=4080200824 obj#=-1 tim=1725500746149669
WAIT #46912550761720: nam='rdbms ipc reply' ela= 252 from_process=12 timeout=900 p3=0 obj#=-1 tim=1725500746150010
WAIT #46912550761720: nam='Disk file operations I/O' ela= 18 FileOperation=2 fileno=0 filetype=1 obj#=-1 tim=1725500746150101
WAIT #46912550761720: nam='Disk file operations I/O' ela= 5 FileOperation=2 fileno=1 filetype=1 obj#=-1 tim=1725500746150114
WAIT #46912550761720: nam='control file sequential read' ela= 8 file#=0 block#=1 blocks=1 obj#=-1 tim=1725500746150129
WAIT #46912550761720: nam='control file sequential read' ela= 3 file#=1 block#=1 blocks=1 obj#=-1 tim=1725500746150151
WAIT #46912550761720: nam='control file sequential read' ela= 46 file#=0 block#=3 blocks=8 obj#=-1 tim=1725500746150212
WAIT #46912550761720: nam='control file sequential read' ela= 40 file#=1 block#=3 blocks=8 obj#=-1 tim=1725500746150261
*** 2024-09-05 09:45:50.150
WAIT #46912550761720: nam='control file heartbeat' ela= 4000308 p1=0 p2=0 p3=0 obj#=-1 tim=1725500750150582
WAIT #46912550761720: nam='control file sequential read' ela= 23 file#=0 block#=3 blocks=8 obj#=-1 tim=1725500750150680
WAIT #46912550761720: nam='control file sequential read' ela= 16 file#=1 block#=3 blocks=8 obj#=-1 tim=1725500750150709
WAIT #46912550761720: nam='control file sequential read' ela= 3 file#=0 block#=1 blocks=1 obj#=-1 tim=1725500750150720
WAIT #46912550761720: nam='control file parallel write' ela= 217 files=1 block#=1 requests=1 obj#=-1 tim=1725500750150947
WAIT #46912550761720: nam='control file sequential read' ela= 4 file#=1 block#=1 blocks=1 obj#=-1 tim=1725500750150968
WAIT #46912550761720: nam='control file parallel write' ela= 150 files=1 block#=1 requests=1 obj#=-1 tim=1725500750151125
WAIT #46912550761720: nam='control file sequential read' ela= 33 file#=0 block#=15 blocks=1 obj#=-1 tim=1725500750151172
WAIT #46912550761720: nam='control file sequential read' ela= 3 file#=0 block#=17 blocks=1 obj#=-1 tim=1725500750151186
WAIT #46912550761720: nam='control file parallel write' ela= 292 files=2 block#=18 requests=2 obj#=-1 tim=1725500750151501
WAIT #46912550761720: nam='control file parallel write' ela= 258 files=2 block#=16 requests=2 obj#=-1 tim=1725500750151779
WAIT #46912550761720: nam='control file parallel write' ela= 278 files=2 block#=1 requests=2 obj#=-1 tim=1725500750152071
WAIT #46912550761720: nam='control file sequential read' ela= 17 file#=0 block#=3 blocks=8 obj#=-1 tim=1725500750152104
WAIT #46912550761720: nam='control file parallel write' ela= 142 files=1 block#=3 requests=1 obj#=-1 tim=1725500750152261
WAIT #46912550761720: nam='control file parallel write' ela= 146 files=1 block#=4 requests=1 obj#=-1 tim=1725500750152419
WAIT #46912550761720: nam='control file parallel write' ela= 154 files=1 block#=5 requests=1 obj#=-1 tim=1725500750152584
WAIT #46912550761720: nam='control file parallel write' ela= 141 files=1 block#=6 requests=1 obj#=-1 tim=1725500750152736
WAIT #46912550761720: nam='control file parallel write' ela= 142 files=1 block#=7 requests=1 obj#=-1 tim=1725500750152891
WAIT #46912550761720: nam='control file parallel write' ela= 140 files=1 block#=8 requests=1 obj#=-1 tim=1725500750153042
WAIT #46912550761720: nam='control file parallel write' ela= 143 files=1 block#=9 requests=1 obj#=-1 tim=1725500750153196
WAIT #46912550761720: nam='control file parallel write' ela= 144 files=1 block#=10 requests=1 obj#=-1 tim=1725500750153351
WAIT #46912550761720: nam='control file parallel write' ela= 149 files=1 block#=11 requests=1 obj#=-1 tim=1725500750153511
WAIT #46912550761720: nam='control file parallel write' ela= 147 files=1 block#=12 requests=1 obj#=-1 tim=1725500750153669
WAIT #46912550761720: nam='control file parallel write' ela= 129 files=1 block#=13 requests=1 obj#=-1 tim=1725500750153811
WAIT #46912550761720: nam='control file sequential read' ela= 6 file#=0 block#=282 blocks=1 obj#=-1 tim=1725500750153828
WAIT #46912550761720: nam='control file parallel write' ela= 268 files=2 block#=281 requests=2 obj#=-1 tim=1725500750154103
对上面的10046 trace,做如下解释:
file# 0 ---表示control01
file# 1 ---表示control02
block# ---表示操作的block号
blocks ---表示操作的block个数
files ---表示操作的文件个数
从上面10046 的trace结果,我们可以得到如下内容:
1. 在数据库mount的时候,后台进程首先读取controlfile的第1个block,随后读取了controlfile的第3~13 个block.
2. 控制文件写入时,都是同时操作的.最后我们根据前面的strace和10046 跟踪来做一个简单地总结:
1) 在数据库mount的时刻,数据库会通过读取controlfile来确认datafile以及redo等信息。 这个读取操作是后台进程dbw0和lgwr来完成的.
2) 当数据库mount以后,ckpt进程会每3s触发去更新controlfile了.
3) 需要注意的是,相关后台进程在读取controlfile时,得知了datafile和redo等信息后,并不会去验证相关文件是否存在,在这个时候说明数据库已经准备open了,不过在open的时刻,才会去验证相关文件是否存在。
--oracle中第1个block是os block,存放操作系统信息,上面没业务数据,以上内容中说的block已跳过 os block:
--参考Doc ID 360032.1
1>The OS Block Header is in the first datafile block. It is used by Oracle to store Operating System information.
2>It is the Block Zero. It is not the datafile header which is in Oracle Block 1.
3>Corruption in Block Zero will not cause damage to the data and in versions lower than 11g it is not detected by dbverify/rman. In 11g Dbverify has been enhanced to detect it.
4>dbfsize can be used to check the consistency of Block 0.
5>When this block is corrupted the database may sometimes be opened with no errors as Block 0 is only checked by some specific database operations like 'CREATE CONTROLFILE' or in 11g during Database OPEN
最后修改时间:2024-10-10 12:31:01
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




