MogDB 离线数据抽取（基于mog_filedump）

原创范计杰 2022-09-05

950

[[toc]]

适用范围

当数据无法打开数据库读取时，可通过mog_filedump离线抽取数据文件中的数据。

步骤

1、抽取数据字典
2、使用mog_filedump抽取数据
本次示例：
抽取OMM库的tab表

1、抽取数据字典

通过pg_filenode.map 找核心数据字典 pg_database，pg_class，pg_attribute

1、抽取pg_database,找到database oid

==============pg_filenode.map 未损坏时=============

----global级别
select relname,oid,relfilenode from pg_class where relfilenode=0 and relkind='r' and pg_relation_filepath(oid) not like 'base_%' order by oid;

           relname            | oid  | relfilenode
------------------------------+------+-------------
 pg_pltemplate                | 1136 |           0
 pg_tablespace                | 1213 |           0
 pg_shdepend                  | 1214 |           0
 pg_authid                    | 1260 |           0
 pg_auth_members              | 1261 |           0
 pg_database                  | 1262 |           0 <<<<数据字典OID固定
 pg_shdescription             | 2396 |           0
 pg_db_role_setting           | 2964 |           0
 pg_resource_pool             | 3450 |           0
 pg_workload_group            | 3451 |           0
 pg_auth_history              | 3457 |           0
 pg_user_status               | 3460 |           0
 pg_app_workloadgroup_mapping | 3464 |           0
 pg_shseclabel                | 3592 |           0
 pg_extension_data_source     | 4211 |           0
 gs_obsscaninfo               | 5680 |           0
 pgxc_group                   | 9014 |           0
 pgxc_node                    | 9015 |           0
 pg_job                       | 9022 |           0
 pg_job_proc                  | 9023 |           0
 gs_global_config             | 9080 |           0
(21 rows)

./dump -decoder pg_filenode_map -bs 512 -file /opt/ogdata/pdata/global/pg_filenode.map
======read and dump block 0=======
@0    RELMAPPER_FILEMAGIC            size:4    hex:0x592717                       val:5842711
@4    num_mappings                   size:4    hex:0x3e                           val:62
@8    map1                           size:8    hex:[0x4ee                          0x3b14                        ] val:[1262 15124]
....省略

找到pg_database filenode=15124


mog_filedump -o -i -D name,oid,~ \
/opt/ogdata/pdata/global/15124 

----找到omm库的oid为16385
 Item   2 -- Length:  328  Offset: 6784 (0x1a80)  Flags: NORMAL
  XMIN: 9959  XMAX: 0  CID|XVAC: 0  OID: 16385 <<<<
  Block Id: 0  linp Index: 2   Attributes: 14   Size: 32
  infomask: 0x0909 (HASNULL|HASOID|XMIN_COMMITTED|XMAX_INVALID)
  t_bits: [0]: 0xff [1]: 0x2f

COPY: omm       10

2、抽取pg_class，pg_attribute找到表oid，及表列定义


omm=# select relname,oid,relfilenode from pg_class where relfilenode=0 and relkind='r' and pg_relation_filepath(oid) like 'base_%' order by oid;
   relname    | oid  | relfilenode
--------------+------+-------------
 pg_type      | 1247 |           0
 pg_attribute | 1249 |           0 <<<数据字典OID固定
 pg_proc      | 1255 |           0
 pg_class     | 1259 |           0 <<<
(4 rows)

./dump -decoder pg_filenode_map -bs 512 -file /opt/ogdata/pdata/base/16385/pg_filenode.map
@0    RELMAPPER_FILEMAGIC            size:4    hex:0x592717                       val:5842711
@4    num_mappings                   size:4    hex:0xf                            val:15
@8    map1                           size:8    hex:[0x4eb                          0x3988                        ] val:[1259 14728] <<<<pg_class filenode 14728
@16   map2                           size:8    hex:[0x4e1                          0x397d                        ] val:[1249 14717] <<<<pg_attribute  filenode 14717
@24   map3                           size:8    hex:[0x4e7                          0x3981                        ] val:[1255 14721]
@32   map4                           size:8    hex:[0x4df                          0x3932                        ] val:[1247 14642]
@40   map5                           size:8    hex:[0xb14                          0x3983                        ] val:[2836 14723]
@48   map6                           size:8    hex:[0xb15                          0x3985                        ] val:[2837 14725]
@56   map7                           size:8    hex:[0xa62                          0x397f                        ] val:[2658 14719]
@64   map8                           size:8    hex:[0xa63                          0x3980                        ] val:[2659 14720]
@72   map9                           size:8    hex:[0xa66                          0x398a                        ] val:[2662 14730]
@80   map10                          size:8    hex:[0xa67                          0x398b                        ] val:[2663 14731]
@88   map11                          size:8    hex:[0x26fd                         0x398c                        ] val:[9981 14732]
@96   map12                          size:8    hex:[0xa82                          0x3986                        ] val:[2690 14726]
@104  map13                          size:8    hex:[0xa83                          0x3987                        ] val:[2691 14727]
@112  map14                          size:8    hex:[0xa8f                          0x3934                        ] val:[2703 14644]
@120  map15                          size:8    hex:[0xa90                          0x3935                        ] val:[2704 14645]

-----抽取pg_calss
$ mog_filedump -o -D name,oid,oid,oid,oid,oid,oid,~ \
/opt/ogdata/pdata/base/16385/14728 | \
 grep COPY |grep tab

第7列为filenode 
 COPY: tab 2200    16388   0       10      0       16386

TODO: mog_filedump不支持抽取tuple的oid,也就读取不到pg_calss.oid,只能祈祷表没做过导致filenode改变的操作!!!

 -----抽取pg_attribute，找到表的列在定义，列数量
 要找到表的列定义，就需要找到表的oid,(在pg_class.oid)，但mog_filedump不支持抽取tuple的oid,也就读取不到pg_calss.oid,只能祈祷表没做过导致filenode改变的操作
$ mog_filedump -o -D oid,name,oid,int,smallint,smallint,~ \
/opt/ogdata/pdata/base/16385/14717 | \
 grep COPY |grep 16386

16386为tab的oid

COPY: 16386     id      20      -1      8       1
COPY: 16386     c       1043    -1      -1      2
后面的列为内部隐藏列,不需要关注
COPY: 16386     ctid    27      0       6       -1
COPY: 16386     xmin    28      0       8       -3
COPY: 16386     cmin    29      0       4       -4
COPY: 16386     xmax    28      0       8       -5
COPY: 16386     cmax    29      0       4       -6
COPY: 16386     tableoid        26      0       4       -7
COPY: 16386     xc_node_id      23      0       4       -8

第三列是字段的类型oid,可以通过pg_type的到列的具体类型。

omm=# select distinct oid,typname from pg_type where oid in (20,1043);
 oid  | typname
------+---------
   20 | int8
 1043 | varchar
(2 rows)

omm=# select distinct oid,typname from pg_type order by oid;
  oid  |                typname
-------+----------------------------------------
    16 | bool
    17 | bytea
    18 | char
    19 | name
    20 | int8
    21 | int2
    22 | int2vector
    23 | int4
    24 | regproc
    25 | text
    26 | oid
    27 | tid
    28 | xid
    29 | cid
    30 | oidvector
    31 | xid32
    32 | oidvector_extend
    71 | pg_type
    75 | pg_attribute
    81 | pg_proc
    83 | pg_class
    86 | raw
    87 | _raw
    88 | blob
    90 | clob
 ....

2、使用mog_filedump抽取数据

$ mog_filedump -o -D int,varchar /opt/ogdata/pdata/base/16385/16386 |  grep COPY|perl -lne 's/^COPY: //g; print;'|tee -a tab.dat
4       test4
5       test5
6       test6
7       test7
8       test8
9       test9
10      test10
11      test11
12      test12
13      test13
14      test14
15      test15
16      test16
17      test17
18      test18
....

注意我们增加了一个-o选项来剔除死元组记录
create table tab (id int,c varchar(100));
\copy tab from '/home/omm/tab.dat'
select * from tab ;

参考文档

https://www.modb.pro/db/365981

mogdb

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

文章被以下合辑收录

MogDB学习（共11篇）

mogdb学习