暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 物化视图(Oracle同步到PG,PG同步到PG) - by pgsnapshot (plperlu trigger) (支持类似Oracle的mvlog fast complete force刷新)

digoal 2018-12-17
965

作者

digoal

日期

2018-12-17

标签

PostgreSQL , 物化视图 , 增量刷新 , mvlog , Oracle 同步到 PG , PG 同步到 PG


背景

PostgreSQL自身的物化视图没有MVLOG,也就是说,刷新的时候是VIEW定义产生的记录与MV已刷新的记录进行比对,进行增量更新的过程。

PG的物化视图增量刷新,在MV记录数较少,并且更新频繁时,刷新效率更高。

而Oracle的mv,创建后,支持增量刷新是通过上游的变更量(mvlog),进行回放,所以在MV记录数非常多,并且更新(写入)频率较低时,效率更高。

PG通过pgsnapshot插件,在发布端对变更表创建plperlu的触发器,记录变更量到MVLOG中,可以实现与ORACLE类似的增量刷新机制。

实际上pg 的 londiste3插件也是类似的功效(基于触发器的mq增量)

pgsnapshot的使用

1、下载

https://www.postgresql.org/ftp/projects/pgFoundry/snapshot/snapshot/

```
wget https://ftp.postgresql.org/pub/projects/pgFoundry/snapshot/snapshot/Pgsnapshot-0.4.x/pgsnapshot-0.4.0.tgz

tar -zxvf pgsnapshot-0.4.0.tgz
```

2、依赖什么

1、PostgreSQL 8.0以上版本,

2、plperlu 数据库函数语言,

3、Perl 5.8.5或更高版本,

4、Make sure that DBI is installed on your Perl system and that the DBD of
the database you choose is also installed;

3、配置pgsnapshot

配置Makefile.sh,生成SQL文件。

1、Edit the Makefile.sh file and change the KEY variable to a better "secret"
value and the BASE_SCHEMA variable to where the base(internal) Pg::Snapshot tables should be placed.
Also remember to setup the remaining variables like SUPERUSER.

```

!/bin/sh

You may change the following variables to best suit your needs

KEY is the communication key between any two servers

or between a DBA workstation and a server

配置密钥(用于通讯)

KEY=Digoalpwd123321PGDB_2018

BASE_SCHEMA is the schema where all base(internal) tables of Pg::Snapshot will be placed

配置pgsnapshot的元数据、函数的SCHEMA

BASE_SCHEMA=_pgmvlog

LOCAL postgresql server superuser

配置订阅端的数据库超级用户名

SUPERUSER=postgres

You may not need to change anything beyond this line

if [ "$1" == "clean" ]; then
echo "Removing pgsnapshots.sql..."
rm -f pgsnapshots.sql
echo "Removing previously generated SQL drivers..."
find drivers -name snapshot.sql | xargs rm -f
echo "Cleaned."
exit 0
fi

function apply {
cat $1 | awk '/^INCLUDE .$/ { system("cat src/pl/"$2"")} !/^INCLUDE .$/ {print}' | sed "s/%BASE_SCHEMA%/$BASE_SCHEMA/g" | sed "s/%COMMUNICATION_KEY%/$KEY/g" | sed "s/%SUPERUSER%/$SUPERUSER/g"
}

IFS=' '
SQLS='pgsnapshots_tables.sql pgsnapshots_dblink.sql pgsnapshots_create_snapshot.sql pgsnapshots_drop_snapshot.sql pgsnapshots_refresh_snapshot.sql pgsnapshots_snapshotlog.sql'

rm -f pgsnapshots.sql
for F in $SQLS; do
#echo $F
apply src/sql/$F >> pgsnapshots.sql
done
IFS=$'\n\t '
for F in find drivers -name snapshot.template.sql; do
OUTFILE=echo "$F" | sed "s/\.template\.sql/.sql/"
cat $F | sed "s/%BASE_SCHEMA%/$BASE_SCHEMA/g" | sed "s/%COMMUNICATION_KEY%/$KEY/g" > $OUTFILE
done
echo "Done."
```

2、执行Makefile.sh ,生成SQL文件。

4、发布端

1、需要在发布端(例如你在源库的postgres库,有一些表要发布)创建plperlu函数语言,

```
postgres@pg11-test-> psql
psql (11.1)
Type "help" for help.

db1=# create language plperlu;
CREATE LANGUAGE
```

2、执行Makefile.sh生成的两个SQL文件。(在发布端(例如你在源库的postgres库,有一些表要发布))

psql -d <database> -h <host> -U <user> -f ./drivers/pg/snapshot.sql

psql -d <database> -h <host> -U <user> -f pgsnapshots.sql

对于ORACLE的发布端来说,需要执行如下

SQL> @./drivers/oracle/snapshot.sql

3、配置pg_hba.conf,允许订阅端访问发布端数据库。

```
vi $PGDATA/pg_hba.conf

host all all 0.0.0.0/0 md5

pg_ctl reload -D $PGDATA
```

4、发布要被订阅的表(注意表需要主键)。

在发布端的目标库执行((在发布端(例如你在源库的postgres库,有一些表要发布))),允许订阅端连接到发布端

select _pgmvlog.snapshot_do('<key>', 'ALLOW', '<masterschema>', '<mastername>', '<ip>');

对于ORACLE的发布端来说,需要执行如下

SQL> begin snapshot_do('<key>', 'ALLOW', '<masterschema>', '<mastername>', '<ip>'); end; /

Where: <key> is the "secret" value placed on the KEY variable inside the Makefile.sh file. Makefile.sh中配置的密码,用于物化视图通讯 <masterschema> is the schema name of the master table you wish to allow access to <mastername> is the name of the master table you wish to allow access to <ip> is the IP address of your workstation/server to whom you wish to give access 订阅端的出口IP地址

例子(发布src表,允许订阅端访问,假设订阅端的IP为192.168.3.198

```
db1=# create table src (id int primary key, info text, crt_time timestamp);
CREATE TABLE

db1=# select _pgmvlog.snapshot_do('Digoalpwd123321PGDB_2018', 'ALLOW', 'public', 'src', '192.168.3.198');
snapshot_do


t
(1 row)
```

5、创建一个角色,赋予发布表的读权限给这个用户,这个用户需要在订阅端使用,用来连接发布端,复制发布表的数据。

```
create role mv_ro login encrypted password 'abcdefg';

grant select on public.src to mv_ro;
```

5、订阅端

1、创建plperlu 函数语言

```
postgres@pg11-test-> psql
psql (11.1)
Type "help" for help.

db2=# create language plperlu;
CREATE LANGUAGE
```

2、创建SERVER,取个名字,比如server1(代表发布端连接方式,(假设发布端的IP为192.168.3.199,数据库监听端口为1921))。

```
db2=# SELECT _pgmvlog.create_dblink('server2', 'dbi:Pg:dbname=db1;host=192.168.3.199;port=1921', 'mv_ro', 'abcdefg', '{AutoCommit => 0}');
create_dblink


t
(1 row)

db2=# select * from _pgmvlog.pg_dblinks;
dblinkid | dblinkname | datasource | username | password | attributes | ctime
----------+------------+--------------------------------------------+--------------+----------+-------------------+----------------------------
1 | server2 | dbi:Pg:dbname=db1;host=192.168.3.199;port=1921 | mv_ro | abcdefg | {AutoCommit => 0} | 2018-12-17 21:54:29.55201
(1 rows)
```

3、创建物化视图。全量刷新。(有三种刷新方法COMPLETE , FORCE , FAST)

```
-- 创建物化视图
SELECT _pgmvlog.create_snapshot('public', 'mv_tbl_src', 'select * from public.src', 'server2', 'COMPLETE', null);

-- 刷新物化视图
SELECT _pgmvlog.refresh_snapshot('public', 'mv_tbl_src');
```

4、创建物化视图。增量刷新。

select _pgmvlog.drop_snapshot('public','mv_tbl_src'); -- 删除物化视图

发布端(创建MVLOG)

```
select _pgmvlog.create_snapshot_log('public','src', 'PRIMARY KEY');

db1=# \dp+
Access privileges
Schema | Name | Type | Access privileges | Column privileges | Policies
--------+--------------+-------+---------------------------+-------------------+----------
public | mlog$_src | table | postgres=arwdDxt/postgres+| |
| | | mv_ro=r/postgres | |
public | src | table | postgres=arwdDxt/postgres+| |
| | | mv_ro=r/postgres | |
(4 rows)
```

订阅端(创建增量刷新物化视图)

```
-- 创建物化视图
SELECT _pgmvlog.create_snapshot('public', 'mv_tbl_src', 'select * from public.src', 'server2', 'FAST', null);

-- 或

SELECT _pgmvlog.create_snapshot('public', 'mv_tbl_src', 'select * from public.src', 'server2', 'FORCE', null); -- 全清MVLOG

-- 创建物化视图索引
db2=# create unique index idx_mv_tbl_src_1 on mv_tbl_src (id);
CREATE INDEX

-- 增量刷新物化视图
SELECT _pgmvlog.refresh_snapshot('public', 'mv_tbl_src');
```

刷新速度

```
db1=# insert into src select generate_series(40002,400000), md5(random()::text), now();
INSERT 0 359999

db1=# \c db2
You are now connected to database "db2" as user "postgres".
db2=# SELECT _pgmvlog.refresh_snapshot('public', 'mv_tbl_src');
NOTICE: Refreshed 359999 records in 16 seconds.
refresh_snapshot


t
(1 row)
```

Oracle 增量同步到PG (在PG中创建ORACLE的物化视图)

类似。

pgsnapshot手册

SAMPLES

USAGE.txt

USAGE.html

注意

版本比较老,生产就不要用了,不过有兴趣的同学可以借鉴这个思路把这个插件再维护一下。

可从mysql, oracle, pg等增量同步到PG。

参考

《使用Londiste3 增量同步 线下PostgreSQL 到 阿里云RDS PG》

《londiste3 copy table' snapshot & PostgreSQL logical replication's snapshot 不同之处》

《Londiste 3 replicate case - 1 下节》

《Londiste 3 replicate case - 1 上节》

《Londiste3 Install》

https://www.postgresql.org/ftp/projects/pgFoundry/snapshot/snapshot/

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论