Greenplum 6已合并到PostgreSQL 9.3版本 - 比上一代GP提升：8倍读，195倍更新、删除 - 另有大量PG新特性

digoal 2018-09-26

380

作者

digoal

日期

2018-09-26

背景

Greenplum 6已合并到PostgreSQL 9.3版本 - 相比5性能提升：读8倍，更新、删除195倍 - 另有大量PG新特性，详见PostgreSQL release notes

https://www.postgresql.org/docs/11/static/release.html

例如

1、GIN、SPGIST 索引接口。在模糊查询、全文检索、向量相似计算方面终于有索引加速了。

2、异步事务，小事务写入有大幅提升。

3、物化视图，OLAP中很好用的功能。

4、事件触发器，基于事件控制细粒度DDL权限。

5、整体性能增强。

以上特性都是通过升级PostgreSQL版本加入的。

其他增强：

1、增加跨表的分布式死锁检测。

2、更新、删除由表级排他锁改成行级排他锁，大幅提升DML性能。

gpdb 6 部署示例

与如下同测试环境:

《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》

配置ECS虚拟机OS参数 (all host)

1、内核参数

```
vi /etc/sysctl.conf

add by digoal.zhou

fs.aio-max-nr = 1048576
fs.file-max = 76724600

可选：kernel.core_pattern = /data01/corefiles/core_%e_%u_%t_%s.%p

/data01/corefiles 事先建好，权限777，如果是软链接，对应的目录修改为777

kernel.sem = 4096 2147483647 2147483646 512000

信号量, ipcs -l 或 -u 查看，每16个进程一组，每组信号量需要17个信号量。

kernel.shmall = 107374182

所有共享内存段相加大小限制 (建议内存的80%)，单位为页。

kernel.shmmax = 274877906944

最大单个共享内存段大小 (建议为内存一半), >9.2的版本已大幅降低共享内存的使用，单位为字节。

kernel.shmmni = 819200

一共能生成多少共享内存段，每个PG数据库集群至少2个共享内存段

net.core.netdev_max_backlog = 10000
net.core.rmem_default = 262144

The default setting of the socket receive buffer in bytes.

net.core.rmem_max = 4194304

The maximum receive socket buffer size in bytes

net.core.wmem_default = 262144

The default setting (in bytes) of the socket send buffer.

net.core.wmem_max = 4194304

The maximum send socket buffer size in bytes.

net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_keepalive_intvl = 20
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_mem = 8388608 12582912 16777216
net.ipv4.tcp_fin_timeout = 5
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syncookies = 1

开启SYN Cookies。当出现SYN等待队列溢出时，启用cookie来处理，可防范少量的SYN攻击

net.ipv4.tcp_timestamps = 1

减少time_wait

net.ipv4.tcp_tw_recycle = 0

如果=1则开启TCP连接中TIME-WAIT套接字的快速回收，但是NAT环境可能导致连接失败，建议服务端关闭它

net.ipv4.tcp_tw_reuse = 1

开启重用。允许将TIME-WAIT套接字重新用于新的TCP连接

net.ipv4.tcp_max_tw_buckets = 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.tcp_wmem = 8192 65536 16777216

net.nf_conntrack_max = 1200000
net.netfilter.nf_conntrack_max = 1200000

vm.dirty_background_bytes = 409600000

系统脏页到达这个值，系统后台刷脏页调度进程 pdflush（或其他）自动将(dirty_expire_centisecs/100）秒前的脏页刷到磁盘

默认为10%，大内存机器建议调整为直接指定多少字节

vm.dirty_expire_centisecs = 3000

比这个值老的脏页，将被刷到磁盘。3000表示30秒。

vm.dirty_ratio = 95

如果系统进程刷脏页太慢，使得系统脏页超过内存 95 % 时，则用户进程如果有写磁盘的操作（如fsync, fdatasync等调用），则需要主动把系统脏页刷出。

有效防止用户进程刷脏页，在单机多实例，并且使用CGROUP限制单实例IOPS的情况下非常有效。

vm.dirty_writeback_centisecs = 100

pdflush（或其他）后台刷脏页进程的唤醒间隔， 100表示1秒。

vm.swappiness = 0

不使用交换分区

vm.mmap_min_addr = 65536
vm.overcommit_memory = 0

在分配内存时，允许少量over malloc, 如果设置为 1, 则认为总是有足够的内存，内存较少的测试环境可以使用 1 .

vm.overcommit_ratio = 90

当overcommit_memory = 2 时，用于参与计算允许指派的内存大小。

vm.zone_reclaim_mode = 0

禁用 numa, 或者在vmlinux中禁止.

net.ipv4.ip_local_port_range = 40000 65535

本地自动分配的TCP, UDP端口号范围

fs.nr_open=20480000

单个进程允许打开的文件句柄上限

以下参数请注意

vm.extra_free_kbytes = 4096000

vm.min_free_kbytes = 2097152 # vm.min_free_kbytes 建议每32G内存分配1G vm.min_free_kbytes

如果是小内存机器，以上两个值不建议设置

vm.nr_hugepages = 66536

建议shared buffer设置超过64GB时使用大页，页大小 /proc/meminfo Hugepagesize

vm.lowmem_reserve_ratio = 1 1 1

对于内存大于64G时，建议设置，否则建议默认值 256 256 32

```

2、资源限制

```
vi /etc/security/limits.conf

nofile超过1048576的话，一定要先将sysctl的fs.nr_open设置为更大的值，并生效后才能继续设置nofile.

soft nofile 1024000
hard nofile 1024000
soft nproc unlimited
hard nproc unlimited
soft core unlimited
hard core unlimited
soft memlock unlimited
hard memlock unlimited
```

3、关闭透明大页，使用精准时钟（可选）

```
vi /etc/rc.local

touch /var/lock/subsys/local

if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi

tsc 时钟

echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
```

chmod +x /etc/rc.d/rc.local

部署gpdb (all host)

1、部署依赖(root执行)

```
rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

yum install -y coreutils glib2 lrzsz mpstat dstat sysstat e4fsprogs xfsprogs ntp readline-devel zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl-devel perl-ExtUtils* openldap-devel jadetex openjade bzip2 curl-devel curl apr-devel apr cmake3 python git iotop perf gcc-c++ dstat bzip2-devel krb5-devel libcurl-devel libevent-devel libkadm5 libyaml-devel libxml2-devel openssl-devel perl-ExtUtils-Embed python-devel sysstat python-pip xerces-c-devel

pip install --upgrade pip

pip install paramiko pycrypto psutil lockfile pidlockfile
```

2、部署gpdb(root执行)

```
cd ~
git clone https://github.com/greenplum-db/gpdb
cd gpdb

pip install -r python-dependencies.txt

pip install -r python-developer-dependencies.txt

./configure --disable-orca --with-perl --with-python --with-libxml --prefix=/opt/gpdb6
make -j 128
make install
```

3、检查当前GPDB的pg版本

/opt/gpdb6/bin/psql -V psql (PostgreSQL) 9.3beta1

规划存储目录 (all host)

```
useradd digoal
passwd digoal

mkdir /data01/gpdb6
chown digoal:digoal /data01/gpdb6
```

初始化GPDB集群 (master host)

1、配置文件

```
vi /opt/gpdb6/greenplum_path.sh

追加

export MASTER_DATA_DIRECTORY=/data01/gpdb6/gp-1
export PGDATA=$MASTER_DATA_DIRECTORY
export PGHOST=127.0.0.1
export PGPORT=18000
export PGUSER=digoal
export PGPASSWORD=123
export PGDATABASE=postgres
```

2、环境变量

```
su - digoal

vi ~/.bash_profile

追加

. /opt/gpdb6/greenplum_path.sh
```

3、配置集群主机文件，包含所有节点(master, standby, segment, mirror hosts)

```
su - digoal

vi hostfile

digoal-citus-gpdb-test001
digoal-citus-gpdb-test002
digoal-citus-gpdb-test003
digoal-citus-gpdb-test004
digoal-citus-gpdb-test005
digoal-citus-gpdb-test006
digoal-citus-gpdb-test007
digoal-citus-gpdb-test008
digoal-citus-gpdb-test009
```

4、配置host认证

《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》

gpssh-exkeys -f ./hostfile

5、配置集群初始化文件

如果master host不想配置segment node，则需要修改一下以上hostfile，把master host去掉。

本例在所有主机上初始化segment.

```
vi cluster.conf

ARRAY_NAME="mpp1 cluster"
CLUSTER_NAME="mpp1 cluster"
MACHINE_LIST_FILE=hostfile
SEG_PREFIX=gp
DATABASE_PREFIX=gp
PORT_BASE=28000
declare -a DATA_DIRECTORY=(/data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6)
MASTER_HOSTNAME=digoal-citus-gpdb-test001
MASTER_DIRECTORY=/data01/gpdb6
MASTER_PORT=18000
IP_ALLOW=0.0.0.0/0
TRUSTED_SHELL=/usr/bin/ssh
CHECK_POINT_SEGMENTS=32
ENCODING=UNICODE
export MASTER_DATA_DIRECTORY
export TRUSTED_SHELL
DEFAULT_QD_MAX_CONNECT=250
QE_CONNECT_FACTOR=5
```

6、初始化集群

```
. /opt/gpdb6/greenplum_path.sh

gpinitsystem -c cluster.conf -h hostfile
```

7、配置参数

gpconfig -c max_connections -v 500 -m 400 gpconfig -c shared_buffers -v '1GB' gpconfig -c max_prepared_transactions -v '1500' gpconfig -c max_stack_depth -v '4MB' gpconfig -c vacuum_cost_delay -v '0' gpconfig -c synchronous_commit -v 'off' gpconfig -c wal_buffers -v '16MB' gpconfig -c wal_writer_delay -v '10ms' gpconfig -c checkpoint_segments -v '128' --skipvalidation gpconfig -c random_page_cost -v '1.3' gpconfig -c log_statement -v 'ddl' gpconfig -c vacuum_freeze_table_age -v '1200000000' gpconfig -c autovacuum_freeze_max_age -v '1300000000' --skipvalidation gpconfig -c autovacuum_vacuum_cost_delay -v '0' --skipvalidation gpconfig -c autovacuum -v 'on' --skipvalidation

重启实例

```
gpstop -M fast -a

gpstart -a
```

以下参数不允许修改，详见GUC文件

src/backend/utils/misc/guc.c

autovacuum autovacuum_freeze_max_age autovacuum_vacuum_cost_delay

GPDB 6 改进评测

1 支持异步事务

PostgreSQL 8.3 就有了，异步事务开启后，对于IO性能较差的盘，小事务的性能提升非常明显。

synchronous_commit = off wal_buffers = 16MB wal_writer_delay = 10ms

2 gin 倒排索引

GIN倒排索引，支持多值列（例如数组、JSON、HSTORE、全文检索），多列任意组合查询索引加速。

例子

```
postgres=# create table t(id int, c1 int[]);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE

postgres=# create index idx_t_1 on t using gin (c1);
CREATE INDEX

postgres=# create or replace function gen_rand_arr(int,int) returns int[] as $$
postgres$# select array(select (random()*$1)::int from generate_series(1,$2));
postgres$# $$ language sql strict;
CREATE FUNCTION

postgres=# select gen_rand_arr(100,10);
gen_rand_arr

{3,85,71,73,91,2,29,81,69,77}
(1 row)

postgres=# insert into t select id,gen_rand_arr(10000,10) from generate_series(1,10000000) t(id);
INSERT 0 10000000

postgres=# explain analyze select * from t where c1 @> array[1,2];
QUERY PLAN

Gather Motion 4:1 (slice1; segments: 4) (cost=6.50..7.81 rows=1 width=65) (actual time=0.340..0.426 rows=5 loops=2)
-> Bitmap Heap Scan on t (cost=6.50..7.81 rows=1 width=65) (actual time=0.329..0.335 rows=2 loops=2)
Recheck Cond: (c1 @> '{1,2}'::integer[])
-> Bitmap Index Scan on idx_t_1 (cost=0.00..6.50 rows=1 width=0) (actual time=0.259..0.259 rows=0 loops=2)
Index Cond: (c1 @> '{1,2}'::integer[])
(slice0) Executor memory: 322K bytes.
(slice1) Executor memory: 779K bytes avg x 4 workers, 907K bytes max (seg1). Work_mem: 33K bytes max.
Memory used: 128000kB
Optimizer: legacy query optimizer
Total runtime: 1.208 ms
(10 rows)

set enable_bitmapscan =off;

postgres=# explain analyze select * from t where c1 @> array[1,2];
QUERY PLAN

Gather Motion 4:1 (slice1; segments: 4) (cost=0.00..16.50 rows=1 width=65) (actual time=503.962..636.780 rows=5 loops=2)
-> Seq Scan on t (cost=0.00..16.50 rows=1 width=65) (actual time=34.927..588.086 rows=2 loops=2)
Filter: (c1 @> '{1,2}'::integer[])
(slice0) Executor memory: 322K bytes.
(slice1) Executor memory: 54K bytes avg x 4 workers, 54K bytes max (seg0).
Memory used: 128000kB
Optimizer: legacy query optimizer
Total runtime: 1273.949 ms
(8 rows)
```

相比全表扫描，使用GIN索引的查询性能提升上千倍。

3 spgist 索引，范围类型

spgist索引接口，以及范围类型。

```
postgres=# create table t(id int, rg int4range);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
postgres=# create index idx_t_1 on t using spgist (rg);
CREATE INDEX
postgres=# insert into t values (1, int4range(1,100));
INSERT 0 1
postgres=# insert into t values (2, int4range(101,200));
INSERT 0 1

postgres=# explain select * from t where rg @> 1;
QUERY PLAN

Gather Motion 4:1 (slice1; segments: 4) (cost=0.12..2.74 rows=1 width=18)
-> Index Scan using idx_t_1 on t (cost=0.12..2.74 rows=1 width=18)
Index Cond: (rg @> 1)
Optimizer: legacy query optimizer
(4 rows)

postgres=# select * from t where rg @> 1;
id | rg
----+---------
1 | [1,100)
(1 row)
```

范围类型在一些场景的应用

《会议室预定系统实践(解放开发) - PostgreSQL tsrange(时间范围类型) + 排他约束》

《PostgreSQL 黑科技 range 类型及 gist index 20x+ speedup than Mysql index combine query》

4 增加行级排他锁，优化分布式死锁检测

1、原有分布式死锁检测

```
postgres=# show deadlock_timeout ;
deadlock_timeout

1s
(1 row)

postgres=# create table a (id int, c1 int);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
Time: 156.785 ms
postgres=# create table b (id int, c1 int);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
Time: 87.583 ms

postgres=# insert into a values (1,1);
INSERT 0 1
Time: 58.180 ms
postgres=# insert into b values (1,1);
INSERT 0 1
Time: 72.548 ms

postgres=# begin;
BEGIN
postgres=# update b set c1=2 where id=1;
UPDATE 1
...
postgres=# update a set c1=2 where id=1;
UPDATE 1

postgres=# begin;
BEGIN
Time: 1.997 ms
postgres=# update a set c1=2 where id=1;
UPDATE 1
Time: 1.425 ms
...
postgres=# update b set c1=2 where id=1;
ERROR: deadlock detected
LINE 1: update b set c1=2 where id=1;
^
DETAIL: Process 26021 waits for ExclusiveLock on relation 36930 of database 12097; blocked by process 26091.
Process 26091 waits for ExclusiveLock on relation 36927 of database 12097; blocked by process 26021.
```

gpdb 6 与 gpdb 5 行为一致

```
postgres=# begin;
BEGIN
postgres=# update a set c1=2 where id=1;
UPDATE 1
...
postgres=# update b set c1=2 where id=1;
UPDATE 1

postgres=# begin;
BEGIN
postgres=# update b set c1=3 where id=1;
UPDATE 1
...
postgres=# update a set c1=3 where id=1;
ERROR: deadlock detected (seg2 127.0.0.1:24002 pid=3721)
DETAIL: Process 3721 waits for ShareLock on transaction 1306618; blocked by process 3703.
Process 3703 waits for ShareLock on transaction 1306619; blocked by process 3721.
HINT: See server log for query details.
```

2、gpdb6增加了行级锁

(gpdb6以前为表级排他锁)

对同一张表的delete\update操作，堵塞insert\update\delete

begin; update a set c1=2 where id=1;

堵塞其他会话对同一张表的如下操作：insert\update\delete：

update a set c1=3 where id=2; insert into a values (3,1); delete from a where id=2;

gpdb6

行级锁，以上操作不堵塞。

行级分布式死锁检测，参数gp_global_deadlock_detector_period

```
postgres=# show gp_global_deadlock_detector_period;
gp_global_deadlock_detector_period

2min
(1 row)

postgres=# create table a (id int, c1 int);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
postgres=# insert into a values (1,1),(2,2);
INSERT 0 2
postgres=# begin;
BEGIN
postgres=# update a set c1=2 where id=1;
UPDATE 1
...
postgres=# update a set c1=1 where id=2;
UPDATE 1

postgres=# begin;
BEGIN
postgres=# update a set c1=3 where id=2;
UPDATE 1
...
postgres=# update a set c1=3 where id=1;
ERROR: canceling statement due to user request: "cancelled by global deadlock detector"
```

5 TP性能提升

1亿数据量，TPCB，只读，读写混合测试。

```
pgbench -i -s 1000 -h 127.0.0.1 -p 15432 -U postgres postgres

pgbench -M simple -v -r -P 1 -c 16 -j 16 -h 127.0.0.1 -p 15432 -U postgres postgres -T 120 -S

pgbench -M simple -v -r -P 1 -c 16 -j 16 -h 127.0.0.1 -p 15432 -U postgres postgres -T 120
```

1 gpdb 5

```
postgres=# select version();
version

PostgreSQL 8.3.23 (Greenplum Database 5.10.2+7615c3b build ga) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.3.1 20170216 (Red Hat 6.3.1-3), 64-bit compiled on Aug 25 2018 08:21:26
(1 row)

postgres=# show gp_server_version;
gp_server_version

5.10.2+7615c3b build ga
(1 row)

postgres=# show gp_server_version_num;
gp_server_version_num

51002
(1 row)
```

1、tpcb 只读

transaction type: <builtin: select only> scaling factor: 1000 query mode: simple number of clients: 16 number of threads: 16 duration: 120 s number of transactions actually processed: 361911 latency average = 5.306 ms latency stddev = 0.854 ms tps = 3014.205774 (including connections establishing) tps = 3014.474824 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 0.002 \set aid random(1, 100000 * :scale) 5.303 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

2、tpcb 读写混合

transaction type: <builtin: TPC-B (sort of)> scaling factor: 1000 query mode: simple number of clients: 16 number of threads: 16 duration: 120 s number of transactions actually processed: 1822 latency average = 1057.754 ms latency stddev = 130.580 ms tps = 15.074113 (including connections establishing) tps = 15.075498 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 0.005 \set aid random(1, 100000 * :scale) 0.001 \set bid random(1, 1 * :scale) 0.001 \set tid random(1, 10 * :scale) 0.001 \set delta random(-5000, 5000) 2.417 BEGIN; 990.377 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; 7.147 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; 1.186 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; 1.081 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; 0.674 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); 54.865 END;

2 gpdb 6

```
psql
psql (9.3beta1)
Type "help" for help.

postgres=# select version();
version

PostgreSQL 9.3beta1 (Greenplum Database 6.0.0-alpha.0+dev.11201.gb2e98d4 build dev-oss) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit compiled on Sep 26 2018 15:00:09
(1 row)

postgres=# show gp_server_version;
gp_server_version

6.0.0-alpha.0+dev.11201.gb2e98d4 build dev-oss
(1 row)

postgres=# show gp_server_version_num;
gp_server_version_num

60000
(1 row)
```

1、tpcb 只读

transaction type: <builtin: select only> scaling factor: 1000 query mode: simple number of clients: 16 number of threads: 16 duration: 120 s number of transactions actually processed: 3006326 latency average = 0.639 ms latency stddev = 0.107 ms tps = 25052.487094 (including connections establishing) tps = 25056.694155 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 0.002 \set aid random(1, 100000 * :scale) 0.637 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

2、tpcb 读写混合

transaction type: <builtin: TPC-B (sort of)> scaling factor: 1000 query mode: simple number of clients: 16 number of threads: 16 duration: 120 s number of transactions actually processed: 351382 latency average = 5.465 ms latency stddev = 9.487 ms tps = 2927.029739 (including connections establishing) tps = 2927.497105 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 0.002 \set aid random(1, 100000 * :scale) 0.001 \set bid random(1, 1 * :scale) 0.001 \set tid random(1, 10 * :scale) 0.000 \set delta random(-5000, 5000) 0.195 BEGIN; 0.779 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; 0.692 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; 0.703 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; 0.685 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; 0.566 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); 1.838 END;

小结

Greenplum 6已合并到PostgreSQL 9.3版本，相比5性能提升：读8倍，更新、删除195倍，同时有大量PG新特性，例如

1、GIN、SPGIST 索引接口。在模糊查询、全文检索、向量相似计算方面支持索引加速。 1000万行的数组检索，性能提升了1000倍。

2、支持异步事务，小事务写入有大幅提升。

3、支持物化视图，OLAP中很好用的功能。

4、事件触发器，基于事件控制细粒度DDL权限。

5、整体性能增强，OLTP 读8倍，更新、删除195倍。

以上特性都是通过升级PostgreSQL版本加入的。

其他增强：

1、增加跨表的分布式死锁检测。

2、更新、删除由表级排他锁改成行级排他锁，大幅提升DML性能。表现在OLTP方面，更新、删除混合测试TPCB 提升了195倍。

参考

https://www.postgresql.org/docs/11/static/release.html

https://github.com/greenplum-db/gpdb

《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》

《会议室预定系统实践(解放开发) - PostgreSQL tsrange(时间范围类型) + 排他约束》

《PostgreSQL 黑科技 range 类型及 gist index 20x+ speedup than Mysql index combine query》

文章转载自digoal，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。

Greenplum 6已合并到PostgreSQL 9.3版本 - 比上一代GP提升：8倍读，195倍更新、删除 - 另有大量PG新特性

作者

日期

标签

背景

gpdb 6 部署示例

配置ECS虚拟机OS参数 (all host)

add by digoal.zhou

可选：kernel.core_pattern = /data01/corefiles/core_%e_%u_%t_%s.%p

/data01/corefiles 事先建好，权限777，如果是软链接，对应的目录修改为777

信号量, ipcs -l 或 -u 查看，每16个进程一组，每组信号量需要17个信号量。

所有共享内存段相加大小限制 (建议内存的80%)，单位为页。

最大单个共享内存段大小 (建议为内存一半), >9.2的版本已大幅降低共享内存的使用，单位为字节。

一共能生成多少共享内存段，每个PG数据库集群至少2个共享内存段

The default setting of the socket receive buffer in bytes.

The maximum receive socket buffer size in bytes

The default setting (in bytes) of the socket send buffer.

The maximum send socket buffer size in bytes.

开启SYN Cookies。当出现SYN等待队列溢出时，启用cookie来处理，可防范少量的SYN攻击

减少time_wait

如果=1则开启TCP连接中TIME-WAIT套接字的快速回收，但是NAT环境可能导致连接失败，建议服务端关闭它

开启重用。允许将TIME-WAIT套接字重新用于新的TCP连接

系统脏页到达这个值，系统后台刷脏页调度进程 pdflush（或其他） 自动将(dirty_expire_centisecs/100）秒前的脏页刷到磁盘

默认为10%，大内存机器建议调整为直接指定多少字节

比这个值老的脏页，将被刷到磁盘。3000表示30秒。

如果系统进程刷脏页太慢，使得系统脏页超过内存 95 % 时，则用户进程如果有写磁盘的操作（如fsync, fdatasync等调用），则需要主动把系统脏页刷出。

有效防止用户进程刷脏页，在单机多实例，并且使用CGROUP限制单实例IOPS的情况下非常有效。

pdflush（或其他）后台刷脏页进程的唤醒间隔， 100表示1秒。

不使用交换分区

在分配内存时，允许少量over malloc, 如果设置为 1, 则认为总是有足够的内存，内存较少的测试环境可以使用 1 .

当overcommit_memory = 2 时，用于参与计算允许指派的内存大小。

禁用 numa, 或者在vmlinux中禁止.

本地自动分配的TCP, UDP端口号范围

单个进程允许打开的文件句柄上限

以下参数请注意

vm.extra_free_kbytes = 4096000

vm.min_free_kbytes = 2097152 # vm.min_free_kbytes 建议每32G内存分配1G vm.min_free_kbytes

如果是小内存机器，以上两个值不建议设置

vm.nr_hugepages = 66536

建议shared buffer设置超过64GB时 使用大页，页大小 /proc/meminfo Hugepagesize

vm.lowmem_reserve_ratio = 1 1 1

对于内存大于64G时，建议设置，否则建议默认值 256 256 32

nofile超过1048576的话，一定要先将sysctl的fs.nr_open设置为更大的值，并生效后才能继续设置nofile.

tsc 时钟

部署gpdb (all host)

规划存储目录 (all host)

初始化GPDB集群 (master host)

追加

追加

GPDB 6 改进评测

1 支持异步事务

2 gin 倒排索引

3 spgist 索引，范围类型

4 增加行级排他锁，优化分布式死锁检测

5 TP性能提升

1 gpdb 5

2 gpdb 6

小结

参考

PostgreSQL 许愿链接

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

评论

相关阅读

系统脏页到达这个值，系统后台刷脏页调度进程 pdflush（或其他）自动将(dirty_expire_centisecs/100）秒前的脏页刷到磁盘

建议shared buffer设置超过64GB时使用大页，页大小 /proc/meminfo Hugepagesize