暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 20200819当天代码 - 14 对比 13 高并发性能优化 数据对比 - get snapshot improve

digoal 2020-08-17
1190

作者

digoal

日期

2020-08-17

标签

PostgreSQL , snapshot , 高并发性能优化


背景

https://www.citusdata.com/blog/2020/10/25/improving-postgres-connection-scalability-snapshots/

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=73487a60fc1063ba4b5178b69aee4ee210c182c4

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5788e258bb26495fab65ff3aa486268d1c50b123

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=941697c3c1ae5d6ee153065adb96e1e63ee11224

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1f51c17c68d05c28d5b9294d8013cb9e7e653160

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=dc7420c2c9274a283779ec19718d2d16323640c0

《PostgreSQL 14 GetSnapshotData 高并发优化, 以及 64位xid避免xid wrap优化》

对比数据来自 20200817 当天的PG 13和14代码版本.

https://www.postgresql.org/ftp/snapshot/

测试case

./configure --prefix=/home/digoal/pg? --enable-debug gmake world gmake install-world initdb -D $PGDATA -U postgres -E UTF8 --lc-collate=C --lc-ctype=en_US.utf8

```
sysctl -w vm.nr_hugepages=8700

postgresql.conf

listen_addresses = '0.0.0.0'
port = 41921
max_connections = 10000
superuser_reserved_connections = 3
unix_socket_directories = '/tmp,.'
tcp_keepalives_idle = 60
tcp_keepalives_interval = 10
tcp_keepalives_count = 10
shared_buffers = 16GB
huge_pages = on
maintenance_work_mem = 1GB
dynamic_shared_memory_type = posix
vacuum_cost_delay = 0
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 10.0
effective_io_concurrency = 0
max_worker_processes = 128
max_parallel_workers_per_gather = 0
wal_level = minimal
synchronous_commit = off
full_page_writes = on
wal_compression = on
wal_buffers = 16MB
wal_writer_delay = 10ms
max_wal_size = 64GB
min_wal_size = 8GB
max_wal_senders = 0
random_page_cost = 1.1
effective_cache_size = 256GB
log_destination = 'csvlog'
logging_collector = on
log_truncate_on_rotation = on
log_checkpoints = on
log_timezone = 'Asia/Shanghai'
log_autovacuum_min_duration = 0
autovacuum_vacuum_cost_delay = 0ms
datestyle = 'iso, mdy'
timezone = 'Asia/Shanghai'
lc_messages = 'en_US.utf8'
lc_monetary = 'en_US.utf8'
lc_numeric = 'en_US.utf8'
lc_time = 'en_US.utf8'
default_text_search_config = 'pg_catalog.english'
```

```
pgbench -i -s 1000

pgbench -M prepared -n -r -P 5 -c ? -j ? -T 120 -S

pgbench -M prepared -n -r -P 5 -c ? -j ? -T 120
```

对比

52C 104线程 384GB机器. PG 14 tpcb 1亿数据量 select only 200万qps. 峰值, 2000连接后, 性能提升20%以上.

Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 104 On-line CPU(s) list: 0-103 Thread(s) per core: 2 Core(s) per socket: 26 Socket(s): 2 NUMA node(s): 2 Vendor ID: GenuineIntel CPU family: 6 Model: 85 Model name: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz Stepping: 7 CPU MHz: 1665.649 CPU max MHz: 2500.0000 CPU min MHz: 1200.0000 BogoMIPS: 5000.00 Hypervisor vendor: KVM Virtualization type: full L1d cache: 32K L1i cache: 32K L2 cache: 1024K L3 cache: 36608K NUMA node0 CPU(s): 0-51 NUMA node1 CPU(s): 52-103 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 ida arat avx512_vnni

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=623a9ba79bbdd11c5eccb30b8bd5c446130e521c

``` snapshot scalability: cache snapshots using a xact completion counter. author Andres Freund andres@anarazel.de
Tue, 18 Aug 2020 12:07:10 +0800 (21:07 -0700) committer Andres Freund andres@anarazel.de
Tue, 18 Aug 2020 12:08:30 +0800 (21:08 -0700) commit 623a9ba79bbdd11c5eccb30b8bd5c446130e521c tree ed9f9fc1fd58e33a7c7d1fe5c037e08f64d6ddcd tree | snapshot parent 51300b45db95b6fd29f88534ab0739fdc9df1699 commit | diff snapshot scalability: cache snapshots using a xact completion counter.

Previous commits made it faster/more scalable to compute snapshots. But not building a snapshot is still faster. Now that GetSnapshotData() does not maintain RecentGlobal* anymore, that is actually not too hard:

This commit introduces xactCompletionCount, which tracks the number of top-level transactions with xids (i.e. which may have modified the database) that completed in some form since the start of the server.

We can avoid rebuilding the snapshot's contents whenever the current xactCompletionCount is the same as it was when the snapshot was originally built. Currently this check happens while holding ProcArrayLock. While it's likely possible to perform the check without acquiring ProcArrayLock, it seems better to do that separately / later, some careful analysis is required. Even with the lock this is a significant win on its own.

On a smaller two socket machine this gains another ~1.03x, on a larger machine the effect is roughly double (earlier patch version tested though). If we were able to safely avoid the lock there'd be another significant gain on top of that.

Author: Andres Freund andres@anarazel.de Reviewed-By: Robert Haas robertmhaas@gmail.com Reviewed-By: Thomas Munro thomas.munro@gmail.com Reviewed-By: David Rowley dgrowleyml@gmail.com Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de ```

测试仅针对getsnapshot损耗在整体损耗中占比较高的场景.

模型 | 连接数 | PG 13 tps | PG 14 tps | 14性能提升比率
---|---|---|---|---
只读 1亿 -c 52 -j 52 -T 120 | 52 | 1234936 | 1582846 | 28.2%
只读 1亿 -c 104 -j 104 -T 120 | 104 | 1611162 | 2062788 | 28%
只读 1亿 -c 208 -j 104 -T 120 | 208 | 1803801 | 2045995 | 13.4%
只读 1亿 -c 512 -j 64 -T 120 | 512 | 1713561 | 1839027 | 7.3%
只读 1亿 -c 2048 -j 64 -T 360 | 2048 | 1279550 | 1686426 | 31.8%
只读 1亿 -c 5000 -j 50 -T 360 | 5000 | 1164421 | 1595496 | 37%

PostgreSQL 14 GetSnapshotData 的优化代码还没有完全合并到master, 后期数据应该更好,

https://www.postgresql.org/message-id/flat/20200301083601.ews6hz5dduc3w2se%40alap3.anarazel.de

```
以下是作者自己分支的测试数据

conns tps master tps pgxact-split
1 26842.492845 26524.194821
10 246923.158682 249224.782661
50 695956.539704 709833.746374
100 1054727.043139 1903616.306028
200 964795.282957 1949200.338012
300 906029.377539 1927881.231478
400 845696.690912 1911065.369776
500 812295.222497 1926237.255856
600 888030.104213 1903047.236273
700 866896.532490 1886537.202142
800 863407.341506 1883768.592610
900 871386.608563 1874638.012128
1000 887668.277133 1876402.391502
1500 860051.361395 1815103.564241
2000 890900.098657 1775435.271018
3000 874184.980039 1653953.817997
4000 845023.080703 1582582.316043
5000 817100.195728 1512260.802371
```

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论