10年+资深DBA&数据库架构师; OpenGauss技术布道专家; PG/MySQL内核技术推广者; 著有《Oracle高性能自动化运维》; 拥有多个国产数据库官方认证; 致力于开源MySQL/PG技术推广/国产信创交付落地以及最佳实践探索; 个人公众号:dba悠然。
一、pg_profile用途
pg_profile基于postgres标准统计视图,通过捕获数据库中被任意服务器进程执行的命令、收集关于表和索引访问的统计信息、监控对块读写次数、对用户定义函数使用的跟踪等四个方面来进行性能分析汇总展示。它类似于Oracle AWR架构,指定时间间隔生成快照,并提供html文档进行分析。
二、安装部署
1.依赖extension
pg_profile需要使用dblink和pg_stat_statements扩展插件,pg数据库自带插件,不需要额外下载编译。
2.安装部署
2.1安装pg_profile驱动
//postgres用户:
tar -zxvf pg_profile—0.3.6.tar.gz
cp pg_profile* usr/local/pgsq/share/extension
2.2配置参数
vi postgresql.conf
###pg_stat_statements
shared_preload_libraries = 'pg_stat_statements.so'
pg_stat_statements.max = 10000
pg_stat_statements.track = top
pg_stat_statements.save = on
pg_stat_statements.track_utility = on
track_io_timing = on
track_activity_query_size = 2048
###pg_profile
track_activities = on
track_counts = on
track_io_timing = on
track_wal_io_timing = on # Since Postgres 14
track_functions = all
复制
2.3创建extension
//pg_profile/pg_sata_statements/db_link
postgres=# CREATE EXTENSION dblink;
postgres=# CREATE EXTENSION pg_stat_statements;
postgres=# CREATE SCHEMA profile;
postgres=# CREATE EXTENSION pg_profile SCHEMA profile;
--查看extension
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
--------------------+---------+------------+--------------------------------------------------------------
dblink | 1.2 | public | connect to other PostgreSQL databases from within a database
pageinspect | 1.7 | public | inspect the contents of database pages at a low level
pg_buffercache | 1.3 | public | examine the shared buffer cache
pg_prewarm | 1.2 | public | prewarm relation data
pg_profile | 4.2 | profile | PostgreSQL load profile repository and report builder
pg_stat_statements | 1.7 | public | track execution statistics of all SQL statements executed
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(7 rows)
postgres=# \dx+ pg_profile
Objects in extension "pg_profile"
Object description
-----------------------------------------------------------------------------------------------------------
function profile.check_stmt_all_setting(integer,integer,integer)
function profile.check_stmt_cnt(integer,integer,integer)
function profile.cluster_stats(integer,integer,integer)
function profile.cluster_stats_diff_htbl(jsonb,integer,integer,integer,integer,integer)
function profile.cluster_stats_htbl(jsonb,integer,integer,integer)
function profile.cluster_stats_reset(integer,integer,integer)
function profile.cluster_stats_reset_diff_htbl(jsonb,integer,integer,integer,integer,integer)
function profile.cluster_stats_reset_htbl(jsonb,integer,integer,integer)
function profile.collect_obj_stats(jsonb,integer,integer,text,boolean)
function profile.collect_pg_stat_statements_stats(jsonb,integer,integer,integer)
function profile.collect_queries(oid,oid,bigint)
function profile.create_baseline(character varying,integer,integer,integer)
function profile.create_baseline(character varying,tstzrange,integer)
function profile.create_baseline(name,character varying,integer,integer,integer)
function profile.create_baseline(name,character varying,tstzrange,integer)
function profile.create_server(name,text,boolean,integer,text)
function profile.dbstats(integer,integer,integer,integer)
function profile.dbstats_diff_htbl(jsonb,integer,integer,integer,integer,integer,integer)
function profile.dbstats_htbl(jsonb,integer,integer,integer,integer)
function profile.dbstats_reset(integer,integer,integer)
function profile.dbstats_reset_diff_htbl(jsonb,integer,integer,integer,integer,integer)
function profile.dbstats_reset_htbl(jsonb,integer,integer,integer)
……
table profile.sample_settings
table profile.sample_stat_archiver
table profile.sample_stat_cluster
table profile.sample_stat_database
table profile.sample_stat_indexes
table profile.sample_stat_indexes_total
table profile.sample_stat_tables
table profile.sample_stat_tables_total
table profile.sample_stat_tablespaces
table profile.sample_stat_user_func_total
table profile.sample_stat_user_functions
table profile.sample_stat_wal
table profile.sample_statements
table profile.sample_statements_total
table profile.sample_timings
table profile.samples
table profile.servers
table profile.stmt_list
table profile.tables_list
table profile.tablespaces_list
view profile.v_sample_settings
view profile.v_sample_stat_indexes
view profile.v_sample_stat_tables
view profile.v_sample_stat_tablespaces
view profile.v_sample_stat_user_functions
view profile.v_sample_timings
(217 rows)
复制
3.创建快照
postgres=# SELECT profile.snapshot() ;
snapshot
------------------------
(local,OK,00:00:00.64)
(1 row)
复制
4.查询快照
postgres=# select profile.show_samples();
show_samples
-----------------------------------
(1,"2023-07-18 15:59:31-07",t,,,)
(2,"2023-07-18 15:59:33-07",t,,,)
(3,"2023-07-18 15:59:35-07",t,,,)
(4,"2023-07-18 15:59:36-07",t,,,)
(5,"2023-07-18 15:59:38-07",t,,,)
(6,"2023-07-18 15:59:41-07",t,,,)
(7,"2023-07-18 15:59:45-07",t,,,)
(7 rows)
postgres=# select * from profile.show_samples();
sample | sample_time | sizes_collected | dbstats_reset | clustats_reset | archstats_reset
--------+------------------------+-----------------+---------------+----------------+-----------------
1 | 2023-07-18 09:13:24-07 | t | | |
2 | 2023-07-18 09:13:26-07 | t | | |
3 | 2023-07-18 09:13:27-07 | t | | |
4 | 2023-07-18 09:13:28-07 | t | | |
5 | 2023-07-18 09:13:29-07 | t | | |
6 | 2023-07-18 09:13:29-07 | t | | |
7 | 2023-07-18 09:13:30-07 | t | | |
8 | 2023-07-18 09:14:13-07 | t | | |
9 | 2023-07-18 09:14:15-07 | t | | |
(9 rows)
复制
5.awr报告
5.1awr报告生成
psql -qtc “select profile.get_report(1,7)” -o 1-7.html
1-7.html
5.2awr报告对比
psql -qtc “select profile.get_diffreport(1,2,3,4)” -o awr_report_postgres_16_17.html
6. 异常处理
获取快照报错:“could not establish connection”
//pg_profile—0.3.6.sql
1)源码找到138行
..
CREATE FUNCTION take_sample(IN sserver_id integer, IN skip_sizes boolean
...
-- Server connection
PERFORM dblink_connect('server_connection',server_connstr); --FUNCTION take_sample -->line 138
复制
2)错误分析
调试dblink_connect方法无法连接数据库
3)查看配置pg_hba.conf
host all all 0.0.0.0/0 md5
4)修改认证方式trust
host all all 0.0.0.0/0 trust
5)问题解决
三、报告解析
分析报告从16:08开始,到16:32结束持续24分钟:
1. Server statistics
在采样时间24min内:
1)采样过程中增长了1089MB;
2)总共事务数6326008(4595161+1730846),平均每秒事务4393,事务回滚率为27%
1)ckpt_timeout 1500s
2)ckpt写时间耗费684秒
3)ckpt刷盘写数据2.69MB/s
4)平均归档日志生成量1.39MB/s
2. SQL query statistics
TOP SQL:
f3cbc714ba:执行1923 次,平均执行时长2145ms,最大执行时长6080ms;
cd47a0af8c执行1856次,平均执行时长2082ms,最大执行时长6656ms;
//详细SQL语句
SELECT A.fk_gson_coverbillno AS "gson_coverbillno",
A.fmodifierid AS "modifier",
A.fk_gson_bizday AS "gson_bizday",
B.fk_gson_entrylimit AS "entry.gson_entrylimit",
A.fk_gson_sourcesystem AS "gson_sourcesystem",
A.fk_audit_status AS "audit_status",
A.fk_gson_isfrombizsys AS "gson_isfrombizsys",
A.fauditdate AS "auditdate",
A.fsettlestatus AS "settlestatus",
A.fk_gson_srcbilltypeid AS "gson_srcbilltype",
A.fsettlementtypeid AS "settlementtype",
A.frecorgid AS "recorg",
B.ftaxrateid AS "entry.taxrateid",
A.fbookdate AS "bookdate",
A.fk_gson_absurdentry AS "gson_absurdentry",
B.fk_gson_businessperiod AS "_f_332806439",
A.fasstactid AS "asstact",
A.fk_auditable_amounts AS "auditable_amounts",
A.FId AS "id",
B.FEntryId AS "entry.id",
B.fk_auditedamount AS "entry.auditedamount",
A.fbillstatus AS "billstatus",
A.fisvoucher AS "isvoucher",
B.fk_gson_supplierid AS "entry.gson_supplier",
B.fk_gson_taxpayertype AS "_f_885504534",
A.ftax AS "tax",
B.fk_principal_amount AS "entry.principal_amount",
A.fsalesorgid AS "salesorg",
A.fk_gson_vouchernum AS "gson_vouchernum",
B.fk_business_items AS "entry.business_items",
B.fk_gson_recipients AS "entry.gson_recipients",
A.fk_gson_isunvoucher AS "gson_isunvoucher",
A.fbillno AS "billno",
A.fk_business_period AS "gson_business_period",
A.fisperiod AS "isperiod",
A.fk_gson_auditperiod AS "gson_auditperiod",
A.fmodifytime AS "modifytime",
B.fexpenseitemid AS "entry.e_expenseitem",
B.famount AS "entry.e_amount",
A.fremark AS "remark",
B.ftax AS "entry.e_tax",
B.fk_gson_contractbillno AS "_f_1120494784",
A.fbilltypeid AS "billtype",
A.fk_gson_easnumber AS "gson_easnumber",
B.frecamount AS "entry.e_recamount",
A.fcurrencyid AS "currency",
A.fk_gson_companyorgid AS "gson_companyorg",
B.funsettleamt AS "entry.e_unsettleamt",
A.fcreatetime AS "createtime",
A.famount AS "amount",
A.fcreatorid AS "creator",
A.forgid AS "org",
A.fk_gson_easvoucherstatus AS "gson_easvoucherstatus",
A.fasstacttype AS "asstacttype",
A.fauditorid AS "auditor",
A.fbizdate AS "bizdate",
A.funsettleamount AS "unsettleamount",
A.fk_gson_bizbillno AS "gson_bizbillno",
A.fk_audited_amount AS "audited_amount",
A.frecamount AS "recamount",
A.fsourcebillno AS "sourcebillno"
FROM t_ar_finarbill$156 a
LEFT OUTER JOIN t_ar_finarbillentry$156 b
ON B.FId = A.FId
WHERE (((A.fisvoucher = $1 AND (A.fbizdate >= $2)) AND (A.fbizdate < $3)) AND
A.forgid = $4)
ORDER BY A.fbizdate DESC LIMIT $5
复制
3.Schema object statistics
TOP表顺序读排序
TOP表block读取排序
TOP表DML排序
4.Vacuum-related statistics
5.Cluster settings during the report interval
6.TOP SQL分析及优化建议
结合SQL查询分析结果:
1) seq全表扫描
2) 基于AdoP(Parallel Hash),并行度2
3) Worker0/worker1 TOP排序内存95MB
4) a/b表读取大量buffers以及几十w rows
5) 基于字段fbizdate查询跨1年查询,fbizdate无索引
6) 表/索引膨胀
7) 统计信息缺失
优化思路:
1) 控制时间字段fbizdate查询跨度,1~3个月
2) 创建基于时间字段fbizedate索引
3) 增加work_mem大小(128MB->256MB)
4) 增加 shared_buffers大小 (1/4-1/2 OS mem)
5) random_page_cost设置为(4->1.1)
6) max_parallel_workers_per_gather(2->4)
7) max_parallel_maintenance_workers (2->4)
8) pg_repack/vacuum full analyze空间回收及统计信息收集更新