暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 实时健康监控 大屏 - 低频指标 - 珍藏级

digoal 2018-06-13
557

作者

digoal

日期

2018-06-13

标签

PostgreSQL , 大屏指标 , qps , long query , locks , active , idle in transaction , long idle in transaction , 2PC


背景

最关键的一些数据库健康指标,趋势监测。

1 数据年龄

年龄不能超过20亿(大概值),建议当达到15亿时,应尽快安排freeze。

postgres=# select datname,age(datfrozenxid) from pg_database order by 2 desc; datname | age -----------+----------- postgres | 172127964 template1 | 172127964 template0 | 172127964 test | 172127964 (4 rows)

2 年龄大于N的对象占用空间数

年龄大于12亿的数据库,占用的空间数

```
postgres=# select pg_size_pretty(sum(pg_database_size(oid))) from pg_database where age(datfrozenxid)>1200000000;
pg_size_pretty


(1 row)
```

3 freeze 风暴预测

《PostgreSQL Freeze 风暴预测续 - 珍藏级SQL》

4 表膨胀

https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c 'SELECT current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta, ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat, CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages, CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes, CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize, iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta, ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat, CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages, CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes, CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize, CASE WHEN relpages < otta THEN CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint) ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END END AS totalwastedbytes FROM ( SELECT nn.nspname AS schemaname, cc.relname AS tablename, COALESCE(cc.reltuples,0) AS reltuples, COALESCE(cc.relpages,0) AS relpages, COALESCE(bs,0) AS bs, COALESCE(CEIL((cc.reltuples*((datahdr+ma- (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta, COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages, COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols FROM pg_class cc JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$ LEFT JOIN ( SELECT ma,bs,foo.nspname,foo.relname, (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr, (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2 FROM ( SELECT ns.nspname, tbl.relname, hdr, ma, bs, SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth, MAX(coalesce(null_frac,0)) AS maxfracsum, hdr+( SELECT 1+count(*)/8 FROM pg_stats s2 WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname ) AS nullhdr FROM pg_attribute att JOIN pg_class tbl ON att.attrelid = tbl.oid JOIN pg_namespace ns ON ns.oid = tbl.relnamespace LEFT JOIN pg_stats s ON s.schemaname=ns.nspname AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname, ( SELECT (SELECT current_setting($$block_size$$)::numeric) AS bs, CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$) IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr, CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma FROM (SELECT version() AS v) AS foo ) AS constants WHERE att.attnum > 0 AND tbl.relkind=$$r$$ GROUP BY 1,2,3,4,5 ) AS foo ) AS rs ON cc.relname = rs.relname AND nn.nspname = rs.nspname LEFT JOIN pg_index i ON indrelid = cc.oid LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid ) AS sml order by wastedbytes desc limit 5' done

5 索引膨胀

https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c 'SELECT current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta, ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat, CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages, CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes, CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize, iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta, ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat, CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages, CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes, CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize, CASE WHEN relpages < otta THEN CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint) ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END END AS totalwastedbytes FROM ( SELECT nn.nspname AS schemaname, cc.relname AS tablename, COALESCE(cc.reltuples,0) AS reltuples, COALESCE(cc.relpages,0) AS relpages, COALESCE(bs,0) AS bs, COALESCE(CEIL((cc.reltuples*((datahdr+ma- (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta, COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages, COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols FROM pg_class cc JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$ LEFT JOIN ( SELECT ma,bs,foo.nspname,foo.relname, (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr, (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2 FROM ( SELECT ns.nspname, tbl.relname, hdr, ma, bs, SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth, MAX(coalesce(null_frac,0)) AS maxfracsum, hdr+( SELECT 1+count(*)/8 FROM pg_stats s2 WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname ) AS nullhdr FROM pg_attribute att JOIN pg_class tbl ON att.attrelid = tbl.oid JOIN pg_namespace ns ON ns.oid = tbl.relnamespace LEFT JOIN pg_stats s ON s.schemaname=ns.nspname AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname, ( SELECT (SELECT current_setting($$block_size$$)::numeric) AS bs, CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$) IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr, CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma FROM (SELECT version() AS v) AS foo ) AS constants WHERE att.attnum > 0 AND tbl.relkind=$$r$$ GROUP BY 1,2,3,4,5 ) AS foo ) AS rs ON cc.relname = rs.relname AND nn.nspname = rs.nspname LEFT JOIN pg_index i ON indrelid = cc.oid LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid ) AS sml order by wastedibytes desc limit 5' done

6 TOP SQL

《PostgreSQL 如何查找TOP SQL (例如IO消耗最高的SQL) - 珍藏级》

7 未使用索引

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c ' select current_database(),* from pg_stat_all_indexes where idx_scan=0 or idx_tup_read=0 or idx_tup_fetch=0; ' done

8 未使用(查询)表

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c ' select current_database(),* from pg_stat_all_tables where seq_scan=0 and idx_scan=0; ' done

9 热表

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c ' select current_database(),* from pg_stat_all_tables order by seq_scan+idx_scan desc limit 10; ' done

10 冷表

针对性删除表,或者使用OSS冷存储。

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c ' select current_database(),* from pg_stat_all_tables order by seq_scan+idx_scan limit 10; ' done

11 热索引

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c ' select current_database(),* from pg_stat_all_indexes order by idx_scan desc limit 10; ' done

12 冷索引

针对性删除索引。

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c ' select current_database(),* from pg_stat_all_indexes order by idx_scan limit 10; ' done

13 全表扫描次数TOP对象

针对性创建索引。

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c ' select current_database(),* from pg_stat_all_tables order by seq_scan desc limit 10; ' done

14 全表扫描返回记录数TOP对象

针对性创建索引。

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'` do psql -d $db --pset=pager=off -q -x -c ' select current_database(),* from pg_stat_all_tables order by seq_tup_read desc limit 10; ' done

15 监控统计信息不准的对象(表、物化视图)

背景知识

1、建立测试表

postgres=# create table a(id int); CREATE TABLE

2、禁止收集统计信息

postgres=# alter table a set (autovacuum_enabled =off); ALTER TABLE

3、写入大量数据

postgres=# insert into a select generate_series(1,10000000); INSERT 0 10000000

4、查看统计信息项,目前占用多少空间(pages)

```
postgres=# select relpages from pg_class where relname='a';
relpages


    0

(1 row)
```

5、查看真实空间占用

```
postgres=# select pg_relation_size('a');
pg_relation_size


    362479616

(1 row)
```

6、真实空间占用,转换为PAGES

```
postgres=# select pg_relation_size('a')/current_setting('block_size')::float8;
?column?


44248

(1 row)
```

7、收集统计信息,查看统计信息项,目前占用多少空间(pages)

```
postgres=# analyze a;
ANALYZE

postgres=# select relpages from pg_class where relname='a';
relpages


44248

(1 row)
```

8、根据以上原理,可以设计评估统计信息不准确的表 SQL如下:

```
select oid::regclass as table, relpages, pg_relation_size(oid)/current_setting('block_size')::float8 as real_pages from pg_class
where relkind in ('r', 'm') -- 表和物化视图
and pg_relation_size(oid) > 1048576 -- 大于1MB
and (pg_relation_size(oid)/current_setting('block_size')::float8 - relpages)/(pg_relation_size(oid)/current_setting('block_size')::float8) > 0.2; -- 大于 20% 偏差

table | relpages | real_pages
-------+----------+------------
a | 0 | 44248
(1 row)
```

内核增加功能

1、读写磁盘吞吐快照区间统计,区分索引,表,垃圾回收,FREEZE,AUTOANALYZE。分类统计。

2、锁等待时长快照区间统计,区分锁粒度,下钻到对象。

注意

stat的信息在使用pg_stat_reset()后会清零。请注意。

参考

https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh

《PostgreSQL Freeze 风暴预测续 - 珍藏级SQL》

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论