暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Vertica使用partitions表对分区大小进行汇总出现差异的原因

原创 simonchiang 2021-12-15
1079

Vertica使用partitions表对分区大小进行汇总出现差异的原因

使用partitions表对分区表的分区表数量进行汇总,部分分区表汇总结果和projection_storage结果差异巨大

table_name|acc_sum_MB| p_sum_MB |a ----------+----------+----------+---------------- s1.T1 |119,158,775 |1,768,641,038 |-1,649,482,263 s1.T2 |105,580,146 |105,580,146 |0 s1.T3 |43,350,583 |43,350,583 |0 s1.T4 |21,088 |5,032,638 |-5,011,550

原因在于partitions中的ros_id指定的ros中可能存放有多个分区的数据,在按分区汇总分析的过程中重复计算了大量的ros大小。

  • 正常情况下每个分区的数据一定是存放在不同的ROS中,但如果对该表使用mergeoutsingleros函数合并过文件,则会出现不同分区数据在一个ros的情况。

下面这个表既是如此:

dbadmin=> select ros_id,count(distinct partition_key) from partitions where table_schema = ‘s1’ and projection_name ilike ‘T1_b%’ group by 1 order by 2 desc limit 10 dbadmin-> ; ros_id | count --------------------±------ 45035998330997143 | 124 49539597958367061 | 124 54043197585737455 | 124 58546797213107695 | 124 63050396840478331 | 124 76561195722589955 | 124 81064795349960353 | 124 90071994604701407 | 124 99079193859442525 | 124 108086393114183323 | 124 (10 rows)

这个表的一个ros_id只有一个分区的数据,所以汇总出来的结果和projection_storage数据一致

dbadmin=> select ros_id,count(distinct partition_key) from partitions where table_schema = ‘s1’ and projection_name ilike ‘T1_b%’ group by 1 order by 2 desc limit 10; ros_id | count -------------------±------ 49539597991809079 | 1 72057596095218277 | 1 72057596206148613 | 1 81064795614503925 | 1 85568395111826421 | 1 90071994604700539 | 1 94575594232070745 | 1 94575594232070765 | 1 94575594232071131 | 1 94575594232071157 | 1 (10 rows)

可以通过 alter table … reorganize; 分开再汇总。

alter table s1.t1 reorganize; NOTICE 4785: Started background repartition table task ALTER TABLE dbadmin=> dbadmin=> select ros_id,count(distinct partition_key) from partitions where table_schema = ‘s1’ and projection_name ilike ‘T1_b%’ group by 1 order by 2 desc limit 10; ros_id | count -------------------±------ 45035998686463575 | 240 58546797569190371 | 240 63050397177898613 | 240 67553996804240455 | 240 81064795686208547 | 240 81064795686208721 | 240 90071994946351151 | 240 90071994946351323 | 240 94575594570276897 | 240 94575594570277079 | 240 (10 rows) dbadmin=> select ros_id,count(distinct partition_key) from partitions where table_schema = ‘s1’ and projection_name ilike ‘T1_b%’ group by 1 order by 2 desc limit 10; ros_id | count -------------------±------ 45035998711548941 | 1 54043197942002701 | 1 58546797589367811 | 1 58546797595422147 | 1 63050397202424841 | 1 67553996828519431 | 1 67553996828549109 | 1 72057596451331077 | 1 76561196059593735 | 1 76561196079090683 | 1 (10 rows) select foo.table_name,foo.sum_MB as acc_sum_MB,lish.sum_MB as p_sum_MB from (select ps.anchor_table_schema||’.’||ps.anchor_table_name table_name,sum(ps.used_bytes)//1024^2 sum_MB from projection_storage ps group by 1) foo left join (SELECT prj.projection_schema||’.’||prj.anchor_table_name table_name, SUM(par.ros_size_bytes)//1024^2 sum_MB FROM partitions par JOIN projections prj USING (projection_id) GROUP BY 1 ) lish on foo.table_name = lish.table_name WHERE foo.table_name = ‘s1.T1’ order by 2 desc; table_name | acc_sum_MB | p_sum_MB -------------------------------±-----------±--------- s1.T1 | 20739 | 20739 (1 row)

二维码.png

最后修改时间:2024-08-29 11:32:18
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论