openGauss Daily Practice, Day 21

Original post by 少年中国强56, 2021-12-21

Learning objectives
Learn the openGauss storage models: row store and column store

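Row storage means a table is stored on disk row by row; column storage means it is stored column by column. By default, a newly created table uses row storage.

Each model has its strengths and weaknesses. Databases serving TP (transaction processing) workloads normally use the default row storage; column storage is worth using only for AP (analytical processing) workloads that run complex queries over large data volumes.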
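
To make the two layouts concrete, here is a minimal Python sketch. It is a toy in-memory model, not a description of how openGauss lays out pages on disk; the sample values and the helper names `distinct_from_row_store` and `distinct_from_column_store` are made up for illustration. The column names mirror the test table used in this exercise.

```python
# Toy model only: three records with the columns col1, col2, col3.
records = [
    ("AA", "first", 10),
    ("BB", "second", 20),
    ("AA", "third", 30),
]

# Row layout: all fields of one record sit next to each other.
row_store = list(records)

# Column layout: all values of one column sit next to each other.
column_store = {
    "col1": [r[0] for r in records],
    "col2": [r[1] for r in records],
    "col3": [r[2] for r in records],
}


def distinct_from_row_store(table, position):
    """Walk every full record and pick out one field."""
    return sorted({record[position] for record in table})


def distinct_from_column_store(table, name):
    """Touch only the list that holds the requested column."""
    return sorted(set(table[name]))


row_result = distinct_from_row_store(row_store, 0)
column_result = distinct_from_column_store(column_store, "col1")
```

Both calls return the same answer. The difference is how much data has to be visited, which is the intuition behind preferring column storage for analytical queries that read only a few columns.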

Course walkthrough
Connect to the database

# Wait 15 seconds on first entry
# Database is starting…
su - omm
gsql -r
1. Create a row-store table
CREATE TABLE test_t1
(
col1 CHAR(2),
col2 VARCHAR2(40),
col3 NUMBER
);
omm=# CREATE TABLE test_t1
omm-# (
omm(# col1 CHAR(2),
omm(# col2 VARCHAR2(40),
omm(# col3 NUMBER
omm(# );
CREATE TABLE
-- The compression attribute is no

omm=# insert into test_t1 select col1, col2, col3 from (select generate_series(1, 100000) as key, repeat(chr(int4(random() * 26) + 65), 2) as col1, repeat(chr(int4(random() * 26) + 65), 30) as col2, (random() * (10^4))::integer as col3);
INSERT 0 100000
2. Create a column-store table
CREATE TABLE test_t2
(
col1 CHAR(2),
col2 VARCHAR2(40),
col3 NUMBER
)
WITH (ORIENTATION = COLUMN);
omm=# CREATE TABLE test_t2
omm-# (
omm(# col1 CHAR(2),
omm(# col2 VARCHAR2(40),
omm(# col3 NUMBER
omm(# )
omm-# WITH (ORIENTATION = COLUMN);
CREATE TABLE

-- The compression attribute is low

insert into test_t2 select * from test_t1;

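3. Compare space usage
\d+
omm=# \d+
                               List of relations
 Schema |  Name   | Type  | Owner |  Size   |               Storage                | Description
--------+---------+-------+-------+---------+--------------------------------------+-------------
 public | test_t1 | table | omm   | 6760 kB | {orientation=row,compression=no}     |
 public | test_t2 | table | omm   | 24 kB   | {orientation=column,compression=low} |
(2 rows)

Note: the 24 kB shown for test_t2 should not be read as the compressed size of 100,000 rows. In this captured session the scan of test_t2 in step 4 returns rows=0, which suggests the table was still empty when \d+ was run. The homework section, where the insert into the column-store table is confirmed by INSERT 0 100000, gives a valid comparison: 3568 kB for the row-store table versus 328 kB for the column-store table.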

4. Compare the speed of reading one column
analyze VERBOSE test_t1;
analyze VERBOSE test_t2;
omm=# analyze VERBOSE test_t1;
INFO: analyzing "public.test_t1"(gaussdb pid=1)
INFO: ANALYZE INFO : "test_t1": scanned 841 of 841 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows(gaussdb pid=1)
ANALYZE
omm=# analyze VERBOSE test_t2;
INFO: analyzing "public.test_t2"(gaussdb pid=1)
INFO: ANALYZE INFO : estimate total rows of "pg_delta_16395": scanned 0 pages of total 0 pages with 1 retry times, containing 0 live rows and 0 dead rows, estimated 0 total rows(gaussdb pid=1)
ANALYZE
-- The column-store table takes less time than the row-store table

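explain analyze select distinct col1 from test_t1;
explain analyze select distinct col1 from test_t2;
omm=# explain analyze select distinct col1 from test_t1;
QUERY PLAN
------------------------------------------------------------
HashAggregate (cost=2091.00..2091.27 rows=27 width=3) (actual time=62.368..62.372 rows=27 loops=1)
  Group By Key: col1
  -> Seq Scan on test_t1 (cost=0.00..1841.00 rows=100000 width=3) (actual time=0.027..35.450 rows=100000 loops=1)
Total runtime: 62.435 ms
(4 rows)

omm=# explain analyze select distinct col1 from test_t2;
QUERY PLAN
------------------------------------------------------------
Row Adapter (cost=13.66..13.66 rows=200 width=12) (actual time=0.079..0.079 rows=0 loops=1)
  -> Vector Sonic Hash Aggregate (cost=11.66..13.66 rows=200 width=12) (actual time=0.078..0.078 rows=0 loops=1)
       Group By Key: col1
       -> CStore Scan on test_t2 (cost=0.00..10.47 rows=475 width=12) (actual time=0.014..0.014 rows=0 loops=1)
Total runtime: 0.179 ms
(5 rows)

Note that the CStore Scan on test_t2 reports actual rows=0, and the planner estimate is rows=475 rather than 100000, so in this session test_t2 appears to have been empty when it was queried. The 0.179 ms figure is therefore not a like-for-like comparison with the 62.435 ms measured on test_t1; the homework section compares the two layouts with both tables populated.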

5. Compare the speed of inserting one row
-- The row-store table takes less time than the column-store table

explain analyze insert into test_t1 values('x', 'xxxx', '123');
explain analyze insert into test_t2 values('x', 'xxxx', '123');
omm=# explain analyze insert into test_t1 values('x', 'xxxx', '123');
QUERY PLAN
------------------------------------------------------------
[Bypass]
Insert on test_t1 (cost=0.00..0.01 rows=1 width=0) (actual time=0.059..0.060 rows=1 loops=1)
  -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Total runtime: 0.140 ms
(4 rows)

omm=# explain analyze insert into test_t2 values('x', 'xxxx', '123');
QUERY PLAN
------------------------------------------------------------
Insert on test_t2 (cost=0.00..0.01 rows=1 width=0) (actual time=15.229..15.231 rows=1 loops=1)
  -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Total runtime: 15.315 ms
(3 rows)
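
One simplified way to see why a single-row insert favors the row layout: a row layout appends one record, while a column layout has to update one structure per column. The sketch below is an in-memory toy with hypothetical helper names (`insert_into_row_store`, `insert_into_column_store`). It only counts append operations and does not model openGauss internals, where the real cost involves more than this.

```python
COLUMNS = ("col1", "col2", "col3")


def insert_into_row_store(table, record):
    """Append the whole record in one step; return the number of appends."""
    table.append(record)
    return 1


def insert_into_column_store(table, record):
    """Append each field to its own column list; return the number of appends."""
    appends = 0
    for name, value in zip(COLUMNS, record):
        table[name].append(value)
        appends += 1
    return appends


# Start from empty toy tables in both layouts.
row_store = []
column_store = {name: [] for name in COLUMNS}

# Same values as the insert statement used in this step.
new_record = ("x", "xxxx", 123)
row_appends = insert_into_row_store(row_store, new_record)
column_appends = insert_into_column_store(column_store, new_record)
```

In this toy model the row layout needs one append and the column layout needs one per column, so the gap grows with the number of columns.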
6. Clean up
drop table test_t1;
drop table test_t2;
omm=# drop table test_t1;
DROP TABLE
omm=# drop table test_t2;
DROP TABLE
omm=#

Homework
1. Create a row-store table and a column-store table, and bulk-insert 100,000 rows (the same data in both tables)
omm=# create table t1 (id1 int,id2 int);
CREATE TABLE
omm=# create table t2(in1 int ,in2 int) WITH (ORIENTATION = COLUMN);
CREATE TABLE
omm=# insert into t1 select id1,id2 from (select generate_series(1,100000) as id1,int4(200+random()) as id2);
INSERT 0 100000
omm=# insert into t2 select * from t1;
INSERT 0 100000

2. Compare the size of the row-store table and the column-store table
omm=# \d+
                                 List of relations
 Schema |    Name    | Type  | Owner |    Size    |               Storage                | Description
--------+------------+-------+-------+------------+--------------------------------------+-------------
 public | customer_t | table | omm   | 8192 bytes | {orientation=row,compression=no}     |
 public | t1         | table | omm   | 3568 kB    | {orientation=row,compression=no}     |
 public | t2         | table | omm   | 328 kB     | {orientation=column,compression=low} |
(3 rows)
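
A rough intuition for why the column-store table is smaller: the values of one column are stored together, so regular patterns within a column can be encoded compactly. The Python sketch below applies delta encoding followed by run-length encoding to a column shaped like id1 (the integers 1 to 100000). These two techniques are chosen purely for illustration; nothing here claims they are what openGauss uses for compression=low, and the function names are hypothetical.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values as running sums of the deltas."""
    return list(accumulate(deltas))


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def run_length_decode(runs):
    """Expand [value, count] pairs back into a flat list."""
    return [value for value, count in runs for _ in range(count)]


# A column shaped like id1 in the homework: 1, 2, ..., 100000.
id1 = list(range(1, 100001))

encoded = run_length_encode(delta_encode(id1))
restored = delta_decode(run_length_decode(encoded))
```

Here the 100,000 values collapse into a single [value, count] pair, because every delta is 1. A row layout interleaves id1 with the other columns, which makes this kind of per-column pattern harder to exploit.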
3. Compare the speed of querying one column and inserting one row
omm=# explain analyze select * from t1 where id1=10;
QUERY PLAN
------------------------------------------------------------
Seq Scan on t1 (cost=0.00..1633.28 rows=476 width=8) (actual time=0.022..18.972 rows=1 loops=1)
  Filter: (id1 = 10)
  Rows Removed by Filter: 99999
Total runtime: 19.014 ms
(4 rows)

omm=# explain analyze select * from t2 where in1=10;
QUERY PLAN
------------------------------------------------------------
Row Adapter (cost=17.52..17.52 rows=11 width=8) (actual time=0.698..1.232 rows=1 loops=1)
  -> CStore Scan on t2 (cost=0.00..17.52 rows=11 width=8) (actual time=0.696..1.229 rows=1 loops=1)
       Filter: (in1 = 10)
       Rows Removed by Filter: 99999
Total runtime: 1.404 ms
(5 rows)

omm=# explain analyze insert into t1 values (1,2);
QUERY PLAN
------------------------------------------------------------
[Bypass]
Insert on t1 (cost=0.00..0.01 rows=1 width=0) (actual time=0.062..0.063 rows=1 loops=1)
  -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Total runtime: 0.167 ms
(4 rows)
omm=# explain analyze insert into t2 values(10,20);
QUERY PLAN
------------------------------------------------------------
Insert on t2 (cost=0.00..0.01 rows=1 width=0) (actual time=0.118..0.120 rows=1 loops=1)
  -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Total runtime: 0.242 ms
(3 rows)
4. Clean up
omm=# drop table t1;
DROP TABLE
omm=# drop table t2;
DROP TABLE

Last modified: 2022-01-04 14:58:19
