openGauss每日一练第21天｜学习openGauss存储模型-行存和列存

原创赵敬星 2021-12-21

1068

坚持学习openGauss数据库，坚持每天打卡。第二十一天学习openGauss存储模型-行存和列存。

连接openGauss

root@modb:~# su - omm
omm@modb:~$ gsql -r
gsql ((opengauss 2.0.0 build 78689da9) compiled at 2021-03-31 21:03:52 commit 0 last mr  )
non-ssl connection (ssl connection is recommended when requiring high-security)
type "help" for help.
omm=#

1.创建行存表和列存表，并批量插入10万条数据(行存表和列存表数据相同)

--创建行存表
omm=# create schema schema1;
CREATE SCHEMA
omm=# CREATE TABLE schema1.tab1
omm-# (
omm(# id CHAR(2),
omm(# name VARCHAR2(40),
omm(# num NUMBER
omm(# );
omm=# CREATE TABLE
omm=# insert into schema1.tab1 select id, name, num from (select generate_series(1, 100000) as key, repeat(chr(int4(random() * 26) + 65), 2) as  id, repeat(chr(int4(random() * 26) + 65), 30) as  name, (random() * (10^4))::integer as num);
INSERT 0 100000
--创建列存表
omm=# CREATE TABLE schema1.tab2
omm-# (
omm(# id CHAR(2),
omm(# name VARCHAR2(40),
omm(# num NUMBER
omm(# )
omm-# WITH (ORIENTATION = COLUMN);
CREATE TABLE
omm=# insert into schema1.tab2 select * from schema1.tab1;
INSERT 0 100000

2.对比行存表和列存表空间大小

omm=# show search_path ;
  search_path   
----------------
 "$user",public
(1 row)

omm=# set search_path to "$user",public,schema1;
SET
omm=# \d+
                                        List of relations
 Schema  | Name | Type  | Owner |    Size    |               Storage                | Description 
---------+------+-------+-------+------------+--------------------------------------+-------------
 public  | tab1 | table | omm   | 8192 bytes | {orientation=row,compression=no}     | 
 schema1 | tab2 | table | omm   | 1112 kB    | {orientation=column,compression=low} | 
(2 rows)

3.对比查询一列和插入一行的速度

omm=# analyze VERBOSE schema1.tab1;
INFO:  analyzing "schema1.tab1"(gaussdb pid=1)
INFO:  ANALYZE INFO : "tab1": scanned 841 of 841 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows(gaussdb pid=1)
ANALYZE
Time: 86.836 ms
omm=# analyze VERBOSE schema1.tab2;
INFO:  analyzing "schema1.tab2"(gaussdb pid=1)
INFO:  ANALYZE INFO : estimate total rows of "pg_delta_16501": scanned 0 pages of total 0 pages with 1 retry times, containing 0 live rows and 0 dead rows,  estimated 0 total rows(gaussdb pid=1)
INFO:  ANALYZE INFO : "tab2": scanned 2 of 2 cus, sample 30000 rows, estimated total 100000 rows(gaussdb pid=1)
ANALYZE
Time: 93.486 ms
omm=# explain analyze select distinct id from schema1.tab1;
                                                    QUERY PLAN                                                    
------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=2091.00..2091.27 rows=27 width=3) (actual time=52.099..52.102 rows=27 loops=1)
   Group By Key: id
   ->  Seq Scan on tab1  (cost=0.00..1841.00 rows=100000 width=3) (actual time=0.016..25.210 rows=100000 loops=1)
 Total runtime: 52.190 ms
(4 rows)

Time: 53.106 ms
omm=# explain analyze select distinct id from schema1.tab2;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Row Adapter  (cost=1008.27..1008.27 rows=27 width=3) (actual time=4.214..4.219 rows=27 loops=1)
   ->  Vector Sonic Hash Aggregate  (cost=1008.00..1008.27 rows=27 width=3) (actual time=4.212..4.213 rows=27 loops=1)
         Group By Key: id
         ->  CStore Scan on tab2  (cost=0.00..758.00 rows=100000 width=3) (actual time=0.035..0.301 rows=100000 loops=1)
 Total runtime: 4.313 ms
(5 rows)

Time: 4.964 ms
omm=# explain analyze insert into schema1.tab1 values(1, 'z', 123);
                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 [Bypass]
 Insert on tab1  (cost=0.00..0.01 rows=1 width=0) (actual time=0.069..0.070 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
 Total runtime: 0.170 ms
(4 rows)

Time: 2.672 ms
omm=# explain analyze insert into schema1.tab2 values(1, 'z', 123);
                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 Insert on tab2  (cost=0.00..0.01 rows=1 width=0) (actual time=3.685..3.687 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
 Total runtime: 3.777 ms
(3 rows)

Time: 5.039 ms

4.清理数据

omm=# drop schema schema1 cascade;
NOTICE:  drop cascades to 2 other objects
DETAIL:  drop cascades to table schema1.tab1
drop cascades to table schema1.tab2
DROP SCHEMA
Time: 5.916 ms

通过学习openGauss存储模型-行存和列存，了解到行存储是指将表按行存储到硬盘分区上，列存储是指将表按列存储到硬盘分区上。默认情况下，创建的表为行存储。
行、列存储模型各有优劣，通常用于TP场景的数据库，默认使用行存储，仅对执行复杂查询且数据量大的AP场景时，才使用列存储。

opengauss

最后修改时间：2021-12-21 10:53:17

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

文章被以下合辑收录

信创数据库（共24篇）

信创数据库