在继续阅读这篇文章之前,请确保阅读了关于PostgreSQL中TOASTing策略的最后一篇文章,至少您熟悉前面的主题。我们研究的策略是一个重要的实现细节,这在TOASTing中需要注意,直到我们知道TOASTing实际上是如何工作的,以及当数据从主表中移出时,数据将流向何处。
让我们从一张小表开始:
postgres=# create table t ( a int, b text );
CREATE TABLE
postgres=# select attname, atttypid::regtype,attcompression,
case attstorage when 'p' then 'plain'
when 'e' then 'external'
when 'm' then 'main'
when 'x' then 'extended'
end AS strategy
from pg_attribute
where attrelid = 't'::regclass and attnum > 0;
attname | atttypid | attcompression | strategy
---------+----------+----------------+----------
a | integer | | plain
b | text | | extended
(2 rows)
我们已经知道列“a”可能会被压缩,列“b”可能会压缩并移出行。PostgreSQL,因为列“b”可能包含必须移出主表的数据,所以已经在后台创建了一个TOAST表。如果你不知道这一点,你可能永远不会注意到,因为它是透明发生的。问题是,该表是在哪里创建的,您如何查看它?它肯定不是在与表相同的模式中创建的,因为我们看不到它:
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+------+-------+----------
public | t | table | postgres
(1 row)
TOAST表进入一个名为“pg_TOAST”的特殊表中,您会注意到已经有很多其他TOAST表格和索引:
postgres=# \dnS
List of schemas
Name | Owner
--------------------+-------------------
information_schema | postgres
pg_catalog | postgres
pg_toast | postgres
public | pg_database_owner
(4 rows)
postgres=# \d pg_toast.pg_toast_
pg_toast.pg_toast_1213 pg_toast.pg_toast_1418_index pg_toast.pg_toast_2619 pg_toast.pg_toast_3456_index
pg_toast.pg_toast_1213_index pg_toast.pg_toast_16418 pg_toast.pg_toast_2619_index pg_toast.pg_toast_3466
pg_toast.pg_toast_1247 pg_toast.pg_toast_16418_index pg_toast.pg_toast_2620 pg_toast.pg_toast_3466_index
pg_toast.pg_toast_1247_index pg_toast.pg_toast_2328 pg_toast.pg_toast_2620_index pg_toast.pg_toast_3592
pg_toast.pg_toast_1255 pg_toast.pg_toast_2328_index pg_toast.pg_toast_2964 pg_toast.pg_toast_3592_index
pg_toast.pg_toast_1255_index pg_toast.pg_toast_2396 pg_toast.pg_toast_2964_index pg_toast.pg_toast_3596
pg_toast.pg_toast_1260 pg_toast.pg_toast_2396_index pg_toast.pg_toast_3079 pg_toast.pg_toast_3596_index
pg_toast.pg_toast_1260_index pg_toast.pg_toast_2600 pg_toast.pg_toast_3079_index pg_toast.pg_toast_3600
pg_toast.pg_toast_1262 pg_toast.pg_toast_2600_index pg_toast.pg_toast_3118 pg_toast.pg_toast_3600_index
pg_toast.pg_toast_1262_index pg_toast.pg_toast_2604 pg_toast.pg_toast_3118_index pg_toast.pg_toast_6000
pg_toast.pg_toast_13378 pg_toast.pg_toast_2604_index pg_toast.pg_toast_3256 pg_toast.pg_toast_6000_index
pg_toast.pg_toast_13378_index pg_toast.pg_toast_2606 pg_toast.pg_toast_3256_index pg_toast.pg_toast_6100
pg_toast.pg_toast_13383 pg_toast.pg_toast_2606_index pg_toast.pg_toast_3350 pg_toast.pg_toast_6100_index
pg_toast.pg_toast_13383_index pg_toast.pg_toast_2609 pg_toast.pg_toast_3350_index pg_toast.pg_toast_6106
pg_toast.pg_toast_13388 pg_toast.pg_toast_2609_index pg_toast.pg_toast_3381 pg_toast.pg_toast_6106_index
pg_toast.pg_toast_13388_index pg_toast.pg_toast_2612 pg_toast.pg_toast_3381_index pg_toast.pg_toast_6243
pg_toast.pg_toast_13393 pg_toast.pg_toast_2612_index pg_toast.pg_toast_3394 pg_toast.pg_toast_6243_index
pg_toast.pg_toast_13393_index pg_toast.pg_toast_2615 pg_toast.pg_toast_3394_index pg_toast.pg_toast_826
pg_toast.pg_toast_1417 pg_toast.pg_toast_2615_index pg_toast.pg_toast_3429 pg_toast.pg_toast_826_index
如果满足以下条件,则所有名称都命名为“pg_toast_”,后跟toast表的OID:
postgres=# select oid from pg_class where relname = 't';
oid
-------
16418
(1 row)
postgres=# \d pg_toast.pg_toast_16418
TOAST table "pg_toast.pg_toast_16418"
Column | Type
------------+---------
chunk_id | oid
chunk_seq | integer
chunk_data | bytea
Owning table: "public.t"
Indexes:
"pg_toast_16418_index" PRIMARY KEY, btree (chunk_id, chunk_seq)
使用这种方法来标识TOAST表时要小心,表的OID可能会更改。识别TOAST表的正确方法是询问pg_class:
postgres=# select relname,reltoastrelid::regclass
from pg_class
where relname = 't'; \gset
relname | reltoastrelid
---------+-------------------------
t | pg_toast.pg_toast_16418
(1 row)
回到TOAST表的结构:
postgres=# \d :reltoastrelid
TOAST table "pg_toast.pg_toast_16418"
Column | Type
------------+---------
chunk_id | oid
chunk_seq | integer
chunk_data | bytea
Owning table: "public.t"
Indexes:
"pg_toast_16418_index" PRIMARY KEY, btree (chunk_id, chunk_seq)
PostgreSQL总是使用主键(chunk_id,chunk_seq)来访问数据。更准确地说,主键用于检索以二进制格式存储在字节列中的数据“块”,主表包含指向chunk_id的指针。这绝对有道理,因为数据要么在移出行之前被压缩,要么数据已经是二进制格式。PostgreSQL处理大数据的方法是,将数据切成更小的“块”,然后这些块进入TOAST表。这样可以使主表保持较小,并且只要您不要求TOAST数据,就不需要转到TOAST表。这应该已经告诉您,对于包含TOASTed数据的表,您应该避免使用“select*”,甚至更好的是,完全避免使用“select*”。
今天就到这里。在下一篇文章中,我们将详细介绍TOASTing的内部工作原理。
原文标题:TOASTing in PostgreSQL, TOAST tables
原文作者:Daniel Westermann
原文链接:https://www.dbi-services.com/blog/toasting-in-postgresql-toast-tables/




