作者
digoal
日期
2021-12-24
标签
PostgreSQL , 热门问题
- 问题说明(现象、环境)
- 分析原因
- 结论和解决办法
24、为什么与检索字段类型不一致的输入条件有时可能不能采用索引?
https://www.bilibili.com/video/BV1ju411m7J9/
例如:
- 被索引的字段是int4, 但是输入的where 条件是int8.
- 被索引的字段是int4, 但是输入的where 条件是text.
- 被索引的字段是numeric, 但是输入的where 条件是text.
- 被索引的字段是numeric, 但是输入的where 条件是float.
- 被索引的字段是float, 但是输入的where 条件是numeric.
例子:
create table a (id int4, c1 numeric, c2 timestamp, c3 text, c4 float); create index idx_a_1 on a (id); create index idx_a_2 on a (c1); create index idx_a_3 on a (c2); create index idx_a_4 on a (c3); create index idx_a_5 on a (c4); insert into a select generate_series(1,1000000), random(), clock_timestamp(), (random()*100)::int, random(); analyze a;
复制
-- 变量text类型可能超过id(int)的边界, 系统没有做这个默认的类型转换, 没有把text默认转换为int -- 有必要我们可以设置这样的默认转换 explain select * from a where id='1'::text; ERROR: operator does not exist: integer = text LINE 1: explain select * from a where id='1'::text; ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. -- 变量被识别为int4 postgres=# explain select * from a where id='1'; QUERY PLAN ------------------------------------------------------------------ Index Scan using idx_a_1 on a (cost=0.42..2.64 rows=1 width=33) Index Cond: (id = 1) (2 rows) -- 用到了跨类型的OP postgres=# explain select * from a where id='1'::int8; QUERY PLAN ------------------------------------------------------------------ Index Scan using idx_a_1 on a (cost=0.42..2.64 rows=1 width=33) Index Cond: (id = '1'::bigint) (2 rows) -- 变量int8被隐式转换为numeric postgres=# explain select * from a where c1='1'::int8; QUERY PLAN ------------------------------------------------------------------ Index Scan using idx_a_2 on a (cost=0.42..2.64 rows=1 width=33) Index Cond: (c1 = '1'::numeric) (2 rows) -- 变量text类型可能超过c1(numeric)的边界, 系统没有做这个默认的类型转换, 没有把text默认转换为numeric -- 有必要我们可以设置这样的默认转换 postgres=# explain select * from a where c1='1'::text; ERROR: operator does not exist: numeric = text LINE 1: explain select * from a where c1='1'::text; ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. -- 变量float和numeric都被转换为float8 -- 由于索引对应字段的类型发生了转换, 所以一定用不了索引. 因为索引里存储的是numeric的binary value. -- 可以参考: [《每天5分钟,PG聊通透 - 系列1 - 热门问题 - 链接、驱动、SQL - 第23期 - 为什么有的函数不能被用来创建表达式索引?》](https://github.com/digoal/blog202112/20211224_04.md) postgres=# explain select * from a where c1='1'::float; QUERY PLAN ------------------------------------------------------------ Seq Scan on a (cost=0.00..23354.00 rows=5000 width=33) Filter: ((c1)::double precision = '1'::double precision) (2 rows) -- 除非有一个表达式索引(里面存储的是double precision, 那么就能用到索引). postgres=# create index idx_a_6 on a ((c1::float8)); CREATE INDEX postgres=# explain select * from a where c1='1'::float; QUERY PLAN ------------------------------------------------------------------------ Index Scan using idx_a_6 on a (cost=0.42..4337.23 rows=5000 width=33) Index Cond: ((c1)::double precision = '1'::double precision) (2 rows) -- numeric虽然是float8的超集, 但是系统做了隐式转换 postgres=# explain select * from a where c4='1'::numeric; QUERY PLAN ------------------------------------------------------------------ Index Scan using idx_a_5 on a (cost=0.42..2.64 rows=1 width=33) Index Cond: (c4 = '1'::double precision) (2 rows)
复制
是否使用索引:
1、列出所有的同名op, 有没有op直接支持这两种类型?
- 有就跳到3, 去判断索引的ops里面有没有包含这个op.
- 没有就跳到2, 去寻求隐式转换.
2、在这些op中, 找到与字段类型一致的op, 并且尝试将变量类型转换为op里面对应的类型. (要求存在隐式转换).
例如1 select * from a where id='1'::int8;
要的是 =(numeric,int8)
找到所有的 =(numeric,*)
pg_catalog | = | numeric | numeric | boolean | equal
复制
只有一个操作符符合条件 =(numeric,numeric)
接下来看看有没有可以将int8隐式转换为numeric的cast?
postgres=# \dC+ numeric List of casts Source type | Target type | Function | Implicit? | Description ------------------+------------------+----------+---------------+------------- bigint | numeric | numeric | yes |
复制
找到了. 进入转换
=(numeric,numeric)
是否支持索引呢?
支持, 由于转换的是变量, c1没有被转换, 所以最后用到了索引.
例如2 select * from a where c1='1'::float;
要的是 =(numeric,float)
找到所有的 =(numeric,*)
pg_catalog | = | numeric | numeric | boolean | equal
复制
只有一个操作符符合条件 =(numeric,numeric)
接下来看看有没有可以将float4隐式转换为numeric的cast?
postgres=# \dC+ float4 List of casts Source type | Target type | Function | Implicit? | Description ------------------+------------------+----------+---------------+------------- bigint | real | float4 | yes | double precision | real | float4 | in assignment | integer | real | float4 | yes | jsonb | real | float4 | no | numeric | real | float4 | yes | real | bigint | int8 | in assignment | real | double precision | float8 | yes | real | integer | int4 | in assignment | real | numeric | numeric | in assignment | real | smallint | int2 | in assignment | smallint | real | float4 | yes | (11 rows)
复制
没有, 但是发现float4可以隐式转换为double precision. 所以查找有没有 =(*,double precision)
发现
pg_catalog | = | double precision | double precision | boolean | equal pg_catalog | = | real | double precision | boolean | equal
复制
所以, 查找字段类型numeric是否能隐式转换为double precision或者real?
numeric | real | float4 | yes | numeric | double precision | float8 | yes |
复制
找到了, 所以最后选择了其中之一的 =(double precision,double precision)
由于c1被转换, 所以最后不能使用索引.
3、创建索引时, 使用的ops是哪个? 这个ops支持哪些操作符(op), 如果在这些op里面有op被匹配到了这两种类型, 就可以使用这个索引. 如果没有则使用seq scan.
例如gist__int_ops
这个ops
支持哪些op和func?
CREATE OPERATOR CLASS gist__int_ops DEFAULT FOR TYPE _int4 USING gist AS OPERATOR 3 &&, OPERATOR 6 = (anyarray, anyarray), OPERATOR 7 @>, OPERATOR 8 <@, OPERATOR 20 @@ (_int4, query_int), FUNCTION 1 g_int_consistent (internal, _int4, smallint, oid, internal), FUNCTION 2 g_int_union (internal, internal), FUNCTION 3 g_int_compress (internal), FUNCTION 4 g_int_decompress (internal), FUNCTION 5 g_int_penalty (internal, internal, internal), FUNCTION 6 g_int_picksplit (internal, internal), FUNCTION 7 g_int_same (_int4, _int4, internal);
复制
https://www.postgresql.org/docs/14/sql-createopclass.html
https://www.postgresql.org/docs/14/xindex.html#XINDEX-OPCLASS-DEPENDENCIES
strategy_number The index method's strategy number for an operator associated with the operator class. support_number The index method's support function number for a function associated with the operator class.
复制
除了人为的将变量类型转换为与字段一致的类型, 还可以:
- 修改(扩展)索引的ops定义, 增加ops跨类型的operator支持.
- 增加变量类型到目标类型(能用到索引的目标类型都可以)的隐式转换cast
https://www.postgresql.org/docs/14/xindex.html#XINDEX-OPCLASS-DEPENDENCIES
ALTER OPERATOR FAMILY integer_ops USING btree ADD -- cross-type comparisons int8 vs int2 OPERATOR 1 < (int8, int2) , OPERATOR 2 <= (int8, int2) , OPERATOR 3 = (int8, int2) , OPERATOR 4 >= (int8, int2) , OPERATOR 5 > (int8, int2) , FUNCTION 1 btint82cmp(int8, int2) , -- cross-type comparisons int8 vs int4 OPERATOR 1 < (int8, int4) , OPERATOR 2 <= (int8, int4) , OPERATOR 3 = (int8, int4) , OPERATOR 4 >= (int8, int4) , OPERATOR 5 > (int8, int4) , FUNCTION 1 btint84cmp(int8, int4) , -- cross-type comparisons int4 vs int2 OPERATOR 1 < (int4, int2) , OPERATOR 2 <= (int4, int2) , OPERATOR 3 = (int4, int2) , OPERATOR 4 >= (int4, int2) , OPERATOR 5 > (int4, int2) , FUNCTION 1 btint42cmp(int4, int2) , -- cross-type comparisons int4 vs int8 OPERATOR 1 < (int4, int8) , OPERATOR 2 <= (int4, int8) , OPERATOR 3 = (int4, int8) , OPERATOR 4 >= (int4, int8) , OPERATOR 5 > (int4, int8) , FUNCTION 1 btint48cmp(int4, int8) , -- cross-type comparisons int2 vs int8 OPERATOR 1 < (int2, int8) , OPERATOR 2 <= (int2, int8) , OPERATOR 3 = (int2, int8) , OPERATOR 4 >= (int2, int8) , OPERATOR 5 > (int2, int8) , FUNCTION 1 btint28cmp(int2, int8) , -- cross-type comparisons int2 vs int4 OPERATOR 1 < (int2, int4) , OPERATOR 2 <= (int2, int4) , OPERATOR 3 = (int2, int4) , OPERATOR 4 >= (int2, int4) , OPERATOR 5 > (int2, int4) , FUNCTION 1 btint24cmp(int2, int4) , -- cross-type in_range functions FUNCTION 3 in_range(int4, int4, int8, boolean, boolean) , FUNCTION 3 in_range(int4, int4, int2, boolean, boolean) , FUNCTION 3 in_range(int2, int2, int8, boolean, boolean) , FUNCTION 3 in_range(int2, int2, int4, boolean, boolean) ;
复制
期望 PostgreSQL 增加什么功能?
PolarDB for PostgreSQL云原生分布式开源数据库
PostgreSQL 解决方案集合
德哥 / digoal's github - 公益是一辈子的事.