暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

OB 4.3.2 JSON 列类型和索引使用体验

164
MySQL 5.7.8 推出 JSON 类型,还有相应的 JSON 函数,MySQL 8.0.17 推出针对值数组的多值索引。这些功能非常实用,曾经是 OB 替换 MySQL 的一个障碍。OB 3.2.2 推出 JSON 类型以及 JSON 函数,4.3.2 推出多值索引。虽然晚了点,终究是到了。本文分享 OB 4.3.2 版本的 JSON 类型以及索引使用示例和经验。

OB 的 ORACLE 租户和 MySQL 租户都支持 JSON 类型,本文主要演示 MySQL 租户下的 JSON 使用。
首先创建一个含有 JSON 类型的表,示例参考 MySQL 官方文档中 JSON 示例表。
CREATE TABLE customers (
  id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  modified DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  custinfo JSON
 );
 
 INSERT INTO customers VALUES
 (NULLNOW(), '{"user":"Jack","user_id":37,"zipcode":[94582,94536]}'),
 (NULLNOW(), '{"user":"Jill","user_id":22,"zipcode":[94568,94507,94582]}'),
 (NULLNOW(), '{"user":"Bob","user_id":31,"zipcode":[94477,94507]}'),
 (NULLNOW(), '{"user":"Mary","user_id":72,"zipcode":[94536]}'),
 (NULLNOW(), '{"user":"Ted","user_id":56,"zipcode":[94507,94582]}')
;

select * from customers;

JSON列示例
JSON 列不支持直接索引,如果想对 JSON 列部分字段做索引,可以新增一个虚拟列,然后针对这个虚拟列创建索引。
ALTER TABLE customers ADD v_user varchar(20GENERATED ALWAYS AS (json_unquote(json_extract (`custinfo`, _utf8mb4'$.user'))) virtual ;
ALTER TABLE customers ADD KEY idx_user(v_user);

虚拟列示例
再看看查询虚拟列的执行计划。
mysql> explain SELECT * FROM customers WHERE v_user='Bob';
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan                                                                                                                                                                         |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ===============================================================                                                                                                                    |
| |ID|OPERATOR        |NAME               |EST.ROWS|EST.TIME(us)|                                                                                                                    |
---------------------------------------------------------------                                                                                                                    |
| |0 |TABLE RANGE SCAN|customers(idx_user)|1       |7           |                                                                                                                    |
| ===============================================================                                                                                                                    |
| Outputs & filters:                                                                                                                                                                 |
-------------------------------------                                                                                                                                              |
|   0 - output([customers.id], [customers.modified], [customers.custinfo], [column_conv(VARCHAR,utf8mb4_general_ci,length:20,NULL,cast(json_unquote(json_extract(customers.custinfo, |
|        '$.user')), VARCHAR(1048576)))]), filter(nil)                                                                                                                               |
|       access([customers.id], [customers.custinfo], [customers.v_user], [customers.modified]), partitions(p0)                                                                       |
|       is_index_back=true, is_global_index=false,                                                                                                                                   |
|       range_key([customers.v_user], [customers.id]), range(Bob,MIN ; Bob,MAX),                                                                                                     |
|       range_cond([customers.v_user = 'Bob'])                                                                                                                                       |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
13 rows in set1 warning (0.02 sec)


上面方法如果要对 JSON 字段 zipcode
做索引就有点困难,这个列的值是个数组。这就用到新推出的多值索引功能。
先看查询场景 SQL 。
EXPLAIN SELECT * FROM customers WHERE 94507 MEMBER OF(custinfo->'$.zipcode');
EXPLAIN SELECT * FROM customers WHERE JSON_CONTAINS(custinfo->'$.zipcode'CAST('[94507,94582]' AS JSON));
EXPLAIN SELECT * FROM customers WHERE JSON_OVERLAPS(custinfo->'$.zipcode'CAST('[94507,94582]' AS JSON));

这三个 SQL 使用了常见的 JSON 函数,执行计划全部是全表扫描就不发了。应对方法就是新增多值索引。
mysql> ALTER TABLE customers ADD INDEX zips( (CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)) );
ERROR 1235 (0A000): dynamic add multivalue index not supported yet not supported
mysql> 

遗憾的是由于 OB 4.3.2 是刚支持多值索引,目前还只实现在建表的时候创建多值索引,暂不支持后期动态添加多值索引。
所以我们再创建一个带多值索引的新表看看。
create table customers2(
  id bigint not null auto_increment  ,
  modified datetime default current_timestamp on update current_timestamp,
  custinfo json,
  index zips((cast(custinfo->'$.zipcode' as unsigned array))),
  INDEX comp(id, modified,(cast(custinfo->'$.zipcode' as unsigned array)))
);

INSERT INTO customers2 SELECT * FROM customers;

为了减少篇幅,我这里一次性创建两类多值索引。一个是针对单列的多值索引,一个是多列组合索引。最后一个组合索引中,只能引用一个多值列,并且该列的定义顺序不影响组合索引的使用。
mysql> EXPLAIN SELECT * FROM customers2 WHERE 94507 MEMBER OF(custinfo->'$.zipcode');                                                                                                                                                                                       
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+                                                                                                            
| Query Plan                                                                                                                                                   |                                                                                                            
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+                                                                                                            
| ===========================================================                                                                                                  |                                                                                                            
| |ID|OPERATOR       |NAME            |EST.ROWS|EST.TIME(us)|                                                                                                  |                                                                                                            
-----------------------------------------------------------                                                                                                  |                                                                                                            
| |0 |TABLE FULL SCAN|customers2(zips)|2       |13          |                                                                                                  |                                                                                                            
| ===========================================================                                                                                                  |                                                                                                            
| Outputs & filters:                                                                                                                                           |                                                                                                            
-------------------------------------                                                                                                                        |                                                                                                            
|   0 - output([customers2.id], [customers2.modified], [customers2.custinfo]), filter([JSON_MEMBER_OF(94507, JSON_EXTRACT(customers2.custinfo, '$.zipcode'))]) |                                                                                                            
|       access([customers2.__pk_increment], [customers2.custinfo], [customers2.id], [customers2.modified]), partitions(p0)                                     |                                                                                                            
|       is_index_back=true, is_global_index=false, filter_before_indexback[false],                                                                             |                                                                                                            
|       range_key([customers2.SYS_NC_mvi_19], [customers2.__pk_increment], [customers2.__doc_id_1727685398954214]), range(94507,MIN,MIN ; 94507,MAX,MAX)       |                                                                                                            
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+                                                                                                            
11 rows in set (0.00 sec)                                                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                            
mysql> EXPLAIN SELECT * FROM customers2 WHERE JSON_CONTAINS(custinfo->'$.zipcode'CAST('[94507,94582]' AS JSON));                                                                                                                                                          
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+                                                                                                        
| Query Plan                                                                                                                                                       |                                                                                                        
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+                                                                                                        
| ===========================================================                                                                                                      |                                                                                                        
| |ID|OPERATOR       |NAME            |EST.ROWS|EST.TIME(us)|                                                                                                      |                                                                                                        
-----------------------------------------------------------                                                                                                      |                                                                                                        
| |0 |TABLE FULL SCAN|customers2(zips)|3       |23          |                                                                                                      |                                                                                                        
| ===========================================================                                                                                                      |                                                                                                        
| Outputs & filters:                                                                                                                                               |                                                                                                        
-------------------------------------                                                                                                                            |
|   0 - output([customers2.id], [customers2.modified], [customers2.custinfo]), filter([JSON_CONTAINS(JSON_EXTRACT(customers2.custinfo, '$.zipcode'), cast('[94507, |
|       94582]', JSON(536870911)))])                                                                                                                               |
|       access([customers2.__pk_increment], [customers2.custinfo], [customers2.id], [customers2.modified]), partitions(p0)                                         |
|       is_index_back=true, is_global_index=false, filter_before_indexback[false],                                                                                 |
|       range_key([customers2.SYS_NC_mvi_19], [customers2.__pk_increment], [customers2.__doc_id_1727685398954214]), range(94507,MIN,MIN ; 94507,MAX,MAX),          |
|       (94582,MIN,MIN ; 94582,MAX,MAX)                                                                                                                            |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
13 rows in set (0.00 sec)
mysql> EXPLAIN SELECT * FROM customers2 WHERE JSON_OVERLAPS(custinfo->'$.zipcode'CAST('[94507,94582]' AS JSON));
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan                                                                                                                                                       |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ===========================================================                                                                                                      |
| |ID|OPERATOR       |NAME            |EST.ROWS|EST.TIME(us)|                                                                                                      |
-----------------------------------------------------------                                                                                                      |
| |0 |TABLE FULL SCAN|customers2(zips)|3       |23          |                                                                                                      |
| ===========================================================                                                                                                      |
| Outputs & filters:                                                                                                                                               |
-------------------------------------                                                                                                                            |
|   0 - output([customers2.id], [customers2.modified], [customers2.custinfo]), filter([JSON_OVERLAPS(JSON_EXTRACT(customers2.custinfo, '$.zipcode'), cast('[94507, |
|       94582]', JSON(536870911)))])                                                                                                                               |
|       access([customers2.__pk_increment], [customers2.custinfo], [customers2.id], [customers2.modified]), partitions(p0)                                         |
|       is_index_back=true, is_global_index=false, filter_before_indexback[false],                                                                                 |
|       range_key([customers2.SYS_NC_mvi_19], [customers2.__pk_increment], [customers2.__doc_id_1727685398954214]), range(94507,MIN,MIN ; 94507,MAX,MAX),          |
|       (94582,MIN,MIN ; 94582,MAX,MAX)                                                                                                                            |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
13 rows in set (0.00 sec)

mysql> EXPLAIN SELECT * FROM customers2 WHERE id = 23 and modified = 103 and 94507 MEMBER OF(custinfo->'$.zipcode');
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan                                                                                                                                                   |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ============================================================                                                                                                 |
| |ID|OPERATOR        |NAME            |EST.ROWS|EST.TIME(us)|                                                                                                 |
------------------------------------------------------------                                                                                                 |
| |0 |TABLE RANGE SCAN|customers2(comp)|1       |7           |                                                                                                 |
| ============================================================                                                                                                 |
| Outputs & filters:                                                                                                                                           |
-------------------------------------                                                                                                                        |
|   0 - output([customers2.id], [customers2.modified], [customers2.custinfo]), filter([JSON_MEMBER_OF(94507, JSON_EXTRACT(customers2.custinfo, '$.zipcode'))]) |
|       access([customers2.__pk_increment], [customers2.custinfo], [customers2.id], [customers2.modified]), partitions(p0)                                     |
|       is_index_back=true, is_global_index=false, filter_before_indexback[false],                                                                             |
|       range_key([customers2.id], [customers2.modified], [customers2.SYS_NC_mvi_21], [customers2.__pk_increment], [customers2.__doc_id_1727685398954214]),    |
|        range(23,2000-01-03 00:00:00.000000,MIN,MIN,MIN ; 23,2000-01-03 00:00:00.000000,MAX,MAX,MAX),                                                         |
|       range_cond([customers2.id = 23], [customers2.modified = INTERNAL_FUNCTION(103, 110, 17)])                                                              |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
13 rows in set (0.00 sec)

这里直接将 4 种使用多值索引的查询场景 SQL 以及执行计划放出,这执行计划中的细节大家可以一一详细查看。
多值索引的结构如下。
mysql> show indexes from customers2;
+------------+------------+----------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+--------------------------------------------------------------------------------------------------------------------------------+
| Table      | Non_unique | Key_name | Seq_in_index | Column_name               | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment   | Index_comment | Visible | Expression                                                                                                                     |
+------------+------------+----------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+--------------------------------------------------------------------------------------------------------------------------------+
| customers2 |          1 | zips     |            1 | SYS_NC_mvi_19             | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | json_query(`custinfo`,'$.zipcode' RETURNING unsigned WITHOUT ARRAY WRAPPER asis error on error null on empty null on mismatch) |
| customers2 |          1 | zips     |            2 | __pk_increment            | A         |        NULL | NULL     | NULL   |      | BTREE      | available |               | YES     | NULL                                                                                                                           |
| customers2 |          1 | zips     |            3 | __doc_id_1727685398954214 | A         |        NULL | NULL     | NULL   |      | BTREE      | available |               | YES     | NULL                                                                                                                           |
| customers2 |          1 | comp     |            1 | id                        | A         |        NULL | NULL     | NULL   |      | BTREE      | available |               | YES     | NULL                                                                                                                           |
| customers2 |          1 | comp     |            2 | modified                  | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | NULL                                                                                                                           |
| customers2 |          1 | comp     |            3 | SYS_NC_mvi_21             | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | json_query(`custinfo`,'$.zipcode' RETURNING unsigned WITHOUT ARRAY WRAPPER asis error on error null on empty null on mismatch) |
| customers2 |          1 | comp     |            4 | __pk_increment            | A         |        NULL | NULL     | NULL   |      | BTREE      | available |               | YES     | NULL                                                                                                                           |
| customers2 |          1 | comp     |            5 | __doc_id_1727685398954214 | A         |        NULL | NULL     | NULL   |      | BTREE      | available |               | YES     | NULL                                                                                                                           |
+------------+------------+----------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+--------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.00 sec)

还有一类特殊的场景是多值索引的唯一性索引。
CREATE TABLE customers3 (
 id BIGINT not null primary key,
 modified BIGINT not null,
 custinfo JSON,
 UNIQUE INDEX zips1( (CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)) )
);

INSERT INTO customers3 VALUES
 (1021'{"user":"Jack","user_id":37,"zipcode":[94582,94536]}');
 
mysql> select * from customers3;
+----+----------+------------------------------------------------------------+
| id | modified | custinfo                                                   |
+----+----------+------------------------------------------------------------+
| 10 |       21 | {"user": "Jack", "user_id": 37, "zipcode": [94582, 94536]} |
+----+----------+------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> INSERT INTO customers3 VALUES  (1122'{"user":"Jill","user_id":22,"zipcode":[94568,94507,94582]}');
ERROR 1062 (23000): Duplicate entry '94582' for key 'zips1'
mysql> 

mysql> show indexes from customers3;
+------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+--------------------------------------------------------------------------------------------------------------------------------+
| Table      | Non_unique | Key_name | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment   | Index_comment | Visible | Expression                                                                                                                     |
+------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+--------------------------------------------------------------------------------------------------------------------------------+
| customers3 |          0 | PRIMARY  |            1 | id            | A         |        NULL | NULL     | NULL   |      | BTREE      | available |               | YES     | NULL                                                                                                                           |
| customers3 |          0 | zips1    |            1 | SYS_NC_mvi_19 | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | json_query(`custinfo`,'$.zipcode' RETURNING unsigned WITHOUT ARRAY WRAPPER asis error on error null on empty null on mismatch) |
+------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+--------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

由此可见唯一性的多值索引能拦截导致多值数组中出现重复的值。
多值索引也有一些功能限制。
  • 每个多值索引只允许引用一个多值列。

  • 多值索引不能定义列的顺序 ASC 或 DESC,也不能用于消除排序,多值索引不能用于主键或外键。
EXPLAIN SELECT v_user FROM customers order by v_user;
EXPLAIN SELECT custinfo FROM customers2 order by custinfo;


更多阅读: 


文章转载自数据库技术闲谈,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论