暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

MySQL:压缩、索引和两个冒烟的桶

原创 谭磊Terry 恩墨学院 2022-07-23
255

大家好,我们都知道 MySQL 中表压缩的好处,以目前的技术、处理速度和压缩磁盘,在大多数情况下,这是一个好处。

现在,如果没有发生,会发生什么?

在这个简单的博客中,我将向您展示当您尝试从压缩表转换为普通表时发生的一些有趣的事情。我们不打算详细讨论或讨论其他任何事情,例如:为什么压缩表会很糟糕?什么是性能、系统性能,甚至不是压缩所需的硬件资源损失?我们也不打算讨论内存中的页面是压缩的、未压缩的,还是两者兼而有之。这已经讨论得够多了,您可以在我们的网站甚至文档本身中找到一些非常有趣的博客。

我要向您展示的是非常好奇的东西,为此,我们将从头到尾进行练习,以便您可以自己完成(好吧,除了数据加载部分,嘿!),我将给出一个示例脚本(稍后)。

让我们在安装了 Percona Server for MySQL 5.7 的数据库实例中创建表。

CentOS7-S1 PS_57> CREATE TABLE test_compress (
    id bigint(20) unsigned NOT NULL,
    identification_id int(10) unsigned DEFAULT NULL,
    timestamp datetime NOT NULL,
    action varchar(50) NOT NULL,
    result varchar(50) NOT NULL,
    PRIMARY KEY (id),
    KEY INDEX_test_compress_result (result),
    KEY INDEX_test_compress_timestamp (timestamp)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.02 sec)

之后,让我们将一些数据(100,000 行)加载到表中。

CentOS7-S1 $ for NUM in {1..100000}; do mysql -h localhost PS_57 -e "insert into test_compress (id, identification_id, timestamp, action, result) values ($NUM,$NUM*100,now(),concat('string',$NUM),concat('VeryVeryLargeString',$NUM))"; done

让我们验证表的大小(之前使用 innodb_stats_persistent_sample_pages=100000 进行 ANALYZE TABLE 以使统计信息尽可能真实)。

CentOS7-S1 PS_57> set global innodb_stats_persistent_sample_pages=100000; Query OK, 0 rows affected (0.00 sec)
CentOS7-S1 PS_57> analyze table test_compress;
+------------------------+---------+----------+----------+
| Table                  | Op      | Msg_type | Msg_text |
+------------------------+---------+----------+----------+
| PS_57.test_compress    | analyze | status   | OK       |
+------------------------+---------+----------+----------+
Query OK, 0 rows affected (0.00 sec)
CentOS7-S1 PS_57> select table_schema, table_name, table_rows,  round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB, create_options from information_schema.tables where table_name='test_compress';
+--------------+---------------+------------+----------+----------------+
| table_schema | table_name    | table_rows | TOTAL_MB | create_options |
+--------------+---------------+------------+----------+----------------+
| PS_57        | test_compress |     100000 |       37 |                |
+--------------+---------------+------------+----------+----------------+

接下来,我们将使用 KEY_BLOCK_SIZE=4 压缩表(这个大小是任意选择的,在任何时候都没有指示或决定它是否是最佳值,实际上它不是)。

CentOS7-S1 PS_57> ALTER TABLE test_compress ROW_FORMAT=COMPRESSED,KEY_BLOCK_SIZE=4,ALGORITHM=INPLACE,LOCK=NONE;
Query OK, 0 rows affected (3.33 sec)

我们再次验证表的大小(之前使用 innodb_stats_persistent_sample_pages=100000 执行 ANALYZE TABLE,以便统计信息尽可能真实)。

CentOS7-S1 PS_57> set global innodb_stats_persistent_sample_pages=100000;
Query OK, 0 rows affected (0.00 sec)
CentOS7-S1 PS_57> analyze table test_compress;
+------------------------+---------+----------+----------+
| Table                  | Op      | Msg_type | Msg_text |
+------------------------+---------+----------+----------+
| PS_57.test_compress    | analyze | status   | OK       |
+------------------------+---------+----------+----------+
Query OK, 0 rows affected (0.00 sec)
CentOS7-S1 PS_57> select table_schema, table_name, table_rows,  round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB, create_options from information_schema.tables where table_name='test_compress';
+--------------+---------------+------------+----------+----------------------------------------+
| table_schema | table_name    | table_rows | TOTAL_MB | create_options                         |
+--------------+---------------+------------+----------+----------------------------------------+
| PS_57        | test_compress |     100000 |       19 | row_format=COMPRESSED KEY_BLOCK_SIZE=4 |
+--------------+---------------+------------+----------+----------------------------------------+

可以看到,表已经被压缩了,我们来看看它的结构。

CentOS7-S1 PS_57> show create table test_compress\G
*************************** 1. row ***************************
       Table: test_compress
Create Table: CREATE TABLE `test_compress` (
  `id` bigint(20) unsigned NOT NULL,
  `identification_id` int(10) unsigned DEFAULT NULL,
  `timestamp` datetime NOT NULL,
  `action` varchar(50) NOT NULL,
  `result` varchar(50) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `INDEX_test_compress_result` (`result`),
  KEY `INDEX_test_compress_timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4
1 row in set (0.00 sec)

我们如何扭转压缩?好问题。应该很容易吧?

(澄清一下,在你问之前,我们不会根据表的大小来判断应该如何完成,这个 ALTER 会对复制产生什么影响,以及其他事情,这个想法是展示如何反转它而不是什么别的。)

我们可以使用这个命令,让我们看看:

CentOS7-S1 PS_57> ALTER TABLE test_compress ROW_FORMAT=DEFAULT,ALGORITHM=INPLACE,LOCK=NONE;
Query OK, 0 rows affected (6.25 sec)
Records: 0  Duplicates: 0  Warnings: 0

它似乎奏效了!解压它:

CentOS7-S1 PS_57> select table_schema, table_name, table_rows,  round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB, create_options from information_schema.tables where table_name='test_compress';
+--------------+---------------+------------+----------+--------------------+
| table_schema | table_name    | table_rows | TOTAL_MB | create_options     |
+--------------+---------------+------------+----------+--------------------+
| PS_57        | test_compress |     100000 |       25 | KEY_BLOCK_SIZE=4   |
+--------------+---------------+------------+----------+--------------------+

更好的检查:

CentOS7-S1 PS_57> show create table test_compress\G
*************************** 1. row ***************************
       Table: test_compress
Create Table: CREATE TABLE `test_compress` (
  `id` bigint(20) unsigned NOT NULL,
  `identification_id` int(10) unsigned DEFAULT NULL,
  `timestamp` datetime NOT NULL,
  `action` varchar(50) NOT NULL,
  `result` varchar(50) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `INDEX_test_compress_result` (`result`),
  KEY `INDEX_test_compress_timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 KEY_BLOCK_SIZE=4

有什么不对劲!KEY_BLOCK_SIZE 仍然是 4。

第二次尝试:

CentOS7-S1 PS_57> ALTER TABLE test_compress ROW_FORMAT=DEFAULT,KEY_BLOCK_SIZE=0,ALGORITHM=INPLACE,LOCK=NONE;
Query OK, 0 rows affected (2.05 sec)
Records: 0  Duplicates: 0  Warnings: 0
CentOS7-S1 PS_57> select table_schema, table_name, table_rows,  round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB, create_options from information_schema.tables where table_name='test_compress';
+--------------+---------------+------------+----------+--------------------+
| table_schema | table_name    | table_rows | TOTAL_MB | create_options     |
+--------------+---------------+------------+----------+--------------------+
| PS_57        | test_compress |     100000 |       25 |                    |
+--------------+---------------+------------+----------+--------------------+

更好的检查:

CentOS7-S1 PS_57> show create table test_compress\G
*************************** 1. row ***************************
       Table: test_compress
Create Table: CREATE TABLE `test_compress` (
  `id` bigint(20) unsigned NOT NULL,
  `identification_id` int(10) unsigned DEFAULT NULL,
  `timestamp` datetime NOT NULL,
  `action` varchar(50) NOT NULL,
  `result` varchar(50) NOT NULL,
  PRIMARY KEY (`id`) KEY_BLOCK_SIZE=4,
  KEY `INDEX_test_compress_result` (`result`) KEY_BLOCK_SIZE=4,
  KEY `INDEX_test_compress_timestamp` (`timestamp`) KEY_BLOCK_SIZE=4
) ENGINE=InnoDB DEFAULT CHARSET=latin1

有什么不对劲!主键和二级索引都继续显示 KEY_BLOCK_SIZE=4。

尽管当表从压缩转换为未压缩时,索引的 KEY_BLOCK_SIZE 在内部会遵循表的索引,但 CREATE TABLE 语句不会。起初这将是一个美学/外观问题,但是当您进行转储时这是一个真正的问题,因为 CREATE TABLE 留下了 KEY_BLOCK_SIZE 值,这不好。这是mysqldump的输出:

$ mysqldump -h localhost PS_57 test_compress --no-data > test_compress.sql
$ cat test_compress.sql
...
--
-- Table structure for table `test_compress`
--

DROP TABLE IF EXISTS `test_compress`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `test_compress` (
  `id` bigint(20) unsigned NOT NULL,
  `identification_id` int(10) unsigned DEFAULT NULL,
  `timestamp` datetime NOT NULL,
  `action` varchar(50) NOT NULL,
  `result` varchar(50) NOT NULL,
  PRIMARY KEY (`id`) KEY_BLOCK_SIZE=4,
  KEY `INDEX_test_compress_result` (`result`) KEY_BLOCK_SIZE=4,
  KEY `INDEX_test_compress_timestamp` (`timestamp`) KEY_BLOCK_SIZE=4
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

如您所见,似乎没有办法使用全局 ALTER TABLE 命令(如果我们可以这样称呼它)来反转表定义中的 KEY_BLOCK_SIZE 索引,因此我们将进行最后一次尝试:

ALTER TABLE test_compress 
DROP PRIMARY KEY, add PRIMARY KEY (id), 
DROP key INDEX_test_compress_result, add key INDEX_test_compress_result (result), 
DROP key INDEX_test_compress_timestamp, add key INDEX_test_compress_timestamp (timestamp),
ROW_FORMAT=DEFAULT,KEY_BLOCK_SIZE=0,ALGORITHM=INPLACE,LOCK=NONE;

现在,它在没有 KEY_BLOCK_SIZE 的情况下具有正确的定义:

CentOS7-S1 PS_57> show create table test_compress\G
*************************** 1. row ***************************
       Table: test_compress
Create Table: CREATE TABLE `test_compress` (
  `id` bigint(20) unsigned NOT NULL,
  `identification_id` int(10) unsigned DEFAULT NULL,
  `timestamp` datetime NOT NULL,
  `action` varchar(50) NOT NULL,
  `result` varchar(50) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `INDEX_test_compress_result` (`result`),
  KEY `INDEX_test_compress_timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
CentOS7-S1 PS_57> select table_schema, table_name, table_rows,  round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB, create_options from information_schema.tables where table_name='test_compress';
+--------------+---------------+------------+----------+----------------+
| table_schema | table_name    | table_rows | TOTAL_MB | create_options |
+--------------+---------------+------------+----------+----------------+
| PS_57        | test_compress |     100000 |       25 |                |
+--------------+---------------+------------+----------+----------------+

显然,MariaDB 中存在与该案例相关的错误:

https://jira.mariadb.org/browse/MDEV-26400

https://jira.mariadb.org/browse/MDEV-11757

MySQL中的一个类似的:

https://bugs.mysql.com/bug.php?id=56628

在 MySQL 8 中,情况如下:

CentOS7-S2 PS_8> select table_schema, table_name, table_rows,  round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB, create_options from information_schema.tables where table_name='test_compress';
+--------------+---------------+------------+----------+----------------+
| TABLE_SCHEMA | TABLE_NAME    | TABLE_ROWS | TOTAL_MB | CREATE_OPTIONS |
+--------------+---------------+------------+----------+----------------+
| PS_8         | test_compress |      31000 |       15 |                |
+--------------+---------------+------------+----------+----------------+

让我们执行 ALTER 来压缩表:

CentOS7-S2 PS_8> alter table test_compress ROW_FORMAT=COMPRESSED,KEY_BLOCK_SIZE=4,ALGORITHM=INPLACE,LOCK=NONE;
Query OK, 0 rows affected (4.54 sec)
Records: 0  Duplicates: 0  Warnings: 0

让我们再次检查:

CentOS7-S2 PS_8> analyze table test_compress;
+-----------------------+---------+----------+----------+
| Table                 | Op      | Msg_type | Msg_text |
+-----------------------+---------+----------+----------+
| PS_8.test_compress    | analyze | status   | OK       |
+-----------------------+---------+----------+----------+
1 row in set (0.07 sec)
CentOS7-S2 PS_8> select table_schema, table_name, table_rows,  round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB, create_options from information_schema.tables where table_name='test_compress';
+--------------+---------------+------------+----------+----------------------------------------+
| TABLE_SCHEMA | TABLE_NAME    | TABLE_ROWS | TOTAL_MB | CREATE_OPTIONS                         |
+--------------+---------------+------------+----------+----------------------------------------+
| PS_8         | test_compress |     100000 |       19 | row_format=COMPRESSED KEY_BLOCK_SIZE=4 |
+--------------+---------------+------------+----------+----------------------------------------+
CentOS7-S2 PS_8> show create table test_compress\G
*************************** 1. row ***************************
       Table: test_compress
Create Table: CREATE TABLE `test_compress` (
  `id` bigint unsigned NOT NULL,
  `identification_id` int unsigned DEFAULT NULL,
  `timestamp` datetime NOT NULL,
 `action` varchar(50) NOT NULL,
  `result` varchar(50) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `INDEX_test_compress_result` (`result`),
  KEY `INDEX_test_compress_timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4
1 row in set (0.01 sec)

到目前为止,一切都与 MySQL 5.7 中的相同:KEY_BLOCK_SIZE 保留在整个表的定义中,但不保留在索引中。

幸运的是,通过运行这个命令,我们有效地扭转了一切:

CentOS7-S2 PS_8> alter table test_compress ROW_FORMAT=DEFAULT, KEY_BLOCK_SIZE=0,ALGORITHM=INPLACE,LOCK=NONE;
Query OK, 0 rows affected (2.56 sec)
Records: 0  Duplicates: 0  Warnings: 0
CentOS7-S2 PS_8> show create table test_compress\G
*************************** 1. row ***************************
       Table: test_compress
Create Table: CREATE TABLE `test_compress` (
  `id` bigint unsigned NOT NULL,
  `identification_id` int unsigned DEFAULT NULL,
  `timestamp` datetime NOT NULL,
  `action` varchar(50) NOT NULL,
  `result` varchar(50) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `INDEX_test_compress_result` (`result`),
  KEY `INDEX_test_compress_timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
CentOS7-S2 PS_8> select table_schema, table_name, table_rows,  round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB, create_options from information_schema.tables where table_name='test_compress';
+--------------+---------------+------------+----------+----------------+
| TABLE_SCHEMA | TABLE_NAME    | TABLE_ROWS | TOTAL_MB | CREATE_OPTIONS |
+--------------+---------------+------------+----------+----------------+
| PS_8         | test_compress |     100000 |       25 |                |
+--------------+---------------+------------+----------+----------------+

与往常一样,我们建议在生产环境中运行之前在开发环境中进行初步测试。

结论

在 MySQL 5.7 中,完全还原的唯一方法(至少在表及其索引的定义中)是重新生成主键及其所有索引。这听起来像是一个终端解决方案,但如果您使用 mysqldump 进行备份(我们总是建议使用Percona XtraBackup来实现这些目的,它会更快更高效),这是一个需要考虑的问题,因为它在定义中保留了那些错误的定义。幸运的是,这在 MySQL 8 中已修复。

很有趣,不是吗?

原文标题:MySQL: Compression, Indexes, and Two Smoking Barrels
原文作者:Fernando Mattera
原文地址:https://www.percona.com/blog/mysql-compression-indexes-and-two-smoking-barrels/

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论