背景
MySQL的数据存储在表空间中,有的时候我们删除了一张表的一半的数据,但是发现表空间文件的大小并没有减少,这是什么原因呢?
MySQL数据空洞
当对一条数据执行delete
操作时,MySQL将数据删除后,并未将数据占用的空间返还给操作系统,而是将当前空间标记为"可复用",当有新的数据插入时,则不会重新申请空间,而是插入到"可复用"空间中,这种"可复用"空间,称之为数据空洞。
MySQL官方文档对此的解释如下:
After deleting a large part of a MyISAM or ARCHIVE table, or making many changes to a MyISAM or ARCHIVE table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns).
Deleted rows are maintained in a linked list and subsequent INSERT operations reuse old row positions.
数据空洞的好处是尽可能的复用表空间结构,带来的问题也是显而易见,当删除数据后,表空间并未及时的释放,当长时间没有新的数据填充,会造成空间浪费的情况。
OPTIMIZE TABLE命令
MySQL提供了命令OPTIMIZE TABLE
来整理表空间结构,来清理释放未使用的表空间结构,以此提升表的性能。
MySQL官方文档对此的解释如下:
You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data file. After extensive changes to a table, this statement may also improve performance of statements that use the table, sometimes significantly.
OPTIMIZE TABLE works for InnoDB, MyISAM, and ARCHIVE tables.
OPTIMIZE TABLE is also supported for dynamic columns of in-memory NDB tables. It does not work for fixed-width columns of in-memory tables, nor does it work for Disk Data tables.
对于InnoDB,OPTIMIZE TABLE
等价于ALTER TABLE...FORCE
,相当于对数据表的聚簇索引进行重建,清理未使用的空间。
下面来看一下执行OPTIMIZE TABLE
的效果:
mysql> OPTIMIZE TABLE foo;
+----------+----------+----------+-------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+----------+----------+----------+-------------------------------------------------------------------+
| test.foo | optimize | note | Table does not support optimize, doing recreate + analyze instead |
| test.foo | optimize | status | OK |
+----------+----------+----------+-------------------------------------------------------------------+
OPTIMIZE TABLE背后细节
1、OPTIMIZE TABLE
使用online DDL
机制,以此来降低并发DML操作场景下的停顿时间。
2、OPTIMIZE TABLE
操作会在prepare
和commit
两个阶段对表加锁。
3、在prepare
阶段,会对数据表的元数据进行更新,并创建临时表。
4、在commit
阶段,数据表的元数据更新会进行提交。
5、online DDL
机制不支持InnoDB
引擎的FULLTEXT
索引,如果使用FULLTEXT
索引,那么将会变为表数据拷贝的方式进行重建。
MySQL官方文档对此的解释如下:
OPTIMIZE TABLE uses online DDL for regular and partitioned InnoDB tables,
which reduces downtime for concurrent DML operations.
The table rebuild triggered by OPTIMIZE TABLE is completed in place.
An exclusive table lock is only taken briefly during the prepare phase and the commit phase of the operation.
During the prepare phase, metadata is updated and an intermediate table is created.
During the commit phase, table metadata changes are committed.
参考文档:
OPTIMIZE TABLE Statement:
https://dev.mysql.com/doc/refman/8.0/en/optimize-table.html
InnoDB and Online DDL
https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl.html
MYSQL 5.7 到底 OPTIMIZE Table 塞不塞 DML
https://cloud.tencent.com/developer/article/1606990