起因:
一个mysql研发测试环境,单机,版本mysql5.7.24。
一线人员发现mysql挂了,磁盘占用100%. 于是他删除了部分binlog,腾出2G空间,启动mysql,正常。
因为不知道哪些数据表的数据可以truncate,他打算回收一些大表delete后的空间。找到一个最大表占14G的,optimize table table_name;发现空间一直上涨,因为会生成临时文件#sql34433.sql。于是想移走几个表腾出空间让optimize完成再移回来(错误操作,这时候没有停止mysql,应用还在读写)。发现mv几个表ibd文件后,空间没有腾出,此时optimize导致磁盘再次占满,mysql中断。
一线人员执行了lsof | grep deleted | awk ‘{ print $2 }’ | xargs -i kill -9 \{\};。回收了刚才mv的空间。重启mysql,无法启动。查看错误日志
2021-10-12 11:25:00 0x2b633c1c6700 InnoDB: Assertion failure in thread 47705210251008 in file btr0pcur.cc line 454
InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no()
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
03:25:00 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=268435456
read_buffer_size=1048576
max_used_connections=414
max_threads=2000
thread_count=338
connection_count=337
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6432909 K bytes of memory
Hope that’s ok; if not, decrease some variables in the equation.
Thread pointer: 0x2b64300faf70
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong…
stack_bottom = 2b633c1c5ea8 thread_stack 0x40000
2021-10-12 11:25:00 0x2b633cbad700 InnoDB: Assertion failure in thread 47705220634368 in file btr0pcur.cc line 454
InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no()
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
2021-10-12 11:25:00 0x2b633c040700 InnoDB: Assertion failure in thread 47705208653568 in file btr0pcur.cc line 454
InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no()
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
救援:
- 设置innodb_force_recovery
在my.cnf里面设置innodb_force_recovery=1,2,3,4,5,6,直到6,启动mysql
innodb_force_recovery的参数说明:
1(SRV_FORCE_IGNORE_CORRUPT):忽略检查到的corrupt页。
2(SRV_FORCE_NO_BACKGROUND):阻止主线程的运行,如主线程需要执行full purge操作,会导致crash。
3(SRV_FORCE_NO_TRX_UNDO):不执行事务回滚操作。
4(SRV_FORCE_NO_IBUF_MERGE):不执行插入缓冲的合并操作。
5(SRV_FORCE_NO_UNDO_LOG_SCAN):不查看重做日志,InnoDB存储引擎会将未提交的事务视为已提交。**
6(SRV_FORCE_NO_LOG_REDO):不执行前滚的操作。
2.导出数据
1)全库导出
mysqldump -uroot -pxxxxxx --single-transaction --master-data=2 --set-gtid-purged=OFF -R -E --triggers --all-databases > /mysql/mysqlbackup/backup_20211012.sql
mysqldump: Error 2013: Lost connection to MySQL server during query when dumping
全库导出失败。
2)导出正常库
mysqldump -uroot -pxxxxxx --single-transaction --master-data=2 --set-gtid-purged=OFF -R -E --triggers -B sou_132 > /mysql/mysqlbackup/sou_132.sql
mysqldump -uroot -pxxxxxx --single-transaction --master-data=2 --set-gtid-purged=OFF -R -E --triggers -B mysql > /mysql/mysqlbackup/mysql.sql
3)导出异常库
导出表结构:
mysqldump -uroot -pxxxxxx -d gene --triggers --routines --events --skip-lock-tables --set-gtid-purged=off --single-transaction>gene_desc.sql
导出表结构和数据
mysqldump -uroot -pxxxxxx --single-transaction --master-data=2 --set-gtid-purged=OFF -R -E --triggers -B gene > /mysql/mysqlbackup/gene.sql
mysqldump: Error 2013: Lost connection to MySQL server during query when dumping
这个过程,遇到损坏表会中断,然后mv损坏表,再重新mysqldump,如此反复,直到完成。 最终包含的数据是不含损坏表数据。
3.备份ibd文件
4.清除innodb_force_recovery
vi my.cnf
5.初始化mysql实例
删除已存在数据和日志
mysqld --defaults-file=/etc/my.cnf --initialize --user=mysql
获取临时密码 cat mysqld.log|grep password
mysql -uroot -p临时密码
alter user ‘root’@‘localhost’ identified by 'xxx_123’;
flush privileges;
6.调整参数
为了加快导入速度,建议关闭双11,甚至关闭binlog日志(my.cnf注释log_bin)
修改前查参数值
show GLOBAL variables like ‘foreign_key_checks’;
show GLOBAL variables like ‘unique_checks’;
show GLOBAL variables like ‘innodb_flush_log_at_trx_commit’;
show GLOBAL variables like ‘sync_binlog’;
执行如下加速操作
SET GLOBAL foreign_key_checks=0;
SET GLOBAL unique_checks=0;
SET GLOBAL innodb_flush_log_at_trx_commit=2;
SET GLOBAL sync_binlog=1000;
7.导入数据
mysql -uroot -pxxxxxx <mysql.sql
mysql -uroot -pxxxxxx --force <sou_132.sql
mysql -uroot -pxxxxxx general --force <gene_desc.sql
mysql -uroot -pxxxxxx --force <gene.sql
mysql已正常,除了几个损坏表的数据,其他数据都导入完成。几个损坏的表结构也创建好,是空表。
8.修复数据
通过加载表空间,把损坏表的ibd加载
use gene;
alter table log_tab DISCARD TABLESPACE;
拷贝回.ibd文件,授权chown和chmod
alter table log_tab IMPORT TABLESPACE;
有的表数据,重新加载后,能select访问,有的表select出错,继续修复
repair table log_tab;
optimize table log_tab;
analyze table log_tab;
能完全修复,或者能select * from log_tab limit N;等大部分数据
有个别表经过上面步骤依旧无法恢复,最后通过开源工具undrop-for-innodb读取ibd数据,load file 导入。
知识点总结
1.innodb_force_recovery的运用
2.mysql的逻辑导出导入,mysqldump的运用
3.mysql实例的新建,初始化操作
4.mysql表空间的DISCARD和IMPORT 运用




