暂无图片
暂无图片
3
暂无图片
暂无图片
2
暂无图片

[MYSQL] MY-012095 表空间ID不对

原创 大大刺猬 2024-09-13
486

导读

有时候需要使用mysql的数据文件做恢复, 或者其它某些异常导致mysqld启动的时候报错如下:

2024-09-13T02:20:33.489262Z 0 [Warning] [MY-013244] [Server] --collation-server: 'utf8_general_ci' is a collation of the deprecated character set UTF8MB3. Please consider using UTF8MB4 with an appropriate collation instead. 2024-09-13T02:20:33.495902Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started. 2024-09-13T02:20:33.832733Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended. 2024-09-13T02:20:33.919585Z 0 [ERROR] [MY-012095] [InnoDB] [FATAL] Tablespace id is 2 in the data dictionary but in file ./db1/t20240912.ibd it is 49591! 2024-09-13T02:20:33.919622Z 0 [ERROR] [MY-013183] [InnoDB] Assertion failure: fil0fil.cc:2582:ib::fatal triggered thread 140154526787328 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 02:20:33 UTC - mysqld got signal 6 ; Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware. Thread pointer: 0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0 thread_stack 0x100000 /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x1f9ffde] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(print_fatal_signal(int)+0x2eb) [0x103628b] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(my_server_abort()+0x5e) [0x103638e] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(my_abort()+0xa) [0x1f9a3aa] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0x30c) [0x225c4bc] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(ib::fatal::~fatal()+0x98) [0x225ee18] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(Fil_shard::get_file_size(fil_node_t*, bool)+0x3fe) [0x239df2e] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(Fil_shard::open_file(fil_node_t*)+0x294) [0x23ad124] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(Fil_shard::prepare_file_for_io(fil_node_t*)+0x33) [0x23adfd3] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(Fil_shard::do_io(IORequest const&, bool, page_id_t const&, page_size_t const&, unsigned long, unsigned long, void*, void*)+0x1a9) [0x23aeb69] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(fil_io(IORequest const&, bool, page_id_t const&, page_size_t const&, unsigned long, unsigned long, void*, void*)+0x57) [0x23af5f7] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(buf_read_page_low(dberr_t*, bool, unsigned long, unsigned long, page_id_t const&, page_size_t const&, bool)+0x13e) [0x22ed5be] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(buf_read_page_background(page_id_t const&, page_size_t const&, bool)+0x26) [0x22ed966] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld() [0x22d1503] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(buf_dump_thread()+0xe5) [0x22d1c55] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld(std::thread::_State_impl<std::thread::_Invoker<std::tuple<Detached_thread, void (*)()> > >::_M_run()+0xbc) [0x212061c] /soft/mysql_3306/mysqlbase/mysql/bin/mysqld() [0x279e1ef]
复制

分析

根据描述来看是 db1/t20240912.ibd 数据字典里面记录的Tablespace id是2, 但实际文件记录的是49591. 然后就挂了
image.png
既然是表空间id不一致, 那解决办法就至少有3种了.

1: 修改数据字典

2: 修改ibd文件里面的表空间id

3: 使用alter table xxx import tablespace的方式导入表

基于我们之前修改 lower_case_table_names 的经验来看, 修改ibd文件里面的信息更实际一点-_-

虽然之前讲过ibd的结构, 但这里再来回顾一下
image.png

也就是每页的第34-38字节 和 第1页的38-42 字节记录了表空间ID. 我们只需要修改这里就行.(记得同步校验字段)

验证

方法1(推荐)

我们先使用import tablespace的方式导入, 该方式需要先discard表空间, 所以记得先备份表空间文件, 不然discard了就没得了. 当然现在数据库都起不来了, 我们直接mv走就是了

mv /data/mysql_3306/mysqldata/db1/t20240912.ibd /tmp systemctl start mysqld_3306
复制

应该会在日志里面发现如下Warning, 但我们并不在意它

2024-09-13T06:23:30.463261Z 1 [Warning] [MY-012351] [InnoDB] Tablespace 2, name 'db1/t20240912', file './db1/t20240912.ibd' is missing!
复制

然后我们再拷贝回去并导入数据库

SHELL> mv /tmp/t20240912.ibd /data/mysql_3306/mysqldata/db1 SQL> alter table db1.t20240912 import tablespace;
复制

image.png

方法2

方法1看起来没毛病, 我们先使用脚本看下t20240912.ibd的表空间id是多少(2还是49591呢?) 脚本放在文末
image.png
好家伙, 居然给我修改为了2… 汽油磁力!

我们也来使用脚本来修改ibd文件里面的表空间id为2吧. (先回退快照,或者重新模拟下故障)
image.png

systemctl stop mysqld_3306 rm -rf /data/mysql_3306/mysqldata/db1/t20240912.ibd python mysql_replace_tablespaceid.py t20240912.ibd 2 /data/mysql_3306/mysqldata/db1/t20240912.ibd chown mysql:mysql /data/mysql_3306/mysqldata/db1/t20240912.ibd systemctl start mysqld_3306
复制

image.png

看起来没报错, 我们登录数据库验证下呢
image.png
getimgdata.gif

说:该ibd文件的事务比我们系统的事务要高… (使用旧的ibd文件在新环境恢复的事情被发现了). 官方竟然给我们来了这么一手. 汽油磁力!*2

error日志里面也是类似的信息(但更具体了):

2024-09-13T06:41:20.444460Z 8 [ERROR] [MY-011971] [InnoDB] Tablespace 'db1/t20240912' Page [page id: space=2, page number=4] log sequence number 113002135714 is in the future! Current system log sequence number 18335077. 2024-09-13T06:41:20.444507Z 8 [ERROR] [MY-011972] [InnoDB] Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html for information about forcing recovery. 2024-09-13T06:41:20.444523Z 8 [ERROR] [MY-011971] [InnoDB] Tablespace 'db1/t20240912' Page [page id: space=2, page number=1] log sequence number 113002135714 is in the future! Current system log sequence number 18335077. 2024-09-13T06:41:20.444533Z 8 [ERROR] [MY-011972] [InnoDB] Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html for information about forcing recovery.
复制

理论上通过拷贝redo,ibdata等文件是可以解决这个问题的. 但实际环境哪来那么多东西呢.(当然也可以使用ibd2sql之类的工具直接将ibd文件转为SQL语句). 也可以直接修改ibd文件里面的trx_id, 但这就太复杂了. 有没得简单点的方法呢???

按照官方的设置 innodb_force_recovery应该可以(毕竟可以不看redo了都).

或者就是修改系统的LSN(18335077)为大于等于我们当前值(113002135714). 毕竟ibdata也是page 比如:
image.png
那我们小试一波.

import struct filename = '/data/mysql_3306/mysqldata/ibdata1' filename2 = '/tmp/ibdata1' PAGE_SIZE = 16384 CURRENT_LSN = 18335077 LSN = 113002135714 + 100000 def create_crc32c_table(): poly = 0x82f63b78 table = [] for i in range(256): crc = i for _ in range(8): if crc & 1: crc = (crc >> 1) ^ poly else: crc >>= 1 table.append(crc) return table crc32_slice_table = create_crc32c_table() def calculate_crc32c(data): crc = 0xFFFFFFFF for byte in bytearray(data): # for PY2 crc = crc32_slice_table[(crc ^ byte) & 0xFF] ^ (crc >> 8) return crc ^ 0xFFFFFFFF def replace_crc32(data): c1 = calculate_crc32c(data[4:26]) c2 = calculate_crc32c(data[38:PAGE_SIZE-8]) cb = struct.pack('>L',(c1^c2)&(2**32-1)) data = cb + data[4:PAGE_SIZE-8] + cb + data[PAGE_SIZE-4:] return data def find_xx_positions(s,x): positions = [] xl = len(x) start = 0 while True: pos = s.find(x, start) if pos == -1: break positions.append(pos) start = pos + xl return positions LSN_OLD = struct.pack('>Q',CURRENT_LSN) LSN_NEW = struct.pack('>Q',LSN) f2 = open(filename2,'wb') with open(filename, 'rb') as f: while True: data = f.read(PAGE_SIZE) if data == b'': break for x in find_xx_positions(data,LSN_OLD): data = data[:x] + LSN_NEW + data[x+8:] data = replace_crc32(data) f2.write(data) f2.flush() f2.close()
复制

可惜, 不行. 虽然报错看起来像是坏块, 但检查了不是坏块, 这回应该是内部的数据存在校验了. 或者是ibdata1根本没记录LSN, lsn只记录在redo里面
image.png
image.png

LSN修改

既然改不了系统的, 那就还是来修改我们自己的ibd文件的LSN咯… 根据上面的图, 我们知道1个page里面有多个lsn
image.png
那我们直接将lsn修改为0吧. 再改改代码
getimgdata2.gif

修改之后查询数据,并查看日志信息如下: LSN问题确实没得了, 但trx问题还是没处理啊…
image.png
image.png

虽然我们也可以修改trx_id, 但这涉及到行数据的解析了, 比较麻烦. 还是老老实实的使用import tablespace吧.

总结

数据恢复的时候还是先使用官方的命令,(不但帮忙修改了tablespace id, 甚至还修改了lsn&trx?). 如果尝试手动修改tablespace信息的话, 还需要考虑LSNTRX ID之类的问题.老麻烦了
image.png

附源码

这里附的是修改ibd文件的tablespace id和lsn的源码, github上那个没得lsn的修改的.(也就两三行代码的差距)

#!/usr/bin/env python # -*- coding: utf-8 -*- # write by ddcw # innodb 表空间中的 space_id的替换 # 表空间id位于34-38 大端字节序 PAGE_SIZE = 16384 import sys,struct,os args = len(sys.argv) if args == 2: with open(sys.argv[1],'rb') as f: data = f.read(PAGE_SIZE) space_id = struct.unpack('>L',data[34:38])[0] msg = "TABLESPACE ID: " + str(space_id) + '\n' sys.stdout.write(msg) sys.exit() elif args == 4: filename = sys.argv[1] space_id = int(sys.argv[2]) filename2 = sys.argv[3] if not os.path.exists(filename): msg = filename + " IS NOT EXISTS.\n" sys.stdout.write(msg) sys.exit(1) elif int(os.stat(filename).st_size % PAGE_SIZE) != 0: msg = filename + " Maybe not mysql's ibd file\n" sys.stdout.write(msg) sys.exit(2) if os.path.exists(filename2): msg = filename2 + " IS EXISTS. Please rename it\n" sys.stdout.write(msg) sys.exit(3) else: msg = "\nExample: \npython " + sys.argv[0] + " test.ibd\npython " + sys.argv[0] + " test.ibd 123456 new_test.ibd\n\n" sys.stdout.write(msg) sys.exit(4) def create_crc32c_table(): poly = 0x82f63b78 table = [] for i in range(256): crc = i for _ in range(8): if crc & 1: crc = (crc >> 1) ^ poly else: crc >>= 1 table.append(crc) return table crc32_slice_table = create_crc32c_table() def calculate_crc32c(data): crc = 0xFFFFFFFF for byte in bytearray(data): # for PY2 crc = crc32_slice_table[(crc ^ byte) & 0xFF] ^ (crc >> 8) return crc ^ 0xFFFFFFFF def replace_lsn(data): return data[:16] + struct.pack('>Q',0) + data[24:PAGE_SIZE-4:] + struct.pack('>L',0) def replace_crc32(data): data = replace_lsn(data) c1 = calculate_crc32c(data[4:26]) c2 = calculate_crc32c(data[38:PAGE_SIZE-8]) cb = struct.pack('>L',(c1^c2)&(2**32-1)) data = cb + data[4:PAGE_SIZE-8] + cb + data[PAGE_SIZE-4:] return data f2 = open(filename2,'wb') SPACE_ID = struct.pack('>L',space_id) with open(filename, 'rb') as f: # FSP 38-42 (SPACE_HEADER:4 is SPACE ID) data = f.read(PAGE_SIZE) data = data[:34] + SPACE_ID + SPACE_ID + data[42:] data = replace_crc32(data) f2.write(data) while True: data = f.read(PAGE_SIZE) if data == b'': break if data[34:38] != b'\x00\x00\x00\x00': data = replace_crc32(data[:34] + SPACE_ID + data[38:]) f2.write(data) f2.flush() f2.close() msg = 'Write to filename: ' + filename2 + '\n' sys.stdout.write(msg)
复制
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

JZ
暂无图片
7月前
评论
暂无图片 0
黄海哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈还是黄海和糊糊涂涂 ,天兔h'hutututituihhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhq
7月前
暂无图片 点赞
评论
锁钥
暂无图片
7月前
评论
暂无图片 0
[MYSQL] MY-012095 表空间ID不对
7月前
暂无图片 点赞
评论