MySQL: Recover the InnoDB dictionary
Why do we need to recover the InnoDB dictionary
c_parser is a tool from the TwinDB recovery toolkit that reads InnoDB pages and fetches records out of them. Although it can scan any stream of bytes, recovery quality is higher when you feed c_parser pages that belong to the PRIMARY index of the table. Every InnoDB index has an identifier, a.k.a. index_id. The InnoDB dictionary stores the correspondence between table names and index_id values. That's reason number one.
Another reason: it is possible to recover the table structure from the InnoDB dictionary. When a table is dropped, MySQL deletes the respective .frm file. If you have neither backups nor the table schema, it becomes quite a challenge to recover the table structure. This topic, however, deserves a separate post, which I will write some other day.
Let’s assume you’re convinced enough and we can proceed with InnoDB dictionary recovery.
Compiling the TwinDB recovery toolkit
The source code of the toolkit is hosted on GitHub. You will need git to get the latest revision, so make sure you have it:
# which git
/usr/bin/git
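If git is missing, it is available in the standard distribution repositories. A hedged example for a yum-based host (which the root prompts in this post suggest); adjust for your package manager:
# yum install -y git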
Get the latest revision of the toolkit:
[root@twindb-dev tmp]# git clone https://github.com/twindb/undrop-for-innodb.git
[root@twindb-dev tmp]# cd undrop-for-innodb/
[root@twindb-dev undrop-for-innodb]# ll
total 136
-rw-r--r-- 1 root root 6271 Jun 24 00:41 check_data.c
-rw-r--r-- 1 root root 27516 Jun 24 00:41 c_parser.c
drwxr-xr-x 2 root root 4096 Jun 24 00:41 dictionary
drwxr-xr-x 2 root root 4096 Jun 24 00:41 include
-rw-r--r-- 1 root root 1203 Jun 24 00:41 Makefile
-rw-r--r-- 1 root root 15495 Jun 24 00:41 print_data.c
drwxr-xr-x 2 root root 4096 Jun 24 00:41 sakila
-rw-r--r-- 1 root root 5223 Jun 24 00:41 sql_parser.l
-rw-r--r-- 1 root root 21137 Jun 24 00:41 sql_parser.y
-rw-r--r-- 1 root root 22236 Jun 24 00:41 stream_parser.c
-rw-r--r-- 1 root root 2237 Jun 24 00:41 tables_dict.c
-rwxr-xr-x 1 root root 6069 Jun 24 00:41 test.sh
[root@twindb-dev undrop-for-innodb]#
As prerequisites we need gcc, flex and bison. Check that you have them:
[root@twindb-dev undrop-for-innodb]# which gcc
/usr/bin/gcc
[root@twindb-dev undrop-for-innodb]# which bison
/usr/bin/bison
[root@twindb-dev undrop-for-innodb]# which flex
/usr/bin/flex
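If any of them are missing, they can be installed from the standard repositories as well. Again a yum-based example; substitute your package manager as needed:
# yum install -y gcc flex bison make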
Good. Now let’s compile the code:
[root@twindb-dev undrop-for-innodb]# make
gcc -g -O3 -I./include -c stream_parser.c
gcc -g -O3 -I./include -pthread -lm stream_parser.o -o stream_parser
#flex -d sql_parser.l
flex sql_parser.l
#bison -r all -o sql_parser.c sql_parser.y
bison -o sql_parser.c sql_parser.y
sql_parser.y: conflicts: 5 shift/reduce
gcc -g -O3 -I./include -c sql_parser.c
gcc -g -O3 -I./include -c c_parser.c
gcc -g -O3 -I./include -c tables_dict.c
gcc -g -O3 -I./include -c print_data.c
gcc -g -O3 -I./include -c check_data.c
gcc -g -O3 -I./include -pthread -lm sql_parser.o c_parser.o tables_dict.o print_data.o check_data.o -o c_parser
[root@twindb-dev undrop-for-innodb]#
If there are no errors, we are ready to proceed.
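A quick way to double-check is to confirm that both binaries mentioned in the make output are now in place:
# ls -l stream_parser c_parser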
Splitting ibdata1
The InnoDB dictionary is stored in ibdata1, so we need to parse ibdata1 and extract the pages that hold the dictionary records. stream_parser does exactly that.
# ./stream_parser -f /var/lib/mysql/ibdata1
...
Size to process: 79691776 (76.000 MiB)
All workers finished in 1 sec
stream_parser finds InnoDB pages in ibdata1 and stores them sorted by page type (FIL_PAGE_INDEX or FIL_PAGE_TYPE_BLOB) and by index_id.
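Each .page file is named after the index_id it holds, so a plain listing of the output directory (the same path used in the prompts below) already shows which indexes were found:
# ls pages-ibdata1/FIL_PAGE_INDEX/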
Here are the dictionary indexes:
SYS_TABLES
[root@twindb-dev undrop-for-innodb]# ll pages-ibdata1/FIL_PAGE_INDEX/0000000000000001.page
-rw-r--r-- 1 root root 16384 Jun 24 00:50 pages-ibdata1/FIL_PAGE_INDEX/0000000000000001.page
SYS_INDEXES
[root@twindb-dev undrop-for-innodb]# ll pages-ibdata1/FIL_PAGE_INDEX/0000000000000003.page
-rw-r--r-- 1 root root 16384 Jun 24 00:50 pages-ibdata1/FIL_PAGE_INDEX/0000000000000003.page
SYS_COLUMNS
[root@twindb-dev undrop-for-innodb]# ll pages-ibdata1/FIL_PAGE_INDEX/0000000000000002.page
-rw-r--r-- 1 root root 49152 Jun 24 00:50 pages-ibdata1/FIL_PAGE_INDEX/0000000000000002.page
and SYS_FIELDS
[root@twindb-dev undrop-for-innodb]# ll pages-ibdata1/FIL_PAGE_INDEX/0000000000000004.page
-rw-r--r-- 1 root root 16384 Jun 24 00:50 pages-ibdata1/FIL_PAGE_INDEX/0000000000000004.page
As you can see, the dictionary is pretty small: just a few 16 KiB pages per index.
Dumping records from SYS_TABLES and SYS_INDEXES
To fetch records out of the index pages you need c_parser. But first, let's create a directory for the dumps:
[root@twindb-dev undrop-for-innodb]# mkdir -p dumps/default
[root@twindb-dev undrop-for-innodb]#
The InnoDB dictionary is always in REDUNDANT format, so option -4 is mandatory:
[root@twindb-dev undrop-for-innodb]# ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000001.page -t dictionary/SYS_TABLES.sql > dumps/default/SYS_TABLES 2> dumps/default/SYS_TABLES.sql
[root@twindb-dev undrop-for-innodb]#
Here are our sakila tables:
[root@twindb-dev undrop-for-innodb]# grep sakila dumps/default/SYS_TABLES | head -5
0000000052D5 D9000002380110 SYS_TABLES "sakila/actor" 753 4 1 0 80 "" 739
0000000052D8 DC0000014F0110 SYS_TABLES "sakila/address" 754 8 1 0 80 "" 740
0000000052DB DF000002CA0110 SYS_TABLES "sakila/category" 755 3 1 0 80 "" 741
0000000052DE E2000002F80110 SYS_TABLES "sakila/city" 756 4 1 0 80 "" 742
0000000052E1 E5000002C50110 SYS_TABLES "sakila/country" 757 3 1 0 80 "" 743
[root@twindb-dev undrop-for-innodb]#
dumps/default/SYS_TABLES is a dump of the table, suitable for loading with a LOAD DATA INFILE command. c_parser prints the exact command to the standard error output; I saved it in dumps/default/SYS_TABLES.sql:
[root@twindb-dev undrop-for-innodb]# cat dumps/default/SYS_TABLES.sql
SET FOREIGN_KEY_CHECKS=0;
LOAD DATA INFILE '/root/tmp/undrop-for-innodb/dumps/default/SYS_TABLES' REPLACE INTO TABLE `SYS_TABLES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_TABLES\t' (`NAME`, `ID`, `N_COLS`, `TYPE`, `MIX_ID`, `MIX_LEN`, `CLUSTER_NAME`, `SPACE`);
[root@twindb-dev undrop-for-innodb]#
The same way let’s dump SYS_INDEXES:
[root@twindb-dev undrop-for-innodb]# ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000003.page -t dictionary/SYS_INDEXES.sql > dumps/default/SYS_INDEXES 2> dumps/default/SYS_INDEXES.sql
[root@twindb-dev undrop-for-innodb]#
Make sure we have sane results in the dumps:
[root@twindb-dev undrop-for-innodb]# head -5 dumps/default/SYS_INDEXES
-- Page id: 11, Format: REDUNDANT, Records list: Valid, Expected records: (153 153)
000000000300 800000012D0177 SYS_INDEXES 11 11 "ID\_IND" 1 3 0 302
000000000300 800000012D01A5 SYS_INDEXES 11 12 "FOR\_IND" 1 0 0 303
000000000300 800000012D01D3 SYS_INDEXES 11 13 "REF\_IND" 1 0 0 304
000000000300 800000012D026D SYS_INDEXES 12 14 "ID\_IND" 2 3 0 305
[root@twindb-dev undrop-for-innodb]# head -5 dumps/default/SYS_INDEXES.sql
SET FOREIGN_KEY_CHECKS=0;
LOAD DATA INFILE '/root/tmp/undrop-for-innodb/dumps/default/SYS_INDEXES' REPLACE INTO TABLE `SYS_INDEXES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_INDEXES\t' (`TABLE_ID`, `ID`, `NAME`, `N_FIELDS`, `TYPE`, `SPACE`, `PAGE_NO`);
[root@twindb-dev undrop-for-innodb]#
Now we can work with the dictionary, but it’s more convenient if the tables are in MySQL.
Loading dictionary tables into MySQL
The main use of SYS_TABLES and SYS_INDEXES is to get the index_id by table name. It's possible to do this with a couple of greps over the dumps, but having SYS_TABLES and SYS_INDEXES in MySQL makes the job easier.
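For reference, the grep route would look roughly like this. A sketch based on the dump format shown above: 753 is sakila.actor's ID from SYS_TABLES, and the table ID is the fourth whitespace-separated field of a SYS_INDEXES dump line:
# grep 'sakila/actor' dumps/default/SYS_TABLES
# awk '$3 == "SYS_INDEXES" && $4 == 753' dumps/default/SYS_INDEXES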
Before we can proceed, let's make sure the mysql user can read from root's home directory. This may not be wise from a security standpoint; if that's a concern, create the whole recovery environment somewhere in /tmp.
[root@twindb-dev undrop-for-innodb]# chmod 711 /root/
[root@twindb-dev undrop-for-innodb]#
Create the empty dictionary tables in some database (e.g. test):
[root@twindb-dev undrop-for-innodb]# mysql test < dictionary/SYS_TABLES.sql
[root@twindb-dev undrop-for-innodb]# mysql test < dictionary/SYS_INDEXES.sql
[root@twindb-dev undrop-for-innodb]#
And load the dumps:
[root@twindb-dev undrop-for-innodb]# mysql test < dumps/default/SYS_TABLES.sql
[root@twindb-dev undrop-for-innodb]# mysql test < dumps/default/SYS_INDEXES.sql
[root@twindb-dev undrop-for-innodb]#
Now we have the InnoDB dictionary in MySQL and we can query it like any other MySQL table:
mysql> SELECT * FROM SYS_TABLES WHERE NAME = 'sakila/actor';
+--------------+-----+--------+------+--------+---------+--------------+-------+
| NAME         | ID  | N_COLS | TYPE | MIX_ID | MIX_LEN | CLUSTER_NAME | SPACE |
+--------------+-----+--------+------+--------+---------+--------------+-------+
| sakila/actor | 753 |      4 |    1 |      0 |      80 |              |   739 |
+--------------+-----+--------+------+--------+---------+--------------+-------+
1 row in set (0.00 sec)
mysql> SELECT * FROM SYS_INDEXES WHERE TABLE_ID = 753;
+----------+------+---------------------+----------+------+-------+---------+
| TABLE_ID | ID   | NAME                | N_FIELDS | TYPE | SPACE | PAGE_NO |
+----------+------+---------------------+----------+------+-------+---------+
|      753 | 1828 | PRIMARY             |        1 |    3 |   739 |       3 |
|      753 | 1829 | idx_actor_last_name |        1 |    0 |   739 |       4 |
+----------+------+---------------------+----------+------+-------+---------+
2 rows in set (0.00 sec)
Here we can see that sakila.actor has two indexes: PRIMARY and idx_actor_last_name. The respective index_id values are 1828 and 1829.
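The two lookups can also be combined into a single join, so a table name maps straight to its index_id values. A sketch against the dictionary tables we just loaded (column names as in the LOAD DATA statements above):
SELECT t.NAME AS table_name, i.NAME AS index_name, i.ID AS index_id
FROM SYS_TABLES t JOIN SYS_INDEXES i ON i.TABLE_ID = t.ID
WHERE t.NAME = 'sakila/actor';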
Stay tuned to learn what to do with them and how to recover sakila.actor.