Email: service@parnassusdata.com 7 x 24 online support!
How to handle wrong page type in external pages
First step of successful MySQL data recovery is to find InnoDB pages with your data. Let’s call it first, because prerequisite steps are already done.
InnoDB page type is a two bytes integer stored in the header of a page. For MySQL data recovery two are important:
FIL_PAGE_INDEX. Pages of this type are nodes of B+ Tree index where InnoDB stores a table.
FIL_PAGE_TYPE_BLOB. So called external pages, where InnoDB keeps long values of BLOB or TEXT type.
stream_parser reads a stream of bytes, finds InnoDB pages and sorts them per type, per index or page id. It applies sophisticated algorithms tailored for particular page type. Of course, it assumes that page type in the header corresponds to the content of the page, otherwise it will ignore the page.
Recently I worked on a data recovery case that proved I was wrong in my assumptions. The customer dropped their database. They ran MySQL 5.0 with innodb_file_per_table=OFF. This is one of the easiest recovery scenarios, however not everything was recovered – the most important table missed BLOB fields. Total number of recover was close to the true value, but BLOB fields were truncated. Excessive “— #####CannotOpen_FIL_PAGE_TYPE_BLOB/0000000000000XYZ.page”” errors in the standard error output proved that stream_parser failed to find external pages.
Something went wrong, the data couldn’t be overwritten, the customer stopped MySQL immediately after the accident. I decided to investigate why external pages were not found. Page_id is a file offset in 16k units. dd can extract particular page X:
# dd if=/var/lib/mysql/ibdata1 of=page-8 bs=16k count=1 skip=8
1+0 records in
1+0 records out
16384 bytes (16 kB) copied, 0.00175834 s, 9.3 MB/s
Here’s the header of page id 8, which is index page:
# hexdump -C page-8 | head
00000000 9a 8f cd fc 00 00 00 08 ff ff ff ff ff ff ff ff |................|
00000010 00 00 00 01 00 1b 3a 1c <strong>45 bf</strong> 00 00 00 00 00 00 |......:.E.......|
00000020 00 00 00 00 00 00 00 02 00 b9 00 04 00 00 00 00 |................|
00000030 00 9e 00 02 00 01 00 02 00 00 00 00 00 00 00 00 |................|
00000040 00 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 |................|
00000050 00 02 03 f2 00 00 00 00 00 00 00 02 03 32 08 01 |.............2..|
00000060 00 00 03 00 85 69 6e 66 69 6d 75 6d 00 09 03 00 |.....infimum....|
00000070 08 03 00 00 73 75 70 72 65 6d 75 6d 00 11 0d 10 |....supremum....|
00000080 00 10 05 00 9e 53 59 53 5f 44 41 54 41 46 49 4c |.....SYS_DATAFIL|
00000090 45 53 00 00 01 a3 1b 17 00 00 18 05 00 74 73 74 |ES...........tst|
FIL_PAGE_INDEX constant is defined as 17855 in MySQL. In hexadecimal it’s 0x45BF. It’s at position 0x18 in the example above.
And here’s an example of the external page:
# hexdump -C 0000000000001414.page | head
00000000 0a 4e 9d 5c 00 00 05 86 00 00 00 00 00 00 00 00 |.N.\............|
00000010 00 00 00 00 00 6b 72 b9 <strong>00 0a</strong> 00 00 00 00 00 00 |.....kr.........|
00000020 00 00 00 00 00 00 00 00 22 42 ff ff ff ff 22 30 |........"B...."0|
00000030 22 3e 3c 2f 46 4f 4e 54 3e 3c 2f 50 3e 53 49 5a |"></FONT></P>SIZ|
00000040 45 3d 22 31 22 20 41 4c 49 47 4e 3d 22 4c 45 46 |E="1" ALIGN="LEF|
00000050 54 22 3e 3c 46 4f 4e 54 20 46 41 43 45 3d 22 56 |T"><FONT FACE="V|
00000060 65 72 64 61 6e 61 22 20 53 49 5a 45 3d 22 31 22 |erdana" SIZE="1"|
00000070 20 43 4f 4c 4f 52 3d 22 23 30 30 30 30 33 33 22 | COLOR="#000033"|
00000080 20 4c 45 54 54 45 52 53 50 41 43 49 4e 47 3d 22 | LETTERSPACING="|
00000090 30 22 20 4b 45 52 4e 49 4e 47 3d 22 30 22 3e 42 |0" KERNING="0">B|
Page type is 0x0A, as it should be.
When I extracted a page that stream_parser couldn’t find, it became clear why. Page type was 0x45BF ! The page was a BLOB page, but the page type in the header was FIL_PAGE_INDEX.
How can you detect InnoDB page if MySQL lies about its type? I believe the solution of this problem does exist, but for now there is a workaround.
c_parser by default reads external pages from directory specified by -d option:
# ./c_parser
Error: Usage: ./c_parser -4|-5|-6 [-dDV] -f <InnoDB page or dir> -t table.sql [-T N:M] [-b <external pages directory>]
...
-b <dir> -- Directory where external pages can be found. Usually it is pages-XXX/FIL_PAGE_TYPE_BLOB/
To read external pages from a file (e.g. ibdata1) option -i is introduced:
-i -- Read external pages at their offsets from <file>.
After this trick the table was successfully recovered.