Urgent Help needed with ASM Header Corruption - Q: When is an ASM disk header read and updated?


The third instance of one of our databases (11.2.0.3 with ASM and external redundancy) crashed with the errors reported below. It appears the customer added some disks to ASM and, midway through the rebalance, ASM hit some underlying corruption, dismounted the ASM disk group, and so crashed the database.

ASM Alert entries

>>> Customer added some Disks here >>>

Thu Oct 25 14:16:09 2012
NOTE: disk validation pending for group 19/0x6bc90d3b (DBTCSTRNPA)
SUCCESS: validated disks for 19/0x6bc90d3b (DBTCSTRNPA)
NOTE: disk validation pending for group 19/0x6bc90d3b (DBTCSTRNPA)
NOTE: Assigning number (19,20) to disk (ORCL:DBTCSTRNPA21)
NOTE: Assigning number (19,21) to disk (ORCL:DBTCSTRNPA22)
NOTE: Assigning number (19,22) to disk (ORCL:DBTCSTRNPA23)

>>> Rebalance started at 14:29 as a result >>>

Thu Oct 25 14:29:02 2012
NOTE: Attempting voting file refresh on diskgroup DATCSTRNPA
NOTE: ASM did background COD recovery for group 10/0x6b190d32 (DATCSTRNPA)
NOTE: starting rebalance of group 10/0x6b190d32 (DATCSTRNPA) at power 1
Starting background process ARB0
Thu Oct 25 14:29:02 2012
ARB0 started with pid=45, OS id=11888 
NOTE: assigning ARB0 to group 10/0x6b190d32 (DATCSTRNPA) with 1 parallel I/O

>>> ASM block header corruption noted midway through the rebalance at 15:40 >>>

Thu Oct 25 15:40:24 2012
WARNING: cache read  a corrupt block: group=10(DATCSTRNPA) dsk=72 blk=48 disk=72 (DATCSTRNPA84) incarn=3916037734 au=0 blk=48 count=1
Errors in file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_arb0_11888.trc:
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
NOTE: a corrupted block from group DATCSTRNPA was dumped to /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_arb0_11888.trc
WARNING: cache read (retry) a corrupt block: group=10(DATCSTRNPA) dsk=72 blk=48 disk=72 (DATCSTRNPA84) incarn=3916037734 au=0 blk=48 count=1
Errors in file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_arb0_11888.trc:
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
ERROR: cache failed to read group=10(DATCSTRNPA) dsk=72 blk=48 from disk(s): 72(DATCSTRNPA84)
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
NOTE: cache initiating offline of disk 72 group DATCSTRNPA
NOTE: process _arb0_+asm3 (11888) initiating offline of disk 72.3916037734 (DATCSTRNPA84) with mask 0x7e in group 10
WARNING: Disk 72 (DATCSTRNPA84) in group 10 in mode 0x7f is now being taken offline on ASM inst 3
NOTE: initiating PST update: grp = 10, dsk = 72/0xe969fe66, mask = 0x6a, op = clear
Thu Oct 25 15:40:25 2012
GMON updating disk modes for group 10 at 115 for pid 45, osid 11888
ERROR: Disk 72 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 10)
Thu Oct 25 15:40:25 2012
NOTE: cache dismounting (not clean) group 10/0x6B190D32 (DATCSTRNPA) 
WARNING: Offline of disk 72 (DATCSTRNPA84) in group 10 and mode 0x7f failed on ASM inst 3
Thu Oct 25 15:40:25 2012
NOTE: halting all I/Os to diskgroup 10 (DATCSTRNPA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 4739, image: oracle@itcccl180.it.express.tnt (B000)
Thu Oct 25 15:40:25 2012
NOTE: LGWR doing non-clean dismount of group 10 (DATCSTRNPA)
NOTE: LGWR sync ABA=231.134 last written ABA 231.134

>> Diskgroup Dismounted as a Result of this >>>

NOTE: cache dismounted group 10/0x6B190D32 (DATCSTRNPA) 
SQL> alter diskgroup DATCSTRNPA dismount force /* ASM SERVER */ 
System State dumped to trace file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_arb0_11888.trc
Thu Oct 25 15:40:27 2012
ERROR: ORA-15130 in COD recovery for diskgroup 10/0x6b190d32 (DATCSTRNPA)
ERROR: ORA-15130 thrown in RBAL for group number 10
Errors in file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_rbal_16047.trc:
ORA-15130: diskgroup "DATCSTRNPA" is being dismounted
ERROR: ORA-15130 in COD recovery for diskgroup 10/0x6b190d32 (DATCSTRNPA)
ERROR: ORA-15130 thrown in RBAL for group number 10
Errors in file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_rbal_16047.trc:
ORA-15130: diskgroup "DATCSTRNPA" is being dismounted
ERROR: ORA-15130 in COD recovery for diskgroup 10/0x6b190d32 (DATCSTRNPA)
ERROR: ORA-15130 thrown in RBAL for group number 10
Errors in file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_rbal_16047.trc:
ORA-15130: diskgroup "DATCSTRNPA" is being dismounted
ERROR: ORA-15130 in COD recovery for diskgroup 10/0x6b190d32 (DATCSTRNPA)
ERROR: ORA-15130 thrown in RBAL for group number 10
Errors in file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_rbal_16047.trc:
ORA-15130: diskgroup "DATCSTRNPA" is being dismounted
Thu Oct 25 15:40:39 2012

Thu Oct 25 15:40:39 2012
NOTE: AMDU dump of disk group DATCSTRNPA created at /oracle/diag/asm/+asm/+ASM3/trace
NOTE: cache deleting context for group DATCSTRNPA 10/0x6b190d32
ERROR: ORA-15130 thrown in ARB0 for group number 10
Errors in file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_arb0_11888.trc:
ORA-15130: diskgroup "" is being dismounted
ORA-15130: diskgroup "" is being dismounted
ORA-15196: invalid ASM block header [kfc.c:19572] [check_kfbh] [4] [2] [27016521 != 27015521]
ORA-15196: invalid ASM block header [kfc.c:19572] [check_kfbh] [339] [2147483706[4232823222 != 261167758]
ORA-15196: invalid ASM block header [kfc.c:19572] [check_kfbh] [2147483649] [81] [2397242929 != 2383392830]
ORA-15130: diskgroup "DATCSTRNPA" is being dismounted
ORA-15066: offlining disk "DATCSTRNPA84" in group "DATCSTRNPA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
Thu Oct 25 15:40:39 2012

NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 10/0x6b190d32 (DATCSTRNPA)
ERROR: ORA-15130 in COD recovery for diskgroup 10/0x6b190d32 (DATCSTRNPA)
ERROR: ORA-15130 thrown in RBAL for group number 10
Errors in file /oracle/diag/asm/+asm/+ASM3/trace/+ASM3_rbal_16047.trc:
ORA-15130: diskgroup "" is being dismounted

DB Alert log has these entries 

Thu Oct 25 02:22:42 2012
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
...
..
Thu Oct 25 02:52:58 2012
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM

>> CAN Be ignored as documented under "WARNING: ASM Communication Error: Op 0 State 0x0 (15055) (Doc ID 1469167.1)"

...
...

>>> ASM Disks added are reported here >>>

Thu Oct 25 14:16:18 2012
SUCCESS: disk DBTCSTRNPA21 (20.3916037823) added to diskgroup DBTCSTRNPA
SUCCESS: disk DBTCSTRNPA22 (21.3916037824) added to diskgroup DBTCSTRNPA
SUCCESS: disk DBTCSTRNPA23 (22.3916037825) added to diskgroup DBTCSTRNPA
Thu Oct 25 14:22:40 2012

>>> DB Crashes as the ASM Diskgroup was dismounted due to Corruptions >>>

Thu Oct 25 15:40:39 2012
Errors in file /oracle/diag/rdbms/cstrnpa/CSTRNPA3/trace/CSTRNPA3_lgwr_17853.trc:
ORA-00345: redo log write error block 35172 count 1
ORA-00312: online log 17 thread 3: '+DATCSTRNPA/cstrnpa/onlinelog/group_17.399.776096445'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/diag/rdbms/cstrnpa/CSTRNPA3/trace/CSTRNPA3_lgwr_17853.trc:
ORA-00346: log member marked as STALE and closed
ORA-00312: online log 17 thread 3: '+DATCSTRNPA/cstrnpa/onlinelog/group_17.399.776096445'
Thu Oct 25 15:40:48 2012
KCF: read, write or open error, block=0x9b online=1
        file=123 '+DATCSTRNPA/cstrnpa/datafile/undotbs3.387.767891645'
        error=15078 txt: ''
Errors in file /oracle/diag/rdbms/cstrnpa/CSTRNPA3/trace/CSTRNPA3_dbw0_17837.trc:
Errors in file /oracle/diag/rdbms/cstrnpa/CSTRNPA3/trace/CSTRNPA3_dbw0_17837.trc:
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 123 (block # 155)
ORA-01110: data file 123: '+DATCSTRNPA/cstrnpa/datafile/undotbs3.387.767891645'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
DBW0 (ospid: 17837): terminating the instance due to error 63999
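
The sequence in the alert logs above (rebalance started at 14:29, first corrupt-block warning at 15:40) can be pulled out mechanically when the logs are long. The following is a minimal Python sketch, not an Oracle tool: the function name `build_timeline` and the keyword list are illustrative choices, and it assumes the plain "Thu Oct 25 15:40:24 2012" timestamp lines seen in these 11.2 alert logs.

```python
import re
from datetime import datetime

# Timestamp lines in these alert logs look like "Thu Oct 25 15:40:24 2012".
# %a and %b are locale-dependent; this assumes the default C/English locale.
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
TS_PATTERN = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$"
)


def build_timeline(lines, keywords):
    """Return (timestamp, message) pairs for messages containing any keyword.

    Each message inherits the most recent timestamp line above it.
    """
    current = None
    events = []
    for raw in lines:
        line = raw.strip()
        if TS_PATTERN.match(line):
            current = datetime.strptime(line, TS_FORMAT)
        elif current is not None and any(k in line for k in keywords):
            events.append((current, line))
    return events


# Lines copied from the ASM alert log excerpt in this post.
sample = [
    "Thu Oct 25 14:29:02 2012",
    "NOTE: starting rebalance of group 10/0x6b190d32 (DATCSTRNPA) at power 1",
    "Thu Oct 25 15:40:24 2012",
    "WARNING: cache read  a corrupt block: group=10(DATCSTRNPA) dsk=72 blk=48 "
    "disk=72 (DATCSTRNPA84) incarn=3916037734 au=0 blk=48 count=1",
]

events = build_timeline(sample, ("starting rebalance", "corrupt block"))
elapsed = events[-1][0] - events[0][0]
for when, message in events:
    print(when.isoformat(sep=" "), message[:60])
print("time from rebalance start to first corrupt-block warning:", elapsed)
```

Feeding the full ASM and database alert logs through the same function gives one merged list that can be sorted by timestamp, which is convenient when comparing against the storage vendor's own event logs.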

The hardware vendor, HP, is trying to push this issue onto Oracle and has asked us to explain exactly when and how an ASM disk header is read and updated. Can anyone please help answer the questions below?
They believe the ASM rebalance caused the corruption, but we do not think that was the reason.

1. When ASM rebalances the disks, does it read the block header first and then write the block? Can an ASM rebalance cause block corruption under any circumstances, or is that impossible given ASM's internal mechanisms?

2. When is the ASM header read?

3. What causes ASM metadata to be updated? Is it updated immediately when a disk is added, or when the rebalance occurs?

4. How is locking of the ASM header done between the RAC nodes, and how is a lock released when an Oracle instance fails?

5. Why did the database carry on when the header corruption error was first reported in the alert log? This has been partially answered by the fact that the error is only detected when the rebalance runs.

6. How can we determine when the last successful ASM header read took place before the corruption?
  

Any help would be more than appreciated...

 

 

Answer:

 

ARB0 relocating file +DATCSTRNPA.256.666381297 (8 entries)



*** 2012-10-25 17:05:06.757

ARB0 relocating file +DATCSTRNPA.258.666381295 (76 entries)



*** 2012-10-25 17:07:14.274

WARNING: cache read  a corrupt block: group=10(DATCSTRNPA) dsk=72 blk=48 disk=72 (DATCSTRNPA84) incarn=3916037804 au=0 blk=48 count=1



*** 2012-10-25 17:07:14.274

dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0)

----- Error Stack Dump -----
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
Hex dump of disk block image:

Dump of memory from 0x00000000694FA000 to 0x00000000694FB000

0694FA000 00000000 00000000 00000000 00000000  [................]

        Repeat 63 times

0694FA400 003C0000 00780000 00060000 007571ED  [..<...x......qu.]

0694FA410 003BFFF5 00000000 00000002 00000002  [..;.............]

0694FA420 00008000 00008000 00004000 5051A885  [.........@....QP]

0694FA430 50528195 001D0005 0003EF53 00000001  [..RP....S.......]

0694FA440 5048BD84 00ED4E00 00000000 00000001  [..HP.N..........]

0694FA450 00000000 0000000B 00000080 00000034  [............4...]

0694FA460 00000006 00000003 DB40F439 4643267C  [........9.@.|&CF]

0694FA470 5AB9A4A6 6703FEB5 00000000 00000000  [...Z...g........]

0694FA480 00000000 00000000 00000000 00000000  [................]

        Repeat 3 times

0694FA4C0 00000000 00000000 00000000 03FE0000  [................]

0694FA4D0 00000000 00000000 00000000 00000000  [................]

0694FA4E0 00000008 00000000 00000000 6D4FD3DA  [..............Om]

0694FA4F0 174CB0E0 9DA83EA6 62C7706F 00000102  [..L..>..op.b....]

0694FA500 00000000 00000000 5048BD84 00000609  [..........HP....]

0694FA510 0000060A 0000060B 0000060C 0000060D  [................]

0694FA520 0000060E 0000060F 00000610 00000611  [................]

0694FA530 00000612 00000613 00000614 00000615  [................]

0694FA540 00000A16 00000000 00000000 08000000  [................]

0694FA550 00000000 00000000 00000000 00000000  [................]

  Repeat 170 times

OSM metadata block dump:

kfbh.endian:                          0 ; 0x000: 0x00

kfbh.hard:                            0 ; 0x001: 0x00

kfbh.type:                            0 ; 0x002: KFBTYP_INVALID

kfbh.datfmt:                          0 ; 0x003: 0x00

kfbh.block.blk:                       0 ; 0x004: blk=0

kfbh.block.obj:                       0 ; 0x008: file=0

kfbh.check:                           0 ; 0x00c: 0x00000000

kfbh.fcn.base:                        0 ; 0x010: 0x00000000

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000

kfbh.spare2:                          0 ; 0x01c: 0x00000000

kfbtTraverseBlock:  Invalid OSM block type 0

WARNING: cache read (retry) a corrupt block: group=10(DATCSTRNPA) dsk=72 blk=48 disk=72 (DATCSTRNPA84) incarn=3916037804 au=0 blk=48 count=1



*** 2012-10-25 17:07:14.277

dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0)

----- Error Stack Dump -----
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483720] [48] [0 != 1]
ERROR: cache failed to read group=10(DATCSTRNPA) dsk=72 blk=48 from disk(s): 72(DATCSTRNPA84)

CE: (0x0x693e9018) group=10 (DATCSTRNPA) dsk=72 blk=48

    hashFlags=0x0000 lid=0x0002 lruFlags=0x0000 bastCount=1

    mirror=0

    flags_kfcpba=0x49 copies=1 blockIndex=48 AUindex=0 AUcount=1 loctr fcn=0.0

    copy #0:  disk=72  au=0 flags=01

BH: (0x0x69791290) bnum=2049 type=reading state=reading chgSt=not modifying pageIn=current

    flags=0x00000000 pinmode=excl lockmode=excl bf=0x694fa000

    kfbh_kfcbh.fcn_kfbh = 0.0 lowAba=0.0 highAba=0.0

    last kfcbInitSlot return code=null chgCount=815 cpkt lnk is null ralFlags=0x00000000

    PINS:

    (kfcbps) pin=25743 get by kfd.c line 23273 mode=excl

             dsk=72 blk=48 status=pinned

             flags=0x80000000 flags2=0x00000000

             class=1400 type=ALLOCTBL stateWanted=current

             bastCount=1 waitStatus=0x00000000 relocCount=0

             scanBastCount=2 scanBxid=64781 scanSkipCode=2

             last released by kfc.c 18264

 LE: (0x724e36b0) le=2567 group=10 dsk=72 blk=48

    open=T kjStat=0 mode=EX closing=0 lop=(nil)

    flags=00000000 astFlags=00000000 rlsFlags=00000000

    rcvFlags=00000000 id=0x2a000048.30 bucket=1791

    lastScanWaiterMode=0 fcn=0.0

    

 File_name :: +ASM2_arb0_23441.trc

 

 NOTE: cache opening disk 71 of grp 10: DATCSTRNPA83 label:DATCSTRNPA83

 NOTE: cache opening disk 72 of grp 10: DATCSTRNPA84 label:DATCSTRNPA84

 NOTE: cache opening disk 73 of grp 10: DATCSTRNPA85 label:DATCSTRNPA85







00003e50  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

*

00004000  01 82 03 01 04 00 00 00  48 00 00 80 aa 4d 6f 80  |........H....Mo.|

00004010  01 b7 8c 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

00004020  80 03 00 00 c0 01 00 00  08 00 08 00 00 00 c0 01  |................|











$ kfed read mpath16p1dump blknum=47|more

kfbh.endian:                          1 ; 0x000: 0x01

kfbh.hard:                          130 ; 0x001: 0x82

kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL

kfbh.datfmt:                          1 ; 0x003: 0x01

kfbh.block.blk:                      47 ; 0x004: blk=47

kfbh.block.obj:              2147483720 ; 0x008: disk=72

kfbh.check:                  2309435824 ; 0x00c: 0x89a731b0

kfbh.fcn.base:                  9230112 ; 0x010: 0x008cd720

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000





$ kfed read mpath16p1dump blknum=48|more

kfbh.endian:                          0 ; 0x000: 0x00

kfbh.hard:                            0 ; 0x001: 0x00

kfbh.type:                            0 ; 0x002: KFBTYP_INVALID

kfbh.datfmt:                          0 ; 0x003: 0x00

kfbh.block.blk:                       0 ; 0x004: blk=0

kfbh.block.obj:                       0 ; 0x008: file=0

kfbh.check:                           0 ; 0x00c: 0x00000000

kfbh.fcn.base:                        0 ; 0x010: 0x00000000

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000

kfbh.spare2:                          0 ; 0x01c: 0x00000000

B7F14200 00000000 00000000 00000000 00000000  [................]

        Repeat 63 times

B7F14600 003C0000 00780000 00060000 007571ED  [..<...x......qu.]

B7F14610 003BFFF5 00000000 00000002 00000002  [..;.............]

B7F14620 00008000 00008000 00004000 5051A885  [.........@....QP]

B7F14630 50528195 001D0005 0003EF53 00000001  [..RP....S.......]

B7F14640 5048BD84 00ED4E00 00000000 00000001  [..HP.N..........]

B7F14650 00000000 0000000B 00000080 00000034  [............4...]

B7F14660 00000006 00000003 DB40F439 4643267C  [........9.@.|&CF]

B7F14670 5AB9A4A6 6703FEB5 00000000 00000000  [...Z...g........]

B7F14680 00000000 00000000 00000000 00000000  [................]

        Repeat 3 times

B7F146C0 00000000 00000000 00000000 03FE0000  [................]

B7F146D0 00000000 00000000 00000000 00000000  [................]

B7F146E0 00000008 00000000 00000000 6D4FD3DA  [..............Om]

B7F146F0 174CB0E0 9DA83EA6 62C7706F 00000102  [..L..>..op.b....]

B7F14700 00000000 00000000 5048BD84 00000609  [..........HP....]

B7F14710 0000060A 0000060B 0000060C 0000060D  [................]

B7F14720 0000060E 0000060F 00000610 00000611  [................]

B7F14730 00000612 00000613 00000614 00000615  [................]

B7F14740 00000A16 00000000 00000000 08000000  [................]

B7F14750 00000000 00000000 00000000 00000000  [................]

  Repeat 170 times

KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]







$ kfed read mpath16p1dump blknum=49|more

kfbh.endian:                          1 ; 0x000: 0x01

kfbh.hard:                          130 ; 0x001: 0x82

kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL

kfbh.datfmt:                          1 ; 0x003: 0x01

kfbh.block.blk:                      49 ; 0x004: blk=49

kfbh.block.obj:              2147483720 ; 0x008: disk=72

kfbh.check:                  2158685900 ; 0x00c: 0x80aaeecc

kfbh.fcn.base:                  4799766 ; 0x010: 0x00493d16

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000





$ kfed read mpath16p1dump blknum=50|more

kfbh.endian:                          1 ; 0x000: 0x01

kfbh.hard:                          130 ; 0x001: 0x82

kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL

kfbh.datfmt:                          1 ; 0x003: 0x01

kfbh.block.blk:                      50 ; 0x004: blk=50

kfbh.block.obj:              2147483720 ; 0x008: disk=72

kfbh.check:                  2158654602 ; 0x00c: 0x80aa748a

kfbh.fcn.base:                  4820845 ; 0x010: 0x00498f6d

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000

kfbh.spare2:                          0 ; 0x01c: 0x00000000

kfdatb10.aunum:                   21504 ; 0x000: 0x00005400

kfdatb10.shrink:                    448 ; 0x004: 0x01c0

1. When ASM rebalances the disks, does it read the block header first and then write the block?
Can an ASM rebalance cause block corruption under any circumstances,
or is that impossible given ASM's internal mechanisms?

===>> During a rebalance, ASM validates (checksums) each block it reads, block by block. In your case the check failed on block 48 because ASM did not find an ASM-formatted block there.

      So far, no bug that would explain this has been reported to Oracle for 11.2.0.3.

      

 Interestingly, the dd dump shows that only block 48 is unformatted, whereas the blocks before and after it are properly formatted as ASM allocation table metadata.
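
 The same comparison of neighbouring blocks can be scripted against a dd dump. The sketch below is a hedged illustration, not a replacement for kfed: it unpacks the 32-byte kfbh header using the field offsets that kfed prints (0x000 to 0x01c), assumes little-endian data and 4K metadata blocks as seen in this case, and only flags blocks whose type byte is 0 (KFBTYP_INVALID). The names `parse_kfbh` and `find_unformatted_blocks` are made up for this example.

```python
import struct
from collections import namedtuple

BLOCK_SIZE = 4096  # ASM metadata block size seen in this case

# Field order follows the kfed output: four 1-byte fields, then seven 4-byte fields.
KFBH = namedtuple(
    "KFBH",
    "endian hard type datfmt blk obj check fcn_base fcn_wrap spare1 spare2",
)
KFBH_STRUCT = struct.Struct("<4B7I")  # little-endian, 32 bytes in total


def parse_kfbh(block):
    """Decode the generic block header at the start of a metadata block."""
    return KFBH(*KFBH_STRUCT.unpack_from(block, 0))


def find_unformatted_blocks(dump, first_blk, last_blk):
    """Return block numbers in [first_blk, last_blk] whose header type is 0."""
    bad = []
    for blk in range(first_blk, last_blk + 1):
        chunk = dump[blk * BLOCK_SIZE:(blk + 1) * BLOCK_SIZE]
        if len(chunk) < KFBH_STRUCT.size:
            break  # ran off the end of the dump
        if parse_kfbh(chunk).type == 0:
            bad.append(blk)
    return bad


# The 32 header bytes shown at offset 0x4000 in the hex dump in this post.
header_at_0x4000 = bytes.fromhex(
    "01 82 03 01 04 00 00 00 48 00 00 80 aa 4d 6f 80"
    "01 b7 8c 00 00 00 00 00 00 00 00 00 00 00 00 00"
)
print(parse_kfbh(header_at_0x4000))

# Synthetic three-block dump: formatted, zeroed header, formatted.
good_block = header_at_0x4000 + bytes(BLOCK_SIZE - len(header_at_0x4000))
dump = good_block + bytes(BLOCK_SIZE) + good_block
print(find_unformatted_blocks(dump, 0, 2))
```

 On a real dump file such as mpath16p1dump, the bytes would come from reading the file in binary mode. A type of 0 is only the crudest symptom; this sketch does not verify the checksum or the expected block number the way ASM and kfed do.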

 

 Looking at block 48:

 

 

$ kfed read mpath16p1dump blknum=48|more

kfbh.endian:                          0 ; 0x000: 0x00

kfbh.hard:                            0 ; 0x001: 0x00

kfbh.type:                            0 ; 0x002: KFBTYP_INVALID

kfbh.datfmt:                          0 ; 0x003: 0x00

kfbh.block.blk:                       0 ; 0x004: blk=0

kfbh.block.obj:                       0 ; 0x008: file=0

kfbh.check:                           0 ; 0x00c: 0x00000000

kfbh.fcn.base:                        0 ; 0x010: 0x00000000

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000

kfbh.spare2:                          0 ; 0x01c: 0x00000000

B7F14200 00000000 00000000 00000000 00000000  [................]

        Repeat 63 times

B7F14600 003C0000 00780000 00060000 007571ED  [..<...x......qu.]

B7F14610 003BFFF5 00000000 00000002 00000002  [..;.............]

B7F14620 00008000 00008000 00004000 5051A885  [.........@....QP]

B7F14630 50528195 001D0005 0003EF53 00000001  [..RP....S.......]

B7F14640 5048BD84 00ED4E00 00000000 00000001  [..HP.N..........]

B7F14650 00000000 0000000B 00000080 00000034  [............4...]

B7F14660 00000006 00000003 DB40F439 4643267C  [........9.@.|&CF]

B7F14670 5AB9A4A6 6703FEB5 00000000 00000000  [...Z...g........]

B7F14680 00000000 00000000 00000000 00000000  [................]

        Repeat 3 times

B7F146C0 00000000 00000000 00000000 03FE0000  [................]

B7F146D0 00000000 00000000 00000000 00000000  [................]

B7F146E0 00000008 00000000 00000000 6D4FD3DA  [..............Om]

B7F146F0 174CB0E0 9DA83EA6 62C7706F 00000102  [..L..>..op.b....]

B7F14700 00000000 00000000 5048BD84 00000609  [..........HP....]

B7F14710 0000060A 0000060B 0000060C 0000060D  [................]

B7F14720 0000060E 0000060F 00000610 00000611  [................]

B7F14730 00000612 00000613 00000614 00000615  [................]

B7F14740 00000A16 00000000 00000000 08000000  [................]

B7F14750 00000000 00000000 00000000 00000000  [................]

  Repeat 170 times

KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]



It looks as if something external to ASM overwrote that block.

So could you please check a few things:

1. Whether any OS-level program running on the host could write such content.

2. Whether any application running on the host could write such content.

Remember that only a single 4K block is affected here.
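
To give the storage team an exact location, the affected block can be expressed as a byte range on the ASM disk. This is a small sketch under stated assumptions: the 4K block size comes from this thread (and matches block 4 sitting at offset 0x4000 in the hex dump), while the 1 MiB allocation unit is only the ASM default and is not confirmed here; it does not matter for AU 0. Offsets are relative to the start of the ASM disk (the partition), not the whole LUN. The helper names are hypothetical.

```python
BLOCK_SIZE = 4096          # metadata block size in this case ("only a 4K block")
AU_SIZE_ASSUMED = 1 << 20  # assumption: 1 MiB allocation unit, not stated in the thread


def block_byte_range(au, blk, au_size=AU_SIZE_ASSUMED, block_size=BLOCK_SIZE):
    """Return (first_byte, last_byte) of a metadata block on the ASM disk."""
    start = au * au_size + blk * block_size
    return start, start + block_size - 1


def nonzero_extent(block):
    """Return (first, last) offsets of non-zero bytes, or None if all zeros."""
    offsets = [i for i, b in enumerate(block) if b]
    return (offsets[0], offsets[-1]) if offsets else None


first, last = block_byte_range(au=0, blk=48)
print(f"block 48 of AU 0 covers bytes {first:#x}..{last:#x}")

# Synthetic stand-in shaped like the damaged block in the dump:
# zeros, some foreign bytes between roughly 0x400 and 0x54F, then zeros again.
demo = bytearray(BLOCK_SIZE)
demo[0x402] = 0x3C
demo[0x54F] = 0x08
print(nonzero_extent(demo))
```

Knowing both the block's byte range and the extent of the foreign content inside it helps when asking which writer could have touched exactly that region.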

2. When is the ASM header read?

====>> This is not the ASM disk header; rather, it is a block of internal ASM metadata.
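
The arguments of the ORA-15196 message already say which block was being read, which is how one can tell it is not the disk header. The sketch below parses that message; it is an illustration under assumptions, not documented Oracle behaviour. In particular, treating the top bit of the third argument as a "disk metadata" flag and the low bits as the disk number is inferred from kfed printing `kfbh.block.obj: 2147483720 ... disk=72`, and reading `[0 != 1]` as "found != expected" is inferred from the zeroed `kfbh.endian` in the dump. The name `decode_ora_15196` is hypothetical.

```python
import re

ORA_15196 = re.compile(
    r"ORA-15196: invalid ASM block header "
    r"\[(?P<src>[^\]]+)\] \[(?P<check>[^\]]+)\] "
    r"\[(?P<obj>\d+)\] \[(?P<blk>\d+)\] "
    r"\[(?P<found>\d+) != (?P<expected>\d+)\]"
)


def decode_ora_15196(line):
    """Split an ORA-15196 line into its arguments, or return None if it does not parse."""
    m = ORA_15196.search(line)
    if m is None:
        return None
    obj = int(m["obj"])
    return {
        "check": m["check"],                       # which header check failed
        "disk_metadata": bool(obj & 0x80000000),   # assumption: top bit flags per-disk metadata
        "number": obj & 0x7FFFFFFF,                # assumption: low bits hold the disk number
        "blk": int(m["blk"]),
        "found": int(m["found"]),                  # assumption: left side is the value on disk
        "expected": int(m["expected"]),
    }


line = ("ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] "
        "[2147483720] [48] [0 != 1]")
print(decode_ora_15196(line))
```

For the message in this case the result points at block 48 on disk 72, which the trace pins as type ALLOCTBL, rather than block 0 where the disk header lives. Lines that are truncated or garbled in the log simply return None.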

       

This kind of read generally happens in the following situations:

a. When the disk group is mounted and disk-group-level recovery is performed.

b. When you add disks and a rebalance runs.

        
3. What causes ASM metadata to be updated? Is it updated immediately when a disk is added, or when the rebalance occurs?

====>>> This kind of ASM metadata is updated whenever space is allocated or deallocated at the database level.

4. How is locking of the ASM header done between the RAC nodes, and how is a lock released when an Oracle instance fails?

====>>> ASM keeps track of the changes made by each thread at the disk group level and performs any required recovery on the next mount.

       
5. Why did the database carry on when the header corruption error was first reported in the alert log?
This has been partially answered by the fact that the error is only detected when the rebalance runs.

===>>> Unless you read or write data that is addressed through that allocation table block, you will not see this issue.

       A rebalance, however, touches all of these blocks: it reads them in order to redistribute the existing allocation units evenly across the disks.

       That is why the problem surfaced this time.

       So the corruption occurred at some point between the last operation that modified that block and the start of the ASM rebalance.
       
 
6. How can we determine when the last successful ASM header read took place before the corruption?

===>>> The ASM disk group was mounting successfully, so all the ASM disk headers are fine.

       The ASM disk header is different from the allocation table metadata.

       Even at the time the disks were added, all the ASM disk headers were read.