Oracle ASM ORA-15063 / ORA-15042 - TROUBLESHOOTING STEPS BEFORE OPENING a SR to Oracle Support
	APPLIES TO:
	Oracle Database - Enterprise Edition
	Oracle Database Cloud Schema Service - Version N/A and later
	Oracle Database Exadata Cloud Machine - Version N/A and later
	Oracle Cloud Infrastructure - Database Service - Version N/A and later
	Oracle Database Cloud Exadata Service - Version N/A and later
	Information in this document applies to any platform.
	PURPOSE
	Self-debugging steps when a diskgroup cannot be mounted due to error ORA-15063:
	ORA-15063: ASM discovered an insufficient number of disks for diskgroup "%s"
	ORA-15040: diskgroup is incomplete
	ORA-15042: ASM disk "%s" is missing
	TROUBLESHOOTING STEPS
	SECTION A - Getting started
	Start by referring to NOTE 452770.1 "TROUBLESHOOTING - ASM disk not found/visible/discovered issues".
	First, identify all disks that are part of the affected diskgroup by looking at the last successful mount in alert_+ASM*.log.
	Search for a section like the one below:
	SQL> ALTER DISKGROUP <DGNAME1> MOUNT /* asm agent *//* {0:0:214} */
	NOTE: cache registered group DATA number=1 incarn=0x44bef6bb
	NOTE: cache began mount (not first) of group DATA number=1 incarn=0x44bef6bb
	NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
	NOTE: Assigning number (1,0) to disk (ORCL:DATA01P)
	NOTE: Assigning number (1,1) to disk (ORCL:DATA02P)
	NOTE: Assigning number (1,2) to disk (ORCL:DATA03P)
	NOTE: Assigning number (1,3) to disk (ORCL:DATA04P)
	NOTE: Assigning number (1,4) to disk (ORCL:DATA05P)
	..
	NOTE: cache opening disk 0 of grp 1: DATA01P label:DATA01P
	NOTE: cache opening disk 1 of grp 1: DATA02P label:DATA02P
	..
	SUCCESS: DISKGROUP <DGNAME1> was mounted
	NOTE: When ASMLIB is not used, the path to each ASM disk is shown in the mount section:
	 NOTE: cache opening disk 1 of grp 1: REDO3_0001 path:/dev/mpath/3600601600ba12c00d4b784363e69e211 
	 NOTE: cache opening disk 2 of grp 1: REDO3_0002 path:/dev/mpath/3600601600ba12c00d4b784363e69e212 
	 ...
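The disk list can also be pulled out of the alert log with a small helper. The following is a minimal sketch and not part of the original note: the function name list_mount_disks is hypothetical, and it assumes the "cache opening disk" line layout shown in the excerpts above. It lists every disk seen in any mount section of the file, so its output still has to be compared against the last successful mount.

```shell
#!/bin/sh
# list_mount_disks <alert_log>
# Hypothetical helper: prints "<disk_name> <label:...|path:...>" once for each
# disk found in "NOTE: cache opening disk N of grp G: ..." lines.
list_mount_disks() {
  awk '
    /NOTE: cache opening disk [0-9]+ of grp [0-9]+:/ {
      # The disk name is two fields after the word "grp",
      # followed by the label (ASMLIB) or path (no ASMLIB).
      for (i = 1; i + 3 <= NF; i++) {
        if ($i == "grp") { print $(i + 2), $(i + 3); break }
      }
    }
  ' "$1" | sort -u
}

# Usage (the log path is a placeholder):
#   list_mount_disks <path_to>/alert_+ASM1.log
```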
	Isolate the device(s) reported as "missing", as NOTE 452770.1 suggests.
	Finally, start your checks as follows:
	A1) If there are any IO/storage/multipathing errors reported in the OS logs, investigate and fix them.
	This step is mandatory, as ORA-15063/ORA-15042 are usually caused by underlying IO/storage errors.
	A2) If the devices used by ASM disks are properly presented and configured at OS level.
	If "ORA-15075: disk(s) are not visible cluster-wide" is also reported, make sure that all devices are visible cluster-wide.
	A3) If all ASM disks have appropriate permissions (e.g. they should be owned by the grid owner).
	If the ownership of ASM disk(s) has been changed for whatever reason, please correct it.
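The ownership check in A3 can be scripted. This is a minimal sketch under assumptions: the function name check_owner is hypothetical, the expected owner (for example "grid") must be supplied by you, and only the owning user is checked, not the group or the mode bits.

```shell
#!/bin/sh
# check_owner <expected_owner> <device>...
# Hypothetical helper: reports every device whose owning user differs from
# the expected one. Returns 0 if all match, 1 otherwise.
check_owner() {
  expected=$1
  shift
  rc=0
  for dev in "$@"; do
    # Third column of "ls -ld" is the owning user; -L follows symlinks.
    owner=$(ls -ldL "$dev" | awk '{ print $3 }')
    if [ "$owner" != "$expected" ]; then
      echo "$dev: owned by '$owner', expected '$expected'"
      rc=1
    fi
  done
  return $rc
}

# Usage (owner and path are placeholders):
#   check_owner grid <path_to_ASM_devices>
```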
	A4) If/how the "missing" device(s) is reported when querying v$asm_disk
	-----------------------------------------------------------------------------------
	If the device(s) is reported with status:
	=> "PROVISIONED/CANDIDATE" - this means the header of ASM disk(s) is damaged.
	    -> investigate the IO problems behind the corruption - see step A1. Oracle never wipes out its own metadata! A checksum is computed for every write before it is accepted.
	    -> check the header status, in order to confirm the damage:   
	$> kfed read <path_to_your_missing_devices>
	        kfbh.endian:                          0 ; 0x000: 0x00
	        kfbh.hard:                            0 ; 0x001: 0x00
	        kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
	        kfbh.datfmt:                          0 ; 0x003: 0x00
	        kfbh.block.blk:                       0 ; 0x004: blk=0
	        kfbh.block.obj:                       0 ; 0x008: file=0
	        ....
	    ->  try to repair the header and see if diskgroup can be mounted:                  
	$> kfed repair <path_to_your_missing_devices>
	    -> check whether any additional corruptions are reported by ASM (e.g. ORA-15196) or by your database, as IO/storage problems could affect more than one block.
	    If any corruption is seen, please open an SR with Oracle Support.
	 NOTE:  
	 1) When non-default AU size is used AUSZ=<au_size> must be specified with each KFED command.
	 2) "kfed repair" works for 11g ONLY!
	=> "UNKNOWN/IGNORED" - this means the ASM disk(s) is not seen at OS level.
	    -> review steps A1, A2 and A3.
	-----------------------------------------------------------------------------------   
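To confirm header damage without reading the whole kfed dump, the block type can be extracted from the output. A minimal sketch, assuming the "kfbh.type:" line layout shown in step A4; the function name kfed_block_type is hypothetical and the function only parses text, it does not call kfed itself.

```shell
#!/bin/sh
# kfed_block_type
# Hypothetical helper: reads "kfed read" output on stdin and prints the
# symbolic block type from the kfbh.type line (for example KFBTYP_INVALID
# or KFBTYP_DISKHEAD).
kfed_block_type() {
  awk '$1 == "kfbh.type:" { print $NF; exit }'
}

# Usage (the device path is a placeholder):
#   kfed read <path_to_your_missing_devices> | kfed_block_type
```

A result of KFBTYP_INVALID corresponds to the damaged-header case described above.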
	A5) If asm_diskstring is still properly set.
	On Windows configurations, you can also refer to NOTE 880061.1 "ASM Is Unable To Detect SCSI Disks On Windows".
	SECTION B - ASMLIB is used
	When ASMLIB is used, follow the above steps (section A) and also check the errors associated with ORA-15063:
	B1) ORA-15183 Unable to initialize the ASMLIB in oracle/ORA-15183: ASMLIB initialization error [driver/agent not installed]
	Refer: NOTE 340519.1 Cannot Start ASM Ora-15063/ORA-15183
	B2) ORA-15186: ASMLIB error function = [asm_open], error = [1], mesg = [Operation not permitted]
	Check your ASMLIB health.
	 => correctness of installed rpm's
	 => correctness of symlinks - all nodes should show:
	    # ls -l  /etc/sysconfig/oracleasm
	       lrwxrwxrwx 1 root root 24 Sep 18 22:10 /etc/sysconfig/oracleasm -> oracleasm-_dev_oracleas
	 => correctness of the ASMLIB configuration (/etc/sysconfig/oracleasm) - when multipathing is used:
	     # ORACLEASM_SCANORDER: Matching patterns to order disk scanning
	        ORACLEASM_SCANORDER="dm"
	     # ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
	        ORACLEASM_SCANEXCLUDE="sd"
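The two multipathing settings can be verified with a short script. This is a sketch under assumptions: the function name check_scan_settings is hypothetical, the configuration file is assumed to be a plain shell fragment of VAR="value" lines, and the values may be space-separated lists of prefixes.

```shell
#!/bin/sh
# check_scan_settings <config_file>
# Hypothetical helper: sources the ASMLIB configuration in a subshell and
# verifies that dm devices are scanned first and sd devices are excluded.
# Returns 0 if both settings look right, non-zero otherwise.
check_scan_settings() {
  (
    ORACLEASM_SCANORDER=
    ORACLEASM_SCANEXCLUDE=
    # Pass an absolute path so that "." does not search PATH.
    . "$1" || exit 2
    rc=0
    case " $ORACLEASM_SCANORDER " in
      *" dm"*) ;;
      *) echo "ORACLEASM_SCANORDER does not list dm"; rc=1 ;;
    esac
    case " $ORACLEASM_SCANEXCLUDE " in
      *" sd"*) ;;
      *) echo "ORACLEASM_SCANEXCLUDE does not list sd"; rc=1 ;;
    esac
    exit $rc
  )
}

# Usage:
#   check_scan_settings /etc/sysconfig/oracleasm
```

Running it in a subshell keeps the sourced variables out of the calling shell.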
	B3) Check if ASMLIB disks are listed under /dev/oracleasm/disks
	=> devices under /dev/oracleasm/disks/* must be reported as dm devices on all nodes (not single-path sd* devices). If not, please correct that (see step B2).
	$> ls -al /dev/oracleasm/disks
	brw-rw---- 1 grid dba 253, 29 Feb 12 11:44 /dev/oracleasm/disks/DATA01P
	brw-rw---- 1 grid dba 253, 35 Feb 12 11:44 /dev/oracleasm/disks/DATA02P
	brw-rw---- 1 grid dba 253, 27 Feb 15 16:04 /dev/oracleasm/disks/DATA03P
	brw-rw---- 1 grid dba 253, 24 Feb 12 11:44 /dev/oracleasm/disks/DATA04P
	brw-rw---- 1 grid dba 253, 25 Feb 12 11:44 /dev/oracleasm/disks/DATA05P
	=> If any of your ASMLIB disks is missing from the above output, first try to re-scan the devices, as root:
	 # /etc/init.d/oracleasm scandisks
	=> If the ASMLIB disk(s) is still missing from /dev/oracleasm/disks, engage your sysadmin to investigate this (see steps A1, A2, A3).
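Comparing the expected disk names (taken from the last successful mount in the alert log) against the directory listing can be automated. A minimal sketch; the function name missing_disks is hypothetical, and it only tests that an entry exists, not that it is a dm block device.

```shell
#!/bin/sh
# missing_disks <directory> <disk_name>...
# Hypothetical helper: prints each expected disk name that has no entry in
# the directory. Returns 0 if none are missing, 1 otherwise.
missing_disks() {
  dir=$1
  shift
  rc=0
  for name in "$@"; do
    if [ ! -e "$dir/$name" ]; then
      echo "$name"
      rc=1
    fi
  done
  return $rc
}

# Usage, with the disk names from the example above:
#   missing_disks /dev/oracleasm/disks DATA01P DATA02P DATA03P DATA04P DATA05P
```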
	B4) Check if ASMLIB disk(s) has the correct ASMLIB stamp and status:
	 $> kfed read <ASMLIB_device> |grep provstr
	      kfdhdb.driver.provstr: ORCLDISK<diskname> ; 0x000: length=20
	 $> kfed read <ASMLIB_device> | egrep 'kfbh.type|kfdhdb.dskname|kfdhdb.hdrsts'
	      kfbh.type:      1 ; 0x002: KFBTYP_DISKHEAD 
	      kfdhdb.dskname: DATA01P ; 0x028: length=14
	      kfdhdb.hdrsts:  3 ; 0x027: KFDHDR_MEMBER     
	=> If the output is "kfdhdb.driver.provstr: ORCLCLRD" (but kfdhdb.hdrsts= MEMBER and kfbh.type=KFBTYP_DISKHEAD)  then your disk was deleted using "oracleasm deletedisk".
	=> If  kfbh.type = KFBTYP_INVALID  -> see step A4)  and check if "kfed repair" could fix the problem.
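The stamp check in B4 can be reduced to a single word per disk. This is a sketch under assumptions: the function name asmlib_stamp_state and its output words are hypothetical, and it relies on the "kfdhdb.driver.provstr:" line layout shown above, parsing text only.

```shell
#!/bin/sh
# asmlib_stamp_state
# Hypothetical helper: reads "kfed read" output on stdin and classifies the
# ASMLIB stamp found in kfdhdb.driver.provstr.
asmlib_stamp_state() {
  provstr=$(awk '$1 == "kfdhdb.driver.provstr:" { print $2; exit }')
  case $provstr in
    ORCLCLRD*) echo "deleted" ;;    # disk was removed with "oracleasm deletedisk"
    ORCLDISK*) echo "stamped" ;;    # regular ASMLIB stamp
    ""|";")    echo "no-stamp" ;;   # line missing or value empty
    *)         echo "unknown" ;;
  esac
}

# Usage (the device path is a placeholder):
#   kfed read <ASMLIB_device> | asmlib_stamp_state
```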
	B5) Refer also to the documents below:
	NOTE: 398622.1     ORA-15186: ASMLIB error function = [asm_open], error = [1], mesg = [Operation not permitted]
	NOTE: 1384504.1   Mount ASM Disk Group Fails : ORA-15186, ORA-15025, ORA-15063  
	NOTE: 967461.1    "Multipath: error getting device" seen in OS log causes ASM/ASMlib to shutdown by itself
	NOTE: 1526920.1   ORA-15186 ORA-15063 on node 2
	SECTION C  -  Additional notes to review
	If the above checks are done but the error still persists, please also review the notes below, depending on your configuration/situation:
	NOTE:  577526.1     ORA-15063 ASM Discovered An Insufficient Number Of Disks For Diskgroup using NetApp Storage
	NOTE:  784776.1     ORA-15063 When Mounting a Diskgroup After Storage Cloning ( BCV / Split Mirror / SRDF / HDS / Flash Copy )
	NOTE:  555918.1     ORA-15038 On Diskgroup Mount After Node Eviction
	NOTE:  1484723.1   ASM Candidate Raw Device Is Not Presented As A RAC Cluster Wide Shared character Devices On Unix.
	NOTE:  1534211.1   ORA-15017 and ORA-15063 errors for unused diskgroups in 11.2
	NOTE:  1487443.1   Mounting Diskgroup Fails With ORA-15063 and V$ASM_DISK Shows PROVISIONED
	NOTE:  742832.1     AIX:After changing Multipathing drivers from RDAC to MPIO ASM discovered an insufficient number of disks
	NOTE:  1276913.1   Unable to discover or use raw devices for ASM in HP-UX Itanium in 11.2.0.2 ( ORA-15063 )
	SECTION D  - Information to be collected when you are going to open an SR
	If you are not able to fix the problem on your own, please collect the information below and raise an SR with Oracle Support:
	D1) alert_+ASM*.log (from all nodes if RAC)
	D2) script#1 from NOTE 470211.1 How To Gather/Backup ASM Metadata In A Formatted Manner version 10.1, 10.2, 11.1 & 11.2?
	D3) KFED reports
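	#!/bin/sh
	# kfed.sh: dump the disk header and its backup copy for each ASM disk
	rm -f /tmp/kfed_DH.out /tmp/kfed_BK.out
	# Replace the placeholder with a pattern that expands to full device paths,
	# e.g. /dev/oracleasm/disks/*
	for i in <your_path_to_asm_disks>
	do
	  echo "$i" >> /tmp/kfed_DH.out
	  kfed read "$i" >> /tmp/kfed_DH.out
	  echo "$i" >> /tmp/kfed_BK.out
	  kfed read "$i" aun=1 blkn=254 >> /tmp/kfed_BK.out
	done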
	Run kfed.sh as the GRID/ASM owner. Upload /tmp/kfed_DH.out and /tmp/kfed_BK.out.
	! Pay attention to the AU size - if a non-default AU size is used, you must specify it with each kfed command. (see note 1485597.1 "ASM tools used by Support : KFOD, KFED, AMDU")
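The script hard-codes blkn=254, which fits the default 1 MB AU. The sketch below shows how the block number could be derived for other AU sizes. It rests on assumptions that this note does not state: that the backup copy of the disk header sits in the second-to-last metadata block of AU 1, and that the metadata block size is 4096 bytes. The function name backup_blkn is hypothetical; verify the result against note 1485597.1 before relying on it.

```shell
#!/bin/sh
# backup_blkn <au_size_bytes> [metadata_block_size_bytes]
# Hypothetical helper: prints the assumed block number of the disk header
# backup inside AU 1, i.e. (blocks per AU) - 2.
backup_blkn() {
  ausz=$1
  blksz=${2:-4096}   # assumed default metadata block size
  echo $(( ausz / blksz - 2 ))
}

# Usage for a 4 MB AU (the device path is a placeholder):
#   kfed read <device> AUSZ=4194304 aun=1 blkn=$(backup_blkn 4194304)
```

For a 1 MB AU (1048576 bytes) this yields 254, matching the value used in the script.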
	D4) ASMLIB information
	NOTE : 869526.1 Collecting The Required Information For Support To Troubleshot ASM/ASMLIB Issues.
	D5) List of your ASM devices
	   $> ls -al <path_to_ASM_devices>
	D6) OS logs (from all nodes if this is RAC configuration)
	SECTION E  - Disk is reported as MISSING after a failed disk addition
	If you are facing ORA-15063 after a failed disk addition, please collect the information below and raise an SR with Oracle Support:
	E1) alert_+ASM*.log (from all nodes if RAC)
	E2) script#1 from NOTE 470211.1 How To Gather/Backup ASM Metadata In A Formatted Manner version 10.1, 10.2, 11.1 & 11.2?
	E3) KFED reports
	#!/bin/sh
	# kfed.sh: dump the disk header, its backup copy and further metadata blocks for each ASM disk
	rm -f /tmp/kfed_*.out
	# Replace the placeholder with a pattern that expands to full device paths,
	# e.g. /dev/oracleasm/disks/*
	for i in <your_path_to_asm_disks>
	do
	  echo "$i" >> /tmp/kfed_DH.out
	  kfed read "$i" >> /tmp/kfed_DH.out
	  echo "$i" >> /tmp/kfed_BK.out
	  kfed read "$i" aun=1 blkn=254 >> /tmp/kfed_BK.out
	  echo "$i" >> /tmp/kfed_PST.out
	  kfed read "$i" aun=1 blkn=2 >> /tmp/kfed_PST.out
	  echo "$i" >> /tmp/kfed_FS.out
	  kfed read "$i" blkn=1 >> /tmp/kfed_FS.out
	  echo "$i" >> /tmp/kfed_FD.out
	  kfed read "$i" aun=2 blkn=1 >> /tmp/kfed_FD.out
	  echo "$i" >> /tmp/kfed_DD.out
	  # More than one block may be needed if there is a large number of disks;
	  # Oracle Support might ask for this later.
	  kfed read "$i" aun=2 blkn=0 >> /tmp/kfed_DD.out
	done
	Run kfed.sh as the GRID/ASM owner. Upload /tmp/kfed_*.out.
	! Pay attention to the AU size - if a non-default AU size is used, you must specify it with each kfed command. (see note 1485597.1 "ASM tools used by Support : KFOD, KFED, AMDU")
	E4) AMDU output
	amdu -diskstring '<ASM_DISKSTRING>' -dump '<DISKGROUP_NAME>' -noimage
	amdu -diskstring '<ASM_DISKSTRING>' -print <DISKGROUP_NAME>.F2.V0.C2 > DG.amdu
	#### F2.V0.C2 --> This extracts information for up to 16 disks only. If the diskgroup has a larger number of disks, a larger extract is needed.
