Parnassusdata Software Database Recovery Team
Service Hotline: +86 13764045638 E-mail: service@parnassusdata.com
STARTUP HANGS If the database hangs on startup: 1.) Instruct the customer to do a STARTUP NOMOUNT (to see if the background processes will start). 2.) Try an ALTER DATABASE MOUNT. 3.) Try doing some SELECTs from any v$ view. 4.) If this works, you can do an alter session set _trace_enabled=true in the init.ora. 5.) Then do an ALTER DATABASE OPEN. 6.) After the db has been hanging for a minute or so, use CTRL/C (depress and hold the CTRL key while pressing the 'c' key) to stop the process. See if the trace tells you which SQL statement it is hanging on (it could be dictionary corrupt). TABLESPACE, LOST DATAFILE After a tablespace has been created with its datafiles, the datafiles must exist for the life of the tablespace unless all objects in the tablespace are dropped first. The supported way to recover from a lost datafile is to have the customer restore the old datafile from an older, cold backup (full backup) or a hot backup (single tablespace backup while the database is online). If the database is in NOARCHIVELOG mode, you will only succeed in recovering the db if the datafile in the redo to be applied to it is within the range of your online REDO logs. If the customer has no backups of the datafile that is corrupt, there is a chance the events 10231 and 10233 can be set to skip the corrupted blocks so an export can be done. If that doesn't work or the corruption is in the datafile header, they will loose their data. CONTROL FILES If you are mirroring control files, and one is bad, delete it and copy the good one in its place. If you need to create a new control file or change the MAXLOGFILES, MAXLOGMEMBERS, MAXDATAFILES, MAXINSTANCES, or MAXLOGHISTORY parameters: SVRMGRL> alter database backup controlfile to trace; Edit the trace file (it will be the latest one), take out the trace file header and trailer text and you have the commands to re-create your trace file. You can rename the file to something meaningful then @ the file from SVRMGRL. Be sure to shutdown the database and remove the old controlfiles first. SKIPPING BAD BLOCK When oracle hits a data block that is corrupt, it returns an ORA-1578 error. To skip bad blocks to salvage data, you can set the following events in the init.ora: set events '10231 trace name context forever, level 10' (table set events '10233 trace name context forever, level 10' (index Be sure to drop the indexes first if the corruption is in a table as the 10231 will only work on a full table scan. The 10232 will check for corrupted blocks on modified index blocks and will mark the index as unusable. To check for index corruption, set this event and rebuild the index. ANALYZE INDEX VALIDATE STRUCTURE ANALYZE INDEX VALIDATE STRUCTURE may not reveal index corruption. ONLINE REDO LOG In the normal operation of a database, committed row changes are written to the online redo logs as changes are made to the database. The redo log data is used to recover the database should something happen. When a database is in archivelog mode, these redo logs are copied as they are filled to a directory where they can be used during recovery. If a database is not in archivelog mode, the redo logs will be filled in a round robin fashion. When the last log is filled, the first one is filled again, overwriting what was there before. The window of recovery is very short and if there are any datafiles that are corrupted, there is no recovery of that data WHAT ARE THE IMPLICATIONS OF OPEN RESETLOGS If the customer is in ARCHIVELOG mode they will no longer be able to apply any of their archive logs (roll forward). They will have to take another cold backup of their database and start a new set of archive logs. ROLL FORWARD/ROLL BACK VERVIE Hidden (undocumented) init.ora parameters can be used for many things including bypassing database safety checks during recovery. Be sure to tell The customer this is unsupported and may corrupt their database worse than it already is. However, if the customer has no backups and is not in archive log mode, they may have no choice. ALLOW_RESETLOGS_CORRUPTION=TRUE A common situation is corruption in the online redo logs. Usually the problem is the instance crashed and there are transactions in the redo that must be rolled back, but the redo is corrupt. Here are the steps to ATTEMPT to force the database open: 1) Add the parameter _ALLOW_RESETLOGS_CORRUPTION=TRUE to the init.ora file 2) STARTUP MOUNT the database 3) Issue the appropriate RECOVER DATABASE command: a) RECOVER DATABASE UNTIL CANCEL (Their controlfile is usable) -orb) RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL RECOVER DATABASE USING BACKUP CONTROLFILE (Their controlfile is not the current one) 4) Enter CANCEL to cancel recovery - there are NO archive logs applied 5) Enter ALTER DATABASE OPEN RESETLOG 6) Try selecting from a table (i.e., SELECT SYSDATE FROM DUAL) Once the database is open, it is imperative that they export, rebuild the database, and then import. ROLLBACK SEGMENT PROBLEMS A rollback segment is a structure that holds before-commit data for a data table. If a datafile containing a rollback segment is gone, the transaction that was open at the time of the crash cannot be committed. This is a problem because the data block is marked as having an active transaction but the data is still sitting in the (missing) rollback segment. "Hidden" parameters in the init.ora. They can get around this but there will probably be logical data corruption. Have them set: _offline_rollback_segments = (,... STARTUP MOUNT and RECOVER DATABASE Either get the trace or have them search for "obj" in the trace file which should reveal the object trying to rollback. If you can locate the object, comment out the ROLLBACK_SEGMENTS parameter, open the database and try and drop the object. Then uncomment the ROLLBACK_SEGMENTS parameter and attempt to open the database. As a last resort you can set the hidden parameter: _corrupted_rollback_segments You can have the customer change: rollback_segments = ( ,...., - to - _corrupted_rollback_segments = ( ,...., The above list should contain all the rollbacks originally listed in the ROLLBACK_SEGMENTS parameter. SVRMGRL> startup open SVRMGRL> ALTER DATABASE DROP TABLESPACE rollback_tbs INCLUDING CONTENTS There may be corruption in the database depending on what was happening at the time of the problem. DATABASE HANGING If the database is hanging and there are no errors, the best thing to do is have them get a systemstate dump (ALTER SESSION, etc. and have them either ftp or email it to you. Then get a copy of the ass.awk and run it against the trace file. Objects in brackets are what is being waited. BACKUP AND RESTORE If a customer has a corrupted database and needs to restore from a cold backup and has archive logs, have them restore the cold backup, make sure all the archive logs are in the archive log destination directory. Then have them do a STARTUP MOUNT, then RECOVER DATABASE. ALTER DATABASE RECOVER DATABASE. They will be prompted for archive logs. They can also set the recovery to be automatic so they are not prompted for each log. (See the ALTER DATABASE RECOVER command for more options.)