Category: RMAN

Oct 09 2009

RMAN – Recovery from missing or corrupted redo log group

Case 1: A multiplexed copy of the missing log is available.

If a redo log is missing, it should be restored from a multiplexed copy, if possible. This is the only way to recover without any losses. Here’s an example, where I attempt to startup from SQLPlus when a redo log is missing:

SQL> startup
ORACLE instance started.

Total System Global Area 131555128 bytes
Fixed Size 454456 bytes
Variable Size 88080384 bytes
Database Buffers 41943040 bytes
Redo Buffers 1077248 bytes
Database mounted.
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: ‘D:\ORACLE_DATA\LOGS\ORCL\REDO03A.LOG’

SQL>
To fix this we simply copy REDO03A.LOG from its multiplexed location on E: to the above location on E:. After copying the file, we issue an “alter database open” from the above SQLPlus session:

SQL> alter database open;

Database altered.

SQL>

That’s it – the database is open for use.

Case 2: All members of a log group lost.

In this case an incomplete recovery is the best we can do. We will lose all transactions from the missing log and all subsequent logs. We illustrate using the same example as above. The error message indicates that members of log group 3 are missing. We don’t have a copy of this file, so we know that an incomplete recovery is required. The first step is to determine how much can be recovered. In order to do this, we query the V$LOG view (when in the mount state) to find the system change number (SCN) that we can recover to (Reminder: the SCN is a monotonically increasing number that is incremented whenever a commit is issued):

–The database should be in the mount state for v$log access

SQL> select first_change# from v$log where group#=3 ;

FIRST_CHANGE#
————-
370255

SQL>

The FIRST_CHANGE# is the first SCN stamped in the missing log. This implies that the last SCN stamped in the previous log is 370254 (FIRST_CHANGE#-1). This is the highest SCN that we can recover to. In order to do the recovery we must first restore ALL datafiles to this SCN, followed by recovery (also up to this SCN). This is an incomplete recovery, so we must open the database resetlogs after we’re done. Here’s a transcript of the recovery session (typed commands in bold, comments in italics, all other lines are RMAN feedback):

C:\>rman target /

Recovery Manager: Release 9.2.0.4.0 – Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ORCL (DBID=1507972899)

–Restore ENTIRE database to determined SCN

RMAN> restore database until scn 370254;

Starting restore at 26/JAN/05

using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to D:\ORACLE_DATA\DATAFILES\ORCL\SYSTEM01.DBF
restoring datafile 00004 to D:\ORACLE_DATA\DATAFILES\ORCL\USERS01.DBF
channel ORA_DISK_2: starting datafile backupset restore
channel ORA_DISK_2: specifying datafile(s) to restore from backup set
restoring datafile 00002 to D:\ORACLE_DATA\DATAFILES\ORCL\UNDOTBS01.DBF
restoring datafile 00003 to D:\ORACLE_DATA\DATAFILES\ORCL\TOOLS01.DBF
channel ORA_DISK_2: restored backup piece 1
piece handle=E:\BACKUP\13GB14IB_1_1.BAK tag=TAG20050124T171139 params=NUL
channel ORA_DISK_2: restore complete
channel ORA_DISK_1: restored backup piece 1
piece handle=E:\BACKUP\14GB14IB_1_1.BAK tag=TAG20050124T171139 params=NUL
channel ORA_DISK_1: restore complete
Finished restore at 26/JAN/05

–Recover database

RMAN> recover database until scn 370254;

Starting recover at 26/JAN/05
using channel ORA_DISK_1
using channel ORA_DISK_2

starting media recovery

archive log thread 1 sequence 9 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_9.ARC
archive log thread 1 sequence 10 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_10.ARC
archive log thread 1 sequence 11 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_11.ARC
archive log thread 1 sequence 12 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_12.ARC
archive log filename=E:\ORACLE_ARCHIVE\ORCL\1_9.ARC thread=1 sequence=9
archive log filename=E:\ORACLE_ARCHIVE\ORCL\1_10.ARC thread=1 sequence=10
media recovery complete
Finished recover at 26/JAN/05

–open database with RESETLOGS (see comments below)

RMAN> alter database open resetlogs;

database opened

RMAN>

The following points should be noted:
The entire database must be restored to the SCN that has been determined by querying v$log.
All changes beyond that SCN are lost. This method of recovery should be used only if you are sure that you cannot do better. Be sure to multiplex your redo logs, and (space permitting) your archived logs!
The database must be opened with RESETLOGS, as a required log has not been applied. This resets the log sequence to zero, thereby rendering all prior backups worthless. Therefore, the first step after opening a database RESETLOGS is to take a fresh backup. Note that the RESETLOGS option must be used for any incomplete recovery.

Oct 08 2009

RMAN- Recovery from block corruption

It is possible to recover corrupted blocks using RMAN backups.
Here’s the situation: a user connected to SQLPlus gets a data block corruption error when queries a table.
Here’s a part of the session transcript:

SQL> connect testuser/testpassword
Connected.
SQL> select count(*) from test_table;
select count(*) from test_table
*
ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 4, block # 2015)
ORA-01110: data file 4: ‘D:\ORACLE_DATA\DATAFILES\ORCL\USERS01.DBF’

Since we know the file and block number, we can perform block level recovery using RMAN. This is best illustrated by example:

C:\>rman target /

Recovery Manager: Release 9.2.0.4.0 – Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ORCL (DBID=1507972899)

–restore AND recover specific block

RMAN> blockrecover datafile 4 block 2015;

Starting blockrecover at 26/JAN/05
using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=19 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=20 devtype=DISK

channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00004
channel ORA_DISK_1: restored block(s) from backup piece 1
piece handle=E:\BACKUP\0QGB0UEC_1_1.BAK tag=TAG20050124T152708 params=NULL
channel ORA_DISK_1: block restore complete

starting media recovery
media recovery complete

Finished blockrecover at 26/JAN/05

RMAN>

Now our user should be able to query the table from her SQLPlus session. Here’s her session transcript after block recovery.

SQL> select count(*) from test_table;

COUNT(*)
———-
217001

SQL>

A couple of important points regarding block recovery:

Block recovery can only be done using RMAN.
The entire database can be open while performing block recovery.
Check all database files for corruption. This is important – there could be other corrupted blocks.
Verification of database files can be done using RMAN or the dbverify utility.
To verify using RMAN simply do a complete database backup with default settings.
If RMAN detects block corruption, it will exit with an error message pointing out the guilty file/block.

Oct 08 2009

RMAN – Recovery from missing or corrupted datafile

Recovery from corrupted or missing datafile

This scenario deals with a situation where a datafile has gone missing, or is corrupted beyond repair. For concreteness, we look at a case where a datafile is missing. Below is a transcript of an SQL Plus session that attempts to open a database with a missing datafile (typed commands in bold, lines in italics are my comments, all other lines are feedback from SQL Plus):

–open SQL Plus from the command line without
–logging on to database

C:\>sqlplus /nolog

SQL*Plus: Release 9.2.0.4.0 – Production on Tue Jan 25 14:52:41 2005

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

–Connect to the idle Oracle process as a privileged user and start up instance

SQL> connect / as sysdba
Connected to an idle instance.
SQL> startup
ORACLE instance started.

Total System Global Area 131555128 bytes
Fixed Size 454456 bytes
Variable Size 88080384 bytes
Database Buffers 41943040 bytes
Redo Buffers 1077248 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 4 – see DBWR trace file
ORA-01110: data file 4: ‘D:\ORACLE_DATA\DATAFILES\ORCL\USERS01.DBF’

SQL>

The error message tells us that file# 4 is missing. Note that although the startup command has failed, the database is in the mount state. Thus, the database control file, which is also the RMAN repository can be accessed by the instance and by RMAN. We now recover the missing file using RMAN. The transcript of the recovery session is reproduced below (bold lines are typed commands, comments in italics, the rest is feedback from RMAN):

–logon to RMAN

C:\>rman target /

Recovery Manager: Release 9.2.0.4.0 – Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ORCL (DBID=1507972899)

–restore missing datafile

RMAN> restore datafile 4;

Starting restore at 26/JAN/05

using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=14 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=15 devtype=DISK
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00004 to D:\ORACLE_DATA\DATAFILES\ORCL\USERS01.DBF
channel ORA_DISK_1: restored backup piece 1
piece handle=D:\BACKUP\0QGB0UEC_1_1.BAK tag=TAG20050124T152708 params=NULL
channel ORA_DISK_1: restore complete
Finished restore at 26/JAN/05

–recover restored datafile – RMAN applies all logs automatically

RMAN> recover datafile 4;

Starting recover at 26/JAN/05 using channel ORA_DISK_1
using channel ORA_DISK_2

starting media recovery

archive log thread 1 sequence 4 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_4.ARC
archive log thread 1 sequence 5 is already on disk as file C:\ORACLE_ARCHIVE\ORCL\1_5.ARC
archive log thread 1 sequence 6 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_6.ARC
archive log thread 1 sequence 7 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_7.ARC
archive log filename=E:\ORACLE_ARCHIVE\ORCL\1_4.ARC thread=1 sequence=4
archive log filename=C:\ORACLE_ARCHIVE\ORCL\1_5.ARC thread=1 sequence=5
media recovery complete
Finished recover at 26/JAN/05

–open database for general use

RMAN> alter database open;

database opened

RMAN>

In the above scenario, the database is already in the mount state before the RMAN session is initiated. If the database is not mounted, you should issue a “startup mount” command before attempting to restore the missing datafile. The database must be mounted before any datafile recovery can be done.

If the database is already open when datafile corruption is detected, you can recover the datafile without shutting down the database. The only additional step is to take the relevant tablespace offline before starting recovery. In this case you would perform recovery at the tablespace level. The commands are:

C:\>rman target /

Recovery Manager: Release 9.2.0.4.0 – Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ORCL (DBID=1507972899)

–offline affected tablespace

RMAN> sql ‘alter tablespace USERS offline immediate’;

using target database controlfile instead of recovery catalog
sql statement: alter tablespace USERS offline immediate

–recover offlined tablespace

RMAN> recover tablespace USERS;

Starting recover at 26/JAN/05
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=14 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=12 devtype=DISK

starting media recovery
media recovery complete

Finished recover at 26/JAN/05

–online recovered tablespace

RMAN> sql ‘alter tablespace USERS online’;

sql statement: alter tablespace USERS online

RMAN>

Here we have used the SQL command, which allows us to execute arbitrary SQL from within RMAN.

Sep 15 2009

RMAN Incomplete Recovery

How to perform incomplete RMAN Recovery. We can achive it by using three options.

        until timestamp

        until SCN

        until SEQUENCE#

The following example performs an incomplete recovery until september 16th of 07:00AM
RUN
{
  SET UNTIL TIME ‘Sep 16 2009 07:00:00”;
  # SET UNTIL SCN 1000;       # alternatively, you can specify SCN
  # SET UNTIL SEQUENCE 9923;  # alternatively, you can specify log
sequence number
  RESTORE DATABASE;
  RECOVER DATABASE;
}

Which is best UNTIL Timestamp,SCN or Sequence#

When using ‘until time’, internally, this will be translated
into a SCN and there could be 5 second margin so the resulting SCN could
be slightly different than
expected.

Therefore, the better option is to use ‘until SCN’ as no translation is
required than.

The max you could specify depends on the archivelog files availability.

Alibi3col theme by Themocracy