Backup and recovery should be the first item on any DBA’s list when administering a new environment.  Often the emphasis is placed on backups and that is only half of creating a successful backup and recovery solution for a production environment.  Making sure recovery is possible from a situation such as block corruption using the backups is the other and arguably the more important half.

As newer versions of Oracle more availability features and higher reliability, block corruption rarely happens. New recovery features, like table recovery in version 12c are also introduced over time.  This is a good thing, but the problem is practicing recovery methods when recovery is not necessary.  The only option most DBA’s have is recovering production database backups into a non-production  environment.  While this may be enough of a backup validation process for most, it leaves the question of knowing how to handle a situation where less than a full recovery of production data is necessary.  Documenting proven recovery steps for full as well as partial recoveries will help during times of crisis when tension is high.

Using a virtual environment, I will demonstrate how to manually corrupt a database so that recovery becomes necessary.  The actual steps to recover the database once it is damaged will not be covered in this post as it has already been documented extensively in both the Oracle documentation and other DBA posts.

My virtual environment will consist of a single-instance non-ASM non-container database version 12.1.0.2.0 on a Oracle Enterprise Linux 7u3 platform.  I will mention additional steps to perform this process on an ASM or Windows-hosted environment as well.  In this Linux environment, we will use the utility dd to perform the actual corruption.

After ensuring my database was in archivelog mode, I created my test objects:

SQL> create tablespace jontest_tsp datafile '/u01/app/oracle/oradata/JON12C/datafile/jontest_tsp1.dbf'
size 100M autoextend on next 1M extent management local segment space management auto;
  2  
Tablespace created.

SQL> create user jontest identified by semperf1 default tablespace jontest_tsp;

User created.

SQL> grant connect, resource to jontest;

Grant succeeded.

SQL> alter user jontest quota unlimited on jontest_tsp;

User altered.

SQL> conn jontest/semperf1
Connected.


SQL> create table table_dbc as select dbms_random.value(0,100) N1 from dual connect by level <= 1000;

Table created.

 

SQL> column segment_name format a10;

SQL> select segment_name, tablespace_name from user_segments where 
segment_name='TABLE_DBC';
  2  
SEGMENT_NA TABLESPACE_NAME
---------- ------------------------------
TABLE_DBC  JONTEST_TSP

Now that we have our test objects in place, it is time to get a backup of our environment.

RMAN> backup database plus archivelog delete input;

Starting backup at 07-JUN-17
current log archived
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=17 device type=DISK
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=4 RECID=1 STAMP=946036509
input archived log thread=1 sequence=5 RECID=2 STAMP=946040464
channel ORA_DISK_1: starting piece 1 at 07-JUN-17
channel ORA_DISK_1: finished piece 1 at 07-JUN-17
piece handle=/u01/app/oracle/fast_recovery_area/JON12C/backupset/2017_06_07/o1_mf_annnn_FULL_20170607_dmjhv16k_.bkp tag=FULL_20170607 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
channel ORA_DISK_1: deleting archived log(s)
archived log file name=/u01/app/oracle/fast_recovery_area/JON12C/archivelog/2017_06_07/o1_mf_1_4_dmjczcsb_.arc RECID=1 STAMP=946036509
archived log file name=/u01/app/oracle/fast_recovery_area/JON12C/archivelog/2017_06_07/o1_mf_1_5_dmjhv02b_.arc RECID=2 STAMP=946040464
Finished backup at 07-JUN-17

Starting backup at 07-JUN-17
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/JON12C/datafile/o1_mf_system_dmj83mqt_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/JON12C/datafile/o1_mf_sysaux_dmj82jkj_.dbf
input datafile file number=00005 name=/u01/app/oracle/oradata/JON12C/datafile/jontest_tsp1.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/JON12C/datafile/o1_mf_undotbs1_dmj851z7_.dbf
input datafile file number=00006 name=/u01/app/oracle/oradata/JON12C/datafile/o1_mf_users_dmj850wr_.dbf
channel ORA_DISK_1: starting piece 1 at 07-JUN-17
channel ORA_DISK_1: finished piece 1 at 07-JUN-17
piece handle=/u01/app/oracle/fast_recovery_area/JON12C/backupset/2017_06_07/o1_mf_nnndf_TAG20170607T130106_dmjhv391_.bkp tag=TAG20170607T130106 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:25
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 07-JUN-17
channel ORA_DISK_1: finished piece 1 at 07-JUN-17
piece handle=/u01/app/oracle/fast_recovery_area/JON12C/backupset/2017_06_07/o1_mf_ncsnf_TAG20170607T130106_dmjhxrz3_.bkp tag=TAG20170607T130106 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 07-JUN-17

Starting backup at 07-JUN-17
current log archived
using channel ORA_DISK_1
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=6 RECID=3 STAMP=946040554
channel ORA_DISK_1: starting piece 1 at 07-JUN-17
channel ORA_DISK_1: finished piece 1 at 07-JUN-17
piece handle=/u01/app/oracle/fast_recovery_area/JON12C/backupset/2017_06_07/o1_mf_annnn_FULL_20170607_dmjhxt9k_.bkp tag=FULL_20170607 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
channel ORA_DISK_1: deleting archived log(s)
archived log file name=/u01/app/oracle/fast_recovery_area/JON12C/archivelog/2017_06_07/o1_mf_1_6_dmjhxt47_.arc RECID=3 STAMP=946040554
Finished backup at 07-JUN-17

RMAN>

If your particular environment is using ASM for datafile storage instead of the filesystem like my environment, follow these steps to copy the datafile to the OS where you can apply the corruption process.  Make note of the backup tag identifier.

alter database datafile '+DATA/RAC/DATAFILE/jontest_tsp1.df' offline;

rman target=/

RMAN> BACKUP AS COPY DATAFILE '+DATA/RAC/DATAFILE/jontest_tsp1.df' FORMAT '/tmp/jontest_tsp_%f';

After the datafile is out of ASM or the datafile is in a filesystem, check the header block as a SYSDBA user:

SQL> select header_file, header_block from dba_segments where segment_name='TABLE_DBC';

HEADER_FILE HEADER_BLOCK
----------- ------------
   5      130

The header of segment (table) is block 130, so if we damage the next block with dd, we will cause a datafile/table corruption.

[oracle@localhost ~]$ dd of=/u01/app/oracle/oradata/JON12C/datafile/jontest_tsp1.dbf bs=8192 conv=notrunc seek=131 <<EOF
> #corrupt_this#
> EOF
0+1 records in
0+1 records out
15 bytes (15 B) copied, 8.3023e-05 s, 181 kB/s

Execute DBVERIFY to confirm that we have generated some corruption:

[oracle@localhost ~]$ dbv file=/u01/app/oracle/oradata/JON12C/datafile/jontest_tsp1.dbf blocksize=8192

DBVERIFY: Release 12.1.0.2.0 - Production on Wed Jun 7 14:08:25 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/JON12C/datafile/jontest_tsp1.dbf
Page 131 is marked corrupt
Corrupt block relative dba: 0x01400083 (file 5, block 131)
Bad header found during dbv: 
Data in bad block:
 type: 35 format: 3 rdba: 0x74707572
 last change scn: 0x2373.6968745f seq: 0xa flg: 0x04
 spare1: 0x6f spare2: 0x72 spare3: 0x0
 consistency value in tail: 0x8ca90602
 check value in block header: 0xb1ea
 computed block checksum: 0x176

As an interesting side note, even though DBVERIFY noticed the corruption, if we queried v$database_block_corruption, we would not see any entries:

SQL> select * from v$database_block_corruption;

no rows selected

If we immediately query the affected table, the database will see the corruption:

SQL> select count(*)
  2  from table_dbc;
select count(*)
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 131)
ORA-01110: data file 5: '/u01/app/oracle/oradata/JON12C/datafile/jontest_tsp1.dbf'

The v$database_block_corruption view now has the appropriate entries

SQL> select * from v$database_block_corruption;

     FILE#     BLOCK#   BLOCKS CORRUPTION_CHANGE# CORRUPTIO   CON_ID
---------- ---------- ---------- ------------------ --------- ----------
  5   131        1    0 CORRUPT        0

RMAN will also populate this view if it encounters corrupt blocks.  This is why it is a good practice to either review RMAN backup logfiles or query the v$database_block_corruption view at the end of a backup.

If this was an ASM datafile, restore it to the original location:

rman target /

RUN
{
SET MAXCORRUPT FOR DATAFILE '+DATA/RAC/DATAFILE/jontest_tsp1.df' TO 3;
RESTORE DATAFILE '+DATA/RAC/DATAFILE/jontest_tsp1.df' FROM TAG
'TAG20120303T111330';
RECOVER DATAFILE '+DATA/RAC/DATAFILE/jontest_tsp1.df';
}

After restore and recover the datafile must be brought online

alter database datafile '+DATA/RAC/DATAFILE/jontest_tsp1.df' online;

Now you can practice a datafile, table, or block recovery.

If we wanted to corrupt a datafile in a Windows environment, we could use an editor such as UltraEdit GUI editor, which is also available in a Linux version.  Using this editor, we can either modify the header of the file to corrupt the datafile, or scroll to the table data to corrupt just the table to practice table or block recovery.

I know this goes without saying, but just like the warning labels on hair dryers that tell you not to use them in the bathtub, YOU SHOULD NOT PERFORM THIS PROCESS AGAINST A PRODUCTION DATABASE, EVEN IF YOU CREATE A SEPARATE TABLESPACE THAT CONTAINS NO PRODUCTION DATA.  You know the reason for most warning labels is that someone probably did the very thing the label warns against doing.

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s