Block corruptions are one of the common source of database outages. A database block is corrupted when its content has changed from what Oracle Database expects. Correcting block corruptions can prove to be a lengthy and costly exercise and requires good expertise 0n the DBA part.
Block Corruption can happen at Logical level or Physical level. Also, such corruption can impact a single block or huge number of blocks in the database.
PHYSICAL CORRUPTION
- Also called media corruption
- Inconsistency between header and footer is one of the symptom of physical corruption. There can be an invalid checksum or header, or when the
block contains all zeroes. - Generally the result of infrastructure problems like OS/Storage issues, faulty disks/disks controllers, Memory issues.
- Oracle Recovery Manager’s BLOCKRECOVER command can help in recovering from Physical Corruption
LOGICAL CORRUPTION
- Also called soft corruption
- Internal inconsistency in the block while the block may have good header and footer. The block checksum will be correct but the block structures may be corrupt.
- Lost write can also cause Logical corruption. A lost write is a write I/O to persistent storage that the database believes has occurred based on
information from the I/O subsystem - Can show up as a result of a failure in the Oracle software or some bug, cache corruption etc.
- Inter-block corruption, the corruption that occurs between blocks can only be a logical corruption
IMPORTANT DATABASE PARAMETERS RELATED TO DATABASE BLOCK CORRUPTION
Some of the key database parameters are:
A) DB_BLOCK_CHECKSUM
- Calculates a checksum for each block before it is written to disk, every time.
- Default value is TRUE
- This process can cause 1-2% performance overhead
- System tablespace is always set to true regardless of this parameter, others are only set when you enable them
B) DB_BLOCK_CHECKING
- Help to detect damage introduced by faulty memory.
- Default value is FALSE
- When the DB_BLOCK_CHECKING parameter is set to [TRUE|HIGH] Oracle performs a data check in the block for self-consistency.
- Please note that block checking can add between 1 and 10% overhead to the server so it can be performance intensive.
- Oracle recommend setting this parameter to [TRUE|HIGH] if the overhead is acceptable. Allowable values include [OFF|FALSE], LOW, MEDIUM, [HIGH|TRUE]
C) DB_LOST_WRITE_PROTECT
- Enables or disables lost write detection
- Default value is NONE
- A data block lost write occurs when an I/O subsystem acknowledges the completion of the block write, while in fact the write did not occur in the persistent storage.
- Values can be NONE|TYPICAL|FULL
- Use this parameter only at the time of urgent temporary diagnostics need or else let it stay at NONE (default)
STEPS TO IDENTIFY CORRUPTED BLOCKS
Alert log is important source to capture the database block corruption related information. There are two important Oracle tools to identify corruption:
1) RMAN
2) DBVerify (DBV)
3) LEGACY EXPORT METHOD
4) ANALYZE VALIDATE STRUCTURE
1) RMAN
RMAN can Detect both logical as well as physical corruption.
Normally RMAN checks only for physically corrupt blocks with every backup it takes. RMAN does not automatically detect logical corruption by default. You need to use “CHECK LOGICAL” clause as shown below to tell RMAN to find logical corruption also along with physical corruption.
VALIDATE WHOLE DATABASE FOR CORRUPTION
RMAN> connect target / RMAN> run { CONFIGURE DEFAULT DEVICE TYPE TO DISK; CONFIGURE DEVICE TYPE DISK PARALLELISM 10 BACKUP TYPE TO BACKUPSET; BACKUP VALIDATE CHECK LOGICAL DATABASE FILESPERSET=10; }
Other Examples
ENTITY | CHECK PHYSICAL CORRUPTION | CHECK PHYSICAL + LOGICAL CORRUPTION |
DATA FILE | ||
BACKUPSET | VALIDATE BACKUPSET 10; | VALIDATE CHECK LOGICAL BACKUPSET 10; |
ARCHIVELOGS | VALIDATE ARCHIVELOG ALL; | VALIDATE CHECK LOGICAL ARCHIVELOG ALL; |
TABLESPACE | VALIDATE TABLESPACE USERS; | VALIDATE CHECK LOGICAL TABLESPACE USERS; |
CONTROLFILE | VALIDATE CURRENT CONTROLFILE; | VALIDATE CHECK LOGICAL CURRENT CONTROLFILE; |
FILES TO BACKUP | BACKUP VALIDATE DATABASE ARCHIVELOG ALL; | BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL; |
FILES TO RESTORE | RESTORE VALIDATE DATABASE; | RESTORE VALIDATE CHECK LOGICAL DATABASE; |
2) DBVerify (DBV)
Example:
dbv file='+DATAC1/DB_s/datafile/APEX_2150625023474721.7764.933145699' blocksize=8192 dbv FILE=/fk03/DB/oradata/fnd_1.dbf LOGFILE=verify.log BLOCKSIZE=8192 FEEDBACK=100
3) LEGACY EXPORT METHOD
Normal export process will also indicate corruption.
4) ANALYZE VALIDATE STRUCTURE
ANALYZE TABLE tablename VALIDATE STRUCTURE CASCADE;
- To check for table or index corruption, however it will only check for corruption below the high watermark, so will not check unused space.
- It does not identify the corrupt block but gives a ORA-01498 error.
The following statement analyses the EMPLOYEE table:
ANALYZE TABLE EMPLOYEE VALIDATE STRUCTURE;
You can validate an object and all dependent objects (for example, indexes) by including the CASCADE option:
ANALYZE TABLE EMPLOYEE VALIDATE STRUCTURE CASCADE;
By default the CASCADE option performs a complete validation. Because this operation can be resource intensive, you can perform a faster version of the validation by using the FAST clause. This version checks for the existence of corruptions using an optimized check algorithm, but does not report details about the corruption. If the FAST check finds a corruption, you can then use the CASCADE option without the FAST clause to locate it.
ANALYZE TABLE EMPLOYEE VALIDATE STRUCTURE CASCADE FAST;
You can specify that you want to perform structure validation online while DML is occurring against the object being validated. There can be a slight performance impact when validating with ongoing DML affecting the object, but this is offset by the flexibility of being able to perform ANALYZE online. The following statement validates the emp table and all associated indexes online:
ANALYZE TABLE EMPLOYEE VALIDATE STRUCTURE CASCADE ONLINE;
IMPORTANT VIEWS RELATED TO CORRUPTION
V$DATABASE_BLOCK_CORRUPTION
- V$DATABASE_BLOCK_CORRUPTION displays information about database blocks that were corrupted after the last backup or validation.
- Read here about the column definition of this view.
- Corruptions information are recorded in the V$DATABASE_BLOCK_CORRUPTION view. Running above RMAN or DBV commands will populate the V$DATABASE_BLOCK_CORRUPTION view.
SELECT * FROM V$DATABASE_BLOCK_CORRUPTION
- You can find out which database objects are corrupted by using the file#/block# information from above query.
select segment_type,owner'.'segment_name from dba_extents where file_id = [&file_id] and [&block] between block_id and block_id+blocks -1;
V$COPY_CORRUPTION
- This view displays information about datafile copy corruptions from the control file.
- Read here about the column definition of this table.
V$BACKUP_CORRUPTION
- This view displays information about corrupt block ranges in datafile backups from the control file.
- Read here about the column definition of this table.
HIGH LEVEL STEPS TO CORRECT DATA BLOCK CORRUPTION
BLOCKRECOVER Method
With block media recovery, only the blocks actually being recovered are unavailable during the recovery which is really a great thing. Without block media recovery, if even a single block is corrupt, then you must take the datafile offline, restore a backup of the datafile and recover it by applying archive logs.
Note that you need to have Enterprise Edition Database (Not Standard Edition) to be able to use BLOCKRECOVER method.
Recover corrupt blocks in single datafile:
RMAN> blockrecover datafile 15 block 11234;
Recover corrupt blocks in three different datafiles:
RMAN> BLOCKRECOVER DATAFILE 1 BLOCK 10, 20 DATAFILE 2 BLOCK 45, 46, 47,48 DATAFILE 3 BLOCK 23,24;
DBMS_REPAIR Method
Refer:
DBMS_REPAIR SCRIPT (Doc ID 556733.1)
DBMS_REPAIR example (Doc ID 68013.1)
VERY IMPORTANT: DBMS_REPAIR does not recover the blocks , rather it marks the corrupted blocks as fixed so these blocks are skipped in read/write operation. Data in these blocks are lost forever
Active Standby Database Automatic Block Corruption Repair (ABMR)
Automatic Block Media Repair feature of Active standby database is an excellent feature in which data block corruptions on the Primary database side can be repaired by obtaining those blocks from the standby site. This recovery process is done transparently by a background process (ABMR). It can work vice-versa also to repair block corruptions on the Active Standby site by applying right blocks received from the Primary site.
DEMO
Please refer “Case 2: FEW BLOCKS IN DATAFILE GOT CORRUPTED” of the earlier post RMAN: RECOVERY SCENARIOS RELATED TO DATABASE to see how we recovered from block corruption in the data file.
- Oracle Multitenant DB 4 : Parameters/SGA/PGA management in CDB-PDB - July 18, 2020
- Details of running concurrent requests in Oracle EBS - May 26, 2020
- Oracle Multitenant DB 3 : Data Dictionary Architecture in CDB-PDB - March 20, 2020
Except comment section
nowhere mouse works:)
Nice article
[…] 8. Physical and Logical Block Corruption in Oracle Database … […]
This is a very informative article. Thank you so much.
[…] 8. Physical and Logical Block Corruption in Oracle Database … […]