Real Application Clusters (RAC) FAQs – 2

Q: What is Oracle RAC and what benefits does it offer?

Ans: Oracle RAC (Real Application Clusters) is a cluster database technology that allows multiple nodes to run Oracle Database instances that access a single shared database. The key benefits of Oracle RAC include high availability, scalability, and load balancing.


Q: What are the different components of Oracle RAC architecture?

Ans: The components of Oracle RAC architecture include:

Oracle RAC database instances: One instance per node, each with its own memory structures and background processes, all opening the same shared database.
Interconnect: The network that connects the nodes and provides communication between them.
Oracle Clusterware: The cluster management software that manages the nodes, interconnect, and resources.
Voting Disk: A shared disk (or file) that records cluster membership. The clusterware uses it to determine which nodes are healthy and which must leave the cluster.
Shared storage: Disk storage that is accessible by all nodes in the cluster and is used to store the database files.


Q: How does Oracle RAC ensure high availability?

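Ans: Oracle RAC provides high availability by running instances on multiple nodes against the same database. Oracle Clusterware detects a node or instance failure, a surviving instance performs instance recovery for the failed one, and services fail over to the remaining instances. Sessions that were connected to the failed instance are disconnected and must reconnect; with Transparent Application Failover (TAF) or Application Continuity configured, this reconnection can be hidden from the application. The database itself stays available throughout.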


Q: Can you explain the process of node eviction in Oracle RAC?

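Ans: Node eviction is the process by which Oracle Clusterware forcibly removes a node from the cluster to protect data integrity. It typically occurs when a node stops responding on the interconnect (missed network heartbeats), loses access to the voting disks, or hangs, for example because of a hardware fault, a network problem, or severe resource starvation. The evicted node's Clusterware stack is restarted or the node is rebooted so that it cannot write to shared storage in an inconsistent state (a split-brain scenario). Database instances are not moved to other nodes; the surviving instances recover the failed instance's redo and continue serving the database, and the evicted node can rejoin once it is healthy. A planned shutdown for maintenance is an orderly stop, not an eviction.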
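
The eviction decision can be pictured with a toy model. Clusterware generally evicts a node when its network heartbeat has been silent for longer than the CSS misscount, or when the node can no longer reach a majority of the voting files. The sketch below only mirrors those two checks; it is not Oracle's algorithm. The 30-second timeout reflects the commonly cited default misscount on Linux, and all function and parameter names are made up for illustration.

```python
def has_voting_majority(accessible_files, total_files):
    """A node must reach strictly more than half of the voting files."""
    return accessible_files > total_files // 2


def should_evict(seconds_since_heartbeat, accessible_files, total_files,
                 misscount=30.0):
    """Return True when either health check fails (toy model only)."""
    heartbeat_lost = seconds_since_heartbeat > misscount
    quorum_lost = not has_voting_majority(accessible_files, total_files)
    return heartbeat_lost or quorum_lost


# A node that answered 5 s ago and sees all 3 voting files stays in the cluster.
print(should_evict(5, 3, 3))
# A node silent for 45 s, or one that sees only 1 of 3 voting files, is evicted.
print(should_evict(45, 3, 3), should_evict(5, 1, 3))
```

This also shows why an odd number of voting files is usually configured: with three files, a node can lose one and still hold a majority.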


Q: What is the role of Oracle Clusterware in the RAC architecture?

Ans: Oracle Clusterware is the cluster management software that provides the underlying infrastructure for Oracle RAC. It manages the nodes, interconnect, and resources in the cluster and provides services such as node management, resource management, and automatic failover in the event of a node failure.


Q: How does load balancing work in Oracle RAC?

Ans: Oracle RAC balances load by distributing connections, and with them the workload, across the instances that offer a service. With client-side load balancing, the client picks among the available listener addresses at random. With server-side load balancing, the SCAN listeners use load information registered by each instance to direct a new connection to the least loaded instance offering the requested service. For connection pools, the Load Balancing Advisory can additionally steer work at run time. Existing sessions are not moved between nodes automatically.
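
As a rough illustration of connect-time balancing, the sketch below assigns each new connection to whichever instance currently has the fewest sessions. It is a simplified model, not the listener's real metric, and the instance names and session counts are hypothetical.

```python
def pick_instance(load_by_instance):
    """Return the instance with the fewest sessions (ties broken by name)."""
    return min(sorted(load_by_instance), key=lambda name: load_by_instance[name])


def route_connections(load_by_instance, new_connections):
    """Assign each new connection to the currently least-loaded instance."""
    load = dict(load_by_instance)  # work on a copy, leave the input untouched
    for _ in range(new_connections):
        load[pick_instance(load)] += 1
    return load


# Hypothetical starting point: orcl2 is lightly loaded, so it absorbs most new work.
print(route_connections({"orcl1": 10, "orcl2": 4, "orcl3": 7}, 6))
```

Only new connections are placed; sessions that already exist stay where they are, which is why a skewed cluster evens out gradually.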


Q: How does data consistency work in Oracle RAC?

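Ans: Oracle RAC keeps data consistent through Cache Fusion. The Global Cache Service (GCS) and Global Enqueue Service (GES) coordinate access to data blocks and other shared resources across instances. Before an instance modifies a block, it must obtain the current version of that block in exclusive mode; if another instance holds it, the block is transferred over the interconnect. Row-level locks and multiversion read consistency work as they do in a single-instance database, so readers do not block writers and no two instances can change the same row at the same time.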


Q: What is Grid Infrastructure in Oracle RAC?

Ans: Oracle Grid Infrastructure is the software stack that provides the underlying infrastructure for an Oracle RAC cluster. It combines Oracle Clusterware and Oracle Automatic Storage Management (ASM) in a single home. The shared storage itself is not part of Grid Infrastructure, but ASM is commonly used to manage it.


Q: Can you explain the process of failover in Oracle RAC?

Ans: Failover is the process by which work moves from a failed node or instance to a surviving one. Oracle Clusterware monitors the health of nodes, instances, and services; when a failure is detected, the affected services are restarted or relocated on the remaining instances. Client sessions connected to the failed instance lose their connection and reconnect to a surviving instance. Failover is transparent to end users only when features such as Transparent Application Failover (TAF) or Application Continuity are configured; otherwise the application sees an error and must reconnect and retry.
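
From the client's point of view, connect-time failover amounts to trying the next address when one does not answer. The sketch below shows only that retry idea; real deployments rely on the Oracle client and service configuration (for example TAF or Application Continuity) rather than hand-written loops. The node names and the fake_connect stand-in are hypothetical.

```python
def connect_with_failover(nodes, connect):
    """Try each node in order; return (node, connection) for the first success."""
    errors = {}
    for node in nodes:
        try:
            return node, connect(node)
        except ConnectionError as exc:
            errors[node] = str(exc)  # remember why this node was skipped
    raise ConnectionError(f"all nodes failed: {errors}")


def fake_connect(node):
    """Stand-in for a real driver call; pretends that node1 is down."""
    if node == "node1":
        raise ConnectionError("node1 unreachable")
    return f"session on {node}"


# node1 is skipped and the session lands on node2.
print(connect_with_failover(["node1", "node2"], fake_connect))
```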


Q: What is the role of the Global Cache Service in Oracle RAC?

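Ans: The Global Cache Service (GCS) is the component of Oracle RAC that keeps the buffer caches of all instances coherent, a mechanism known as Cache Fusion. The GCS tracks which instance holds each cached data block and in which mode. When an instance needs a block that another instance holds, the GCS arranges for the block to be shipped across the interconnect from the holder's cache to the requester, rather than pushing every change to every node. This gives each instance the current or a read-consistent version of the block on request and coordinates concurrent access so that data stays consistent across the nodes.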
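
The bookkeeping idea behind cache coherency can be sketched as a directory that remembers which instance holds the current copy of each block and ships the block on request. This is a deliberately tiny model with made-up names; it ignores lock modes, read-consistent copies, and everything else the real GCS does.

```python
class BlockDirectory:
    """Toy directory: remembers which instance holds the current copy of a block."""

    def __init__(self):
        self.holder = {}     # block id -> instance holding the current copy
        self.transfers = []  # (block id, from instance, to instance)

    def request_exclusive(self, block_id, instance):
        """Give `instance` the current copy and report where it came from."""
        previous = self.holder.get(block_id)
        if previous is None:
            source = "disk"          # nobody caches it yet: read from shared storage
        elif previous == instance:
            source = "local cache"   # requester already holds it: no transfer
        else:
            source = previous        # ship the block over the interconnect
            self.transfers.append((block_id, previous, instance))
        self.holder[block_id] = instance
        return source


directory = BlockDirectory()
print(directory.request_exclusive(42, "inst1"))  # first access reads from disk
print(directory.request_exclusive(42, "inst2"))  # block is shipped from inst1
```

The point of the model is that a block moves only when another instance asks for it; nothing is broadcast to every node on each change.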


Q: What is the srvctl command and what is its purpose in Oracle RAC?

Ans: The srvctl command is a command-line tool used to manage Oracle RAC services and resources. The srvctl command can be used to start, stop, and configure Oracle RAC services, manage cluster databases and instances, and manage disk groups and ASM instances.


Q: How do you start and stop an Oracle RAC service using the srvctl command?

Ans: To start an Oracle RAC service, use the following command:

srvctl start service -d <database_name> -s <service_name>

To stop an Oracle RAC service, use the following command:

srvctl stop service -d <database_name> -s <service_name>
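
When these commands are driven from a script, it can help to build the argument list in one place. The sketch below wraps the two commands shown above using only the -d and -s options from this answer; the function names and the example database and service names are hypothetical, and actually running the command requires a host where srvctl is installed and on the PATH.

```python
import subprocess


def service_command(action, database_name, service_name):
    """Build the srvctl argument list for starting or stopping a service."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action}")
    return ["srvctl", action, "service", "-d", database_name, "-s", service_name]


def run_service_command(action, database_name, service_name):
    """Run srvctl and raise CalledProcessError if it exits with an error."""
    return subprocess.run(
        service_command(action, database_name, service_name),
        capture_output=True, text=True, check=True,
    )


# Hypothetical names: database "orcl", service "oltp".
print(" ".join(service_command("start", "orcl", "oltp")))
```

Passing the arguments as a list avoids shell quoting problems when names contain unusual characters.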


Q: How do you manage an Oracle RAC instance using the srvctl command?

Ans: To add an Oracle RAC instance, use the following command:

srvctl add instance -d <database_name> -i <instance_name> -n <node_name>

To remove an Oracle RAC instance, use the following command:

srvctl remove instance -d <database_name> -i <instance_name>


Q: How do you manage disk groups in Oracle RAC using the srvctl command?

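Ans: Disk groups are created and dropped in ASM itself (for example, with CREATE DISKGROUP and DROP DISKGROUP, or with asmca). Oracle Clusterware registers a resource for a disk group automatically when the disk group is first mounted, so srvctl has no add diskgroup command. srvctl is used to start (mount), stop (dismount), check, and remove the disk group resource. To start a disk group, use the following command:

srvctl start diskgroup -diskgroup <diskgroup_name>

To remove a disk group resource from Oracle Clusterware, use the following command:

srvctl remove diskgroup -diskgroup <diskgroup_name>

In Oracle 11.2 the option is written -g <diskgroup_name>.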


Q: What is the crsctl command and what is its purpose in Oracle RAC?

Ans: The crsctl command is a command-line tool used to manage the Oracle Clusterware software in Oracle RAC. The crsctl command can be used to start, stop, and configure the clusterware components, check the status of the cluster, and manage resources and dependencies.


Q: How do you check the status of the Oracle Clusterware using the crsctl command?

Ans: To check the status of the Oracle Clusterware, use the following command:

crsctl check crs
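
A monitoring script might scan the output of this command for any component that is not reported as online. The sketch below assumes output lines of the form "<code>: <component> is online"; the sample text is illustrative and the exact wording may differ between versions. The function name is made up for this example.

```python
SAMPLE_OUTPUT = """\
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
"""


def offline_components(output):
    """Return the non-empty output lines that do not end with 'is online'."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return [line for line in lines if not line.endswith("is online")]


# An empty list means every reported component is online.
print(offline_components(SAMPLE_OUTPUT))
```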


Q: How do you manage resources in Oracle RAC using the crsctl command?

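Ans: crsctl is intended for custom (application) resources; Oracle-supplied resources, whose names begin with ora., should be managed with srvctl. To add a resource, specify at least its name and type:

crsctl add resource <resource_name> -type <resource_type>

To delete a resource from Oracle Clusterware, use the following command:

crsctl delete resource <resource_name>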


Q: Explain how you would diagnose a slow performance issue in an Oracle RAC environment.

Ans: To diagnose a slow performance issue in an Oracle RAC environment, the following steps can be taken:
Check the wait events and performance metrics of each node in the cluster to identify which node is experiencing the performance issue.
Verify that the cluster and interconnect are properly configured and functioning, and that the nodes are communicating effectively.
Check the database load and resource utilization, including CPU, memory, disk I/O, and network I/O.
Examine the database SQL and query plans to identify any inefficient or resource-intensive SQL statements.
Review the database schema, indexing, and statistics to ensure that the database is optimized for performance.
Monitor the database and system logs for any errors or warning messages that may be related to the performance issue.
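
The first step above, comparing wait events per node, can be sketched as a small ranking over rows of the kind returned by GV$SYSTEM_EVENT (instance id, event name, time waited). The rows below are made-up sample values, not measurements, and the function name is hypothetical.

```python
def top_wait_per_instance(rows):
    """Map each instance id to its (event, time_waited) with the highest wait time."""
    top = {}
    for inst_id, event, time_waited in rows:
        if inst_id not in top or time_waited > top[inst_id][1]:
            top[inst_id] = (event, time_waited)
    return top


# Made-up sample rows: (inst_id, event, time_waited)
sample_rows = [
    (1, "db file sequential read", 1200),
    (1, "gc buffer busy acquire", 300),
    (2, "gc buffer busy acquire", 5400),
    (2, "log file sync", 800),
]

for inst_id, (event, waited) in sorted(top_wait_per_instance(sample_rows).items()):
    print(inst_id, event, waited)
```

In this invented data set, instance 2 is dominated by a global cache wait, which would point the investigation toward the interconnect or block contention rather than local disk I/O.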


Q: In a two-node RAC configuration, the primary node experiences a complete failure. How would you recover the database on the secondary node?

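Ans: Oracle RAC has no primary or secondary node: both instances are active and open the same database on shared storage. When one node of a two-node cluster fails completely, no restore, media recovery, or RESETLOGS is normally needed:
The surviving instance detects the failure and performs instance recovery by applying the failed instance's redo thread from the online redo logs on shared storage.
Services that were running on the failed node fail over to the surviving instance if they are configured to do so.
Sessions that were connected to the failed instance reconnect to the survivor; with TAF or Application Continuity configured, this can happen without the application seeing an error.
Confirm the state of the cluster and the database from the surviving node with crsctl and srvctl (for example, srvctl status database -d <database_name>).
Repair or rebuild the failed node, start Oracle Clusterware on it, and start its instance with srvctl start instance.
Restore from backup and recover with archived logs only if the shared storage itself was lost or damaged.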


Q: In a four-node RAC configuration, two nodes experience complete failures. How would you recover the database in this scenario?

Ans: In a four-node RAC configuration where two nodes fail completely, the two surviving instances keep the database open as long as their nodes can still access the shared storage and a majority of the voting files. A restore or RESETLOGS is not needed unless the shared storage itself is damaged. The following steps can be taken:
Let one of the surviving instances complete instance recovery for the failed instances; this happens automatically.
Check the cluster and database status from a surviving node with crsctl and srvctl, and confirm that services have relocated.
Watch the load on the two remaining nodes, because they now carry the workload of four.
If the failed nodes cannot be repaired, remove their instances and delete the nodes from the cluster configuration.
Rebuild or replace the nodes and add them back to the cluster, extending the Grid Infrastructure and database homes to them.
Add and start the database instances on the new nodes with srvctl.
Review the preferred and available instances of each service so that load balancing and failover cover all four nodes again.


Q: Explain how you would perform an in-place upgrade of an Oracle RAC 19c database.

Ans: To perform an in-place upgrade of an Oracle RAC 19c database, the following steps can be taken:
Verify that all nodes in the cluster are running the same version of the database and Grid Infrastructure software. Grid Infrastructure must be at the same or a higher release than the target database release, so upgrade it first if necessary.
Back up the entire database and the archived logs.
Run the pre-upgrade checks and fix any issues they report.
Perform the upgrade using AutoUpgrade, the Database Upgrade Assistant (DBUA), or manual upgrade procedures. Oracle Clusterware stays up during the database upgrade; only the database instances are stopped and restarted.
Start the database instances on all nodes in the cluster.
Verify the database and software version on all nodes.
Run the Post-Upgrade Status Tool to check the database for any issues or inconsistencies.


Q: In a six-node RAC configuration, two nodes are experiencing slow performance and are affecting the overall performance of the database. How would you resolve this issue?

Ans: To resolve a slow performance issue in a six-node RAC configuration, the following steps can be taken:
Check the wait events and performance metrics of the two affected nodes to identify the root cause of the performance issue.
Verify that the cluster and interconnect are properly configured and functioning, and that the nodes are communicating effectively.
Check the database load and resource utilization, including CPU, memory, disk I/O, and network I/O.
Examine the database SQL and query plans to identify any inefficient or resource-intensive SQL statements.
Review the database schema, indexing, and statistics to ensure that the database is optimized for performance.
Monitor the database and system logs for any errors or warning messages that may be related to the performance issue.
If necessary, increase the resources (e.g. CPU, memory, disk, network) on the affected nodes.
Consider adjusting the load balancing and failover options for the database and instances to balance the load across the nodes.
If the issue persists, consider relocating services away from the affected instances (for example, with srvctl relocate service) while the root cause is investigated.


Q: In a five-node RAC configuration, one node experiences a disk failure, and the disk cannot be recovered. How would you recover the data on this node?

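Ans: In Oracle RAC the database files live on shared storage, not on an individual node, so the recovery in a five-node RAC configuration depends on which disk failed:
If the failed disk is local to the node (operating system, Grid Infrastructure home, or Oracle home), no database data is lost and the other four instances keep running.
In that case, stop Oracle Clusterware on the affected node if it is still running, replace the disk, and either restore the node's software from a backup or delete the node from the cluster and add it back.
Start Oracle Clusterware and the database instance on the repaired node.
If the failed disk belongs to an ASM disk group with normal or high redundancy, ASM keeps serving data from the mirror copies; drop the failed disk, add a replacement, and let the rebalance finish.
If the disk group uses external redundancy and the storage layer cannot recover the disk, restore the affected data files from the most recent RMAN backup and recover them with archived logs.
Monitor the ASM and database alert logs to confirm that the rebalance or recovery completed and that all instances are open.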


Q: How would you troubleshoot a “Cluster Synchronization Service (CSS) Communication Failure” error in Oracle RAC 19c?

Ans: A “Cluster Synchronization Service (CSS) Communication Failure” error occurs when there is a failure in communication between the nodes in the cluster. To troubleshoot this error, the following steps can be taken:
Verify that the network connections between the nodes are active and stable.
Check the firewall settings and make sure that the necessary ports are open.
Check the Clusterware stack on each node with crsctl check crs, and list cluster membership with olsnodes -s, to make sure that all nodes are up and running.
Check the disk space on all nodes to make sure that there is enough disk space for the Oracle Clusterware.
Check the cluster and interconnect configurations, and make sure that they are consistent across all nodes in the cluster.
If the issue persists, restart Oracle Clusterware on the affected node first (crsctl stop crs, then crsctl start crs, as root); restart the whole cluster only as a last resort.


Q: How would you troubleshoot an “ORA-15032: not all alterations performed” error in Oracle RAC 19c?

Ans: An “ORA-15032: not all alterations performed” error is a generic ASM error: it reports that an ALTER DISKGROUP or similar statement did not fully succeed, and it is accompanied by further ORA-15xxx errors that name the actual cause. To troubleshoot this error, the following steps can be taken:
Read the errors that follow ORA-15032 in the error stack; they identify the disk or disk group involved.
Verify that the disk group is mounted and that its disks are online and healthy (for example, in V$ASM_DISKGROUP and V$ASM_DISK).
Check the free space in the disk group to make sure that the operation can complete.
Check that the disks are visible, with the correct ownership and permissions, from every node in the cluster.
Monitor the ASM alert log and the system logs for any errors or warning messages that may be related to the disk group.
Retry the operation once the underlying error has been corrected.


Q: How would you troubleshoot an “ORA-01578: ORACLE data block corrupted” error in Oracle RAC 19c?

Ans: An “ORA-01578: ORACLE data block corrupted” error occurs when a data block in the database is corrupted. The message reports the file number and block number. To troubleshoot this error, the following steps can be taken:
Note the file and block numbers from the error and identify the affected segment (for example, through DBA_EXTENTS).
Check the alert logs of all instances for errors or warning messages that may be related to the data block corruption.
Check the storage and operating system logs for I/O errors on the shared storage.
Run RMAN VALIDATE against the data file or the whole database to find all corrupt blocks; they are listed in V$DATABASE_BLOCK_CORRUPTION.
Repair the blocks with RMAN block media recovery (RECOVER CORRUPTION LIST), or restore and recover the affected data file from the most recent backup if many blocks are damaged.
If the corrupt block belongs to an index, rebuilding the index may be enough.
Validate again and monitor the database and system logs to ensure that the recovery was successful.


Q: How would you troubleshoot an “ORA-01102: cannot mount database in EXCLUSIVE mode” error in Oracle RAC 19c?

Ans: An “ORA-01102: cannot mount database in EXCLUSIVE mode” error occurs when an instance tries to mount the database in exclusive (non-cluster) mode while the database is already mounted by another instance, or while leftovers of a previous instance are still present. To troubleshoot this error, the following steps can be taken:
Check whether another instance already has the database mounted or open (for example, with srvctl status database -d <database_name>).
Verify that the CLUSTER_DATABASE parameter is set to TRUE for the instance; an instance started with FALSE can mount the database only when no other instance has it mounted.
Check the alert log for any errors or warning messages that may be related to the issue.
On the affected node, look for remains of a crashed instance, such as background processes, shared memory segments, or an lk* lock file under $ORACLE_HOME/dbs, and clean them up.
Check the database and cluster configurations, and make sure that they are consistent across all nodes in the cluster.
Start the instance again, preferably with srvctl so that it is started with the registered cluster configuration.


Q: How would you troubleshoot an “ORA-01555: snapshot too old” error in Oracle RAC 19c?

Ans: An “ORA-01555: snapshot too old” error occurs when a query needs undo data to build a read-consistent image of a block, but that undo has already been overwritten. It typically affects long-running queries. To troubleshoot this error, the following steps can be taken:
Identify the failing SQL statement and how long it ran; the alert log usually records both.
Compare the UNDO_RETENTION setting with the longest-running queries (for example, MAXQUERYLEN in V$UNDOSTAT).
Check the size and usage of the undo tablespace of the affected instance; in RAC each instance has its own undo tablespace.
If the undo tablespace is too small, increase its size, and raise UNDO_RETENTION if necessary, to accommodate the long-running queries.
Tune the query to run faster, and avoid application patterns that fetch across commits.
Monitor the alert logs to ensure that the issue does not persist. Restarting the instances does not address the cause.
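
A common rule of thumb for sizing undo is: required space = undo retention (seconds) x undo blocks generated per second x database block size. The sketch below applies that formula; the generation rate used in the example is an assumed figure (it could be derived from UNDOBLKS in V$UNDOSTAT), and the function name is made up. Because each RAC instance has its own undo tablespace, the estimate applies per instance.

```python
def required_undo_bytes(undo_retention_s, undo_blocks_per_s, block_size_bytes=8192):
    """Estimate undo space as retention x undo blocks per second x block size."""
    return undo_retention_s * undo_blocks_per_s * block_size_bytes


# Assumed inputs: 900 s retention, 100 undo blocks/s, 8 KB blocks.
size = required_undo_bytes(900, 100)
print(size, "bytes =", round(size / 1024 ** 2), "MB")
```

Treat the result as a lower bound; it is usual to add headroom for peaks in undo generation.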


Q: How would you troubleshoot an “ORA-12547: TNS:lost contact” error in Oracle RAC 19c?

Ans: An “ORA-12547: TNS:lost contact” error occurs when the connection between the client and the database is lost. To troubleshoot this error, the following steps can be taken:
Check the network connections between the client and the nodes in the cluster and make sure that they are stable and active.
Check the firewall settings and make sure that the necessary ports are open.
Check the listener logs for any errors or warning messages that may be related to the issue.
Verify that the tnsnames.ora file is configured correctly and that it has the correct entry for the database.
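Check the listeners with srvctl status listener and srvctl status scan_listener, and restart a listener only if it is not responding.
For local connections on the database server, verify the ownership and permissions of the oracle executable in $ORACLE_HOME/bin, a common cause of this error.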


Q: How would you troubleshoot an “ORA-15077: could not locate ASM instance serving a required diskgroup” error in Oracle RAC 19c?

Ans: An “ORA-15077: could not locate ASM instance serving a required diskgroup” error occurs when a database instance cannot find an ASM instance that has the required disk group mounted. To troubleshoot this error, the following steps can be taken:
Check that the ASM instance is running on the node (for example, with srvctl status asm).
Check the ASM disk group configuration and verify that the disk group exists and that it is accessible from all nodes in the cluster.
Check the disk group status and make sure that it is mounted on the node where the error occurs (for example, with srvctl status diskgroup or in V$ASM_DISKGROUP).
Check the ASM alert log for any errors or warning messages that may be related to the issue.
If the disk group is dismounted, mount it using the ASM command “ALTER DISKGROUP <diskgroup_name> MOUNT”.
If the issue persists, restart the ASM instance on the affected node.


Q: How would you troubleshoot an “ORA-16014: log <n> sequence# <n> not archived, no available destinations” error in Oracle RAC 19c?

Ans: An “ORA-16014: log <n> sequence# <n> not archived, no available destinations” error occurs when an online redo log cannot be archived because no archive destination is usable, most often because the destination or the fast recovery area is full. To troubleshoot this error, the following steps can be taken:
Check the alert log for accompanying errors that name the failing destination.
Check the status of the archive destinations (for example, in V$ARCHIVE_DEST) and make sure the destination is reachable from every node.
Check the space used in the fast recovery area (for example, in V$RECOVERY_FILE_DEST) or in the archive log file system.
Free space by backing up and deleting archived logs with RMAN, or increase DB_RECOVERY_FILE_DEST_SIZE if the underlying storage allows it.
Confirm that archiving resumes and that the instances are no longer waiting on log switches.


Q: How would you troubleshoot an “ORA-07445: exception encountered: core dump [kgeade()+64]” error in Oracle RAC 19c?

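Ans: An “ORA-07445: exception encountered: core dump [kgeade()+64]” error occurs when an Oracle process receives a fatal signal from the operating system and dumps core; the value in brackets names the internal function in which the failure occurred. The alert log points to a trace or incident file with the details. To troubleshoot this error, the following steps can be taken: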
Check the operating system logs for any errors or warning messages that may be related to the issue.
Check the database logs for any errors or warning messages that may be related to the issue.
Check the database load and resource utilization, including CPU, memory, disk I/O, and network I/O.
Check the database and system configurations, and make sure that they are consistent across all nodes in the cluster.
If the issue persists, restart the database instances on all nodes in the cluster.
If the issue continues, consider opening a service request with Oracle Support for further assistance.

Brijesh Gogia